Searched hist:"6 c49ba4025ee04913eedac508260aefcc2766d1f" (Results 1 – 1 of 1) sorted by relevance
/openbmc/linux/drivers/gpu/drm/amd/amdkfd/ |
H A D | kfd_process.c | diff 6c49ba4025ee04913eedac508260aefcc2766d1f Wed Apr 10 08:14:13 CDT 2024 Lancelot SIX <lancelot.six@amd.com> drm/amdkfd: Flush the process wq before creating a kfd_process
[ Upstream commit f5b9053398e70a0c10aa9cb4dd5910ab6bc457c5 ]
There is a race condition when re-creating a kfd_process for a process. This has been observed when a process under the debugger executes exec(3). In this scenario: - The process executes exec. - This will eventually release the process's mm, which will cause the kfd_process object associated with the process to be freed (kfd_process_free_notifier decrements the reference count to the kfd_process to 0). This causes kfd_process_ref_release to enqueue kfd_process_wq_release to the kfd_process_wq. - The debugger receives the PTRACE_EVENT_EXEC notification, and tries to re-enable AMDGPU traps (KFD_IOC_DBG_TRAP_ENABLE). - When handling this request, KFD tries to re-create a kfd_process. This eventually calls kfd_create_process and kobject_init_and_add.
At this point the call to kobject_init_and_add can fail because the old kfd_process.kobj has not been freed yet by kfd_process_wq_release.
This patch proposes to avoid this race by making sure to drain kfd_process_wq before creating a new kfd_process object. This way, we know that any cleanup task is done executing when we reach kobject_init_and_add.
Signed-off-by: Lancelot SIX <lancelot.six@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
|