Revision tags: v6.6.25, v6.6.24, v6.6.23 |
|
#
26843936 |
| 20-Mar-2024 |
Eric Huang <jinhuieric.huang@amd.com> |
drm/amdkfd: fix TLB flush after unmap for GFX9.4.2
commit 1210e2f1033dc56b666c9f6dfb761a2d3f9f5d6c upstream.
TLB flush after unmap accidentially was removed on gfx9.4.2. It is to add it back.
Sign
drm/amdkfd: fix TLB flush after unmap for GFX9.4.2
commit 1210e2f1033dc56b666c9f6dfb761a2d3f9f5d6c upstream.
TLB flush after unmap accidentially was removed on gfx9.4.2. It is to add it back.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.25, v6.6.24, v6.6.23 |
|
#
26843936 |
| 20-Mar-2024 |
Eric Huang <jinhuieric.huang@amd.com> |
drm/amdkfd: fix TLB flush after unmap for GFX9.4.2
commit 1210e2f1033dc56b666c9f6dfb761a2d3f9f5d6c upstream.
TLB flush after unmap accidentially was removed on gfx9.4.2. It is to add it back.
Sign
drm/amdkfd: fix TLB flush after unmap for GFX9.4.2
commit 1210e2f1033dc56b666c9f6dfb761a2d3f9f5d6c upstream.
TLB flush after unmap accidentially was removed on gfx9.4.2. It is to add it back.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.16, v6.6.15 |
|
#
5a1bd214 |
| 26-Jan-2024 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Use correct drm device for cgroup permission check
[ Upstream commit 4119734e06a7f30e7e8eb666692a58b85dca0269 ]
On GFX 9.4.3, for a given KFD node, fetch the correct drm device from XCP
drm/amdkfd: Use correct drm device for cgroup permission check
[ Upstream commit 4119734e06a7f30e7e8eb666692a58b85dca0269 ]
On GFX 9.4.3, for a given KFD node, fetch the correct drm device from XCP manager when checking for cgroup permissions.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9 |
|
#
9b0cc30d |
| 20-Dec-2023 |
Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> |
drm/amdkfd: Fix type of 'dbg_flags' in 'struct kfd_process'
[ Upstream commit 217e85f97031791fb48a2d374c7bdcf439365b21 ]
dbg_flags looks to be defined with incorrect data type; to process multiple
drm/amdkfd: Fix type of 'dbg_flags' in 'struct kfd_process'
[ Upstream commit 217e85f97031791fb48a2d374c7bdcf439365b21 ]
dbg_flags looks to be defined with incorrect data type; to process multiple debug flag options, and hence defined dbg_flags as u32.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_packet_manager_v9.c:117 pm_map_process_aldebaran() warn: maybe use && instead of &
Fixes: 0de4ec9a0353 ("drm/amdgpu: prepare map process for multi-process debug devices") Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3 |
|
#
867ecd89 |
| 22-Nov-2023 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Use common function for IP version check
[ Upstream commit 2f86bf79b63dbe6963ebc647b77a5f576a906b40 ]
KFD_GC_VERSION was recently updated to use a new function for IP version checks. As
drm/amdkfd: Use common function for IP version check
[ Upstream commit 2f86bf79b63dbe6963ebc647b77a5f576a906b40 ]
KFD_GC_VERSION was recently updated to use a new function for IP version checks. As a result, use KFD_GC_VERSION as the common function for all IP version checks in KFD.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3 |
|
#
edcfe229 |
| 11-Sep-2023 |
Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> |
drm/amdkfd: Insert missing TLB flush on GFX10 and later
Heavy-weight TLB flush is required after unmap on all GPUs for correctness and security.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasivi
drm/amdkfd: Insert missing TLB flush on GFX10 and later
Heavy-weight TLB flush is required after unmap on all GPUs for correctness and security.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
show more ...
|
Revision tags: v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43 |
|
#
c99a2e7a |
| 28-Jul-2023 |
Alex Deucher <alexander.deucher@amd.com> |
drm/amdkfd: drop IOMMUv2 support
Now that we use the dGPU path for all APUs, drop the IOMMUv2 support.
v2: drop the now unused queue manager functions for gfx7/8 APUs
Reviewed-by: Felix Kuehling <
drm/amdkfd: drop IOMMUv2 support
Now that we use the dGPU path for all APUs, drop the IOMMUv2 support.
v2: drop the now unused queue manager functions for gfx7/8 APUs
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
Revision tags: v6.1.42, v6.1.41, v6.1.40, v6.1.39 |
|
#
8da0d694 |
| 14-Jul-2023 |
Shashank Sharma <shashank.sharma@amd.com> |
drm/amdgpu: remove unused functions and variables
This patch removes some variables and functions from KFD doorbell handling code, which are no more required since doorbell manager is handling doorb
drm/amdgpu: remove unused functions and variables
This patch removes some variables and functions from KFD doorbell handling code, which are no more required since doorbell manager is handling doorbell calculations.
Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
2105a15a |
| 14-Jul-2023 |
Shashank Sharma <shashank.sharma@amd.com> |
drm/amdgpu: use doorbell mgr for kfd process doorbells
This patch: - adds a doorbell object in kfd pdd structure. - allocates doorbells for a process while creating its queue. - frees the doorbells
drm/amdgpu: use doorbell mgr for kfd process doorbells
This patch: - adds a doorbell object in kfd pdd structure. - allocates doorbells for a process while creating its queue. - frees the doorbells with pdd destroy. - moves doorbell bitmap init function to kfd_doorbell.c
PS: This patch ensures that we don't break the existing KFD functionality, but now KFD userspace library should also create doorbell pages as AMDGPU GEM objects using libdrm functions in userspace. The reference code for the same is available with AMDGPU Usermode queue libdrm MR. Once this is done, we will not need to create process doorbells in kernel.
V2: - Do not use doorbell wrapper API, use amdgpu_bo_create_kernel instead (Alex). - Do not use custom doorbell structure, instead use separate variables for bo and doorbell_bitmap (Alex) V3: - Do not allocate doorbell page with PDD, delay doorbell process page allocation until really needed (Felix)
Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felilx.Kuehling@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
c3186665 |
| 14-Jul-2023 |
Shashank Sharma <shashank.sharma@amd.com> |
drm/amdgpu: use doorbell mgr for kfd kernel doorbells
This patch: - adds a doorbell bo in kfd device structure. - creates doorbell page for kfd kernel usages. - updates the get_kernel_doorbell and f
drm/amdgpu: use doorbell mgr for kfd kernel doorbells
This patch: - adds a doorbell bo in kfd device structure. - creates doorbell page for kfd kernel usages. - updates the get_kernel_doorbell and free_kernel_doorbell functions accordingly
V2: Do not use wrapper API, use direct amdgpu_create_kernel(Alex) V3: - Move single variable declaration below (Christian) - Add a to-do item to reuse the KGD kernel level doorbells for KFD for non-MES cases, instead of reserving one page (Felix)
Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
Revision tags: v6.1.38, v6.1.37, v6.1.36, v6.4 |
|
#
d4300362 |
| 22-Jun-2023 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Update interrupt handling for GFX 9.4.3
For GFX 9.4.3, interrupt handling needs to be updated for: - Interrupt cookie will have a NodeId field. Each KFD node needs to check the NodeId
drm/amdkfd: Update interrupt handling for GFX 9.4.3
For GFX 9.4.3, interrupt handling needs to be updated for: - Interrupt cookie will have a NodeId field. Each KFD node needs to check the NodeId before processing the interrupt. - For CPX mode, there are additional checks of client ID needed to process the interrupt. - Add NodeId to the process drain interrupt.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
Revision tags: v6.1.35, v6.1.34, v6.1.33, v6.1.32 |
|
#
597364ad |
| 31-May-2023 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Fix reserved SDMA queues handling
This patch fixes a regression caused by a bad merge where the handling of reserved SDMA queues was accidentally removed. With the fix, the reserved SDMA
drm/amdkfd: Fix reserved SDMA queues handling
This patch fixes a regression caused by a bad merge where the handling of reserved SDMA queues was accidentally removed. With the fix, the reserved SDMA queues are again correctly marked as unavailable for allocation.
Fixes: a805889a1531 ("drm/amdkfd: Update SDMA queue management for GFX9.4.3") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
Revision tags: v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39 |
|
#
b17bd5db |
| 10-May-2022 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdkfd: add debug queue snapshot operation
Allow the debugger to get a snapshot of a specified number of queues containing various queue property information that is copied to the debugger.
Sin
drm/amdkfd: add debug queue snapshot operation
Allow the debugger to get a snapshot of a specified number of queues containing various queue property information that is copied to the debugger.
Since the debugger doesn't know how many queues exist at any given time, allow the debugger to pass the requested number of snapshots as 0 to get the actual number of potential snapshots to use for a subsequent snapshot request for actual information.
To prevent future ABI breakage, pass in the requested entry_size. The KFD will return it's own entry_size in case the debugger still wants log the information in a core dump on sizing failure.
Also allow the debugger to clear exceptions when doing a snapshot.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
Revision tags: v5.15.38 |
|
#
e0f85f46 |
| 06-May-2022 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdkfd: add debug set and clear address watch points operation
Shader read, write and atomic memory operations can be alerted to the debugger as an address watch exception.
Allow the debugger t
drm/amdkfd: add debug set and clear address watch points operation
Shader read, write and atomic memory operations can be alerted to the debugger as an address watch exception.
Allow the debugger to pass in a watch point to a particular memory address per device.
Note that there exists only 4 watch points per devices to date, so have the KFD keep track of what watch points are allocated or not.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
a70a93fa |
| 05-May-2022 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdkfd: add debug suspend and resume process queues operation
In order to inspect waves from the saved context at any point during a debug session, the debugger must be able to preempt queues to
drm/amdkfd: add debug suspend and resume process queues operation
In order to inspect waves from the saved context at any point during a debug session, the debugger must be able to preempt queues to trigger context save by suspending them.
On queue suspend, the KFD will copy the context save header information so that the debugger can correctly crawl the appropriate size of the saved context. The debugger must then also be allowed to resume suspended queues.
A queue that is newly created cannot be suspended because queue ids are recycled after destruction so the debugger needs to know that this has occurred. Query functions will be later added that will clear a given queue of its new queue status.
A queue cannot be destroyed while it is suspended to preserve its saved context during debugger inspection. Have queue destruction block while a queue is suspended and unblocked when it is resumed. Likewise, if a queue is about to be destroyed, it cannot be suspended.
Return the number of queues successfully suspended or resumed along with a per queue status array where the upper bits per queue status show that the request was invalid (new/destroyed queue suspend request, missing queue) or an error occurred (HWS in a fatal state so it can't suspend or resume queues).
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
Revision tags: v5.15.37, v5.15.36 |
|
#
12fb1ad7 |
| 22-Apr-2022 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdkfd: update process interrupt handling for debug events
The debugger must be notified by any debugger subscribed exception that comes from hardware interrupts.
If a debugger session exits, a
drm/amdkfd: update process interrupt handling for debug events
The debugger must be notified by any debugger subscribed exception that comes from hardware interrupts.
If a debugger session exits, any exceptions it subscribed to may still have interrupts in the interrupt ring buffer or KGD/KFD pipeline. To prevent a new session from inheriting stale interrupts, when a new queue is created, open an interrupt drain and allow the IH ring to drain from a timestamped checkpoint. Then inject a custom IV so that once the custom IV is picked up by the KFD, it's safe to close the drain and proceed with queue creation.
The drain must also be on debug disable as SW interrupts may still be processed. Drain at this time and clear all the exception status.
The debugger may also not be attached nor subscibed to certain exceptions so forward them directly to the runtime.
GFX10 also requires its own IV processing, hence the creation of kfd_int_process_v10.c. This is because the IV from SQ interrupts are packed into a new continguous format unlike GFX9. To make this clear, a separate interrupting handling code file was created.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
Revision tags: v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20 |
|
#
50cff45e |
| 01-Mar-2021 |
Jay Cornwall <jay.cornwall@amd.com> |
drm/amdkfd: add debug trap enabled flag to tma
Trap handler behavior will differ when a debugger is attached.
Make the debug trap flag available in the trap handler TMA. Update it when the debug tr
drm/amdkfd: add debug trap enabled flag to tma
Trap handler behavior will differ when a debugger is attached.
Make the debug trap flag available in the trap handler TMA. Update it when the debug trap ioctl is invoked.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
455227c4 |
| 08-Apr-2022 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdkfd: add runtime enable operation
The debugger can attach to a process prior to HSA enablement (i.e. inferior is spawned by the debugger and attached to immediately before target process has
drm/amdkfd: add runtime enable operation
The debugger can attach to a process prior to HSA enablement (i.e. inferior is spawned by the debugger and attached to immediately before target process has been enabled for HSA dispatches) or it can attach to a running target that is already HSA enabled. Either way, the debugger needs to know the enablement status to know when it can inspect queues.
For the scenario where the debugger spawns the target process, it will have to wait for ROCr's runtime enable request from the target. The runtime enable request will be able to see that its process has been debug attached. ROCr raises an EC_PROCESS_RUNTIME signal to the debugger then blocks the target process while waiting the debugger's response. Once the debugger has received the runtime signal, it will unblock the target process.
For the scenario where the debugger attaches to a running target process, ROCr will set the target process' runtime status as enabled so that on an attach request, the debugger will be able to see this status and will continue with debug enablement as normal.
A secondary requirement is to conditionally enable the trap tempories only if the user requests it (env var HSA_ENABLE_DEBUG=1) or if the debugger attaches with HSA runtime enabled. This is because setting up the trap temporaries incurs a performance overhead that is unacceptable for microbench performance in normal mode for certain customers.
In the scenario where the debugger spawns the target process, when ROCr detects that the debugger has attached during the runtime enable request, it will enable the trap temporaries before it blocks the target process while waiting for the debugger to respond.
In the scenario where the debugger attaches to a running target process, it will enable to trap temporaries itself.
Finally, there is an additional restriction that is required to be enforced with runtime enable and HW debug mode setting. The debugger must first ensure that HW debug mode has been enabled before permitting HW debug mode operations.
With single process debug devices, allowing the debugger to set debug HW modes prior to trap activation means that debug HW mode setting can occur before the KFD has reserved the debug VMID (0xf) from the hardware scheduler's VMID allocation resource pool. This can result in the hardware scheduler assigning VMID 0xf to a non-debugged process and having that process inherit debug HW mode settings intended for the debugged target process instead, which is both incorrect and potentially fatal for normal mode operation.
With multi process debug devices, allowing the debugger to set debug HW modes prior to trap activation means that non-debugged processes migrating to a new VMID could inherit unintended debug settings.
All debug operations that touch HW settings must require trap activation where trap activation is triggered by both debug attach and runtime enablement (target has KFD opened and is ready to dispatch work).
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
c2d2588c |
| 08-Apr-2022 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdkfd: add send exception operation
Add a debug operation that allows the debugger to send an exception directly to runtime through a payload address.
For memory violations, normal vmfault sig
drm/amdkfd: add send exception operation
Add a debug operation that allows the debugger to send an exception directly to runtime through a payload address.
For memory violations, normal vmfault signals will be applied to notify runtime instead after passing in the saved exception data when a memory violation was raised to the debugger.
For runtime exceptions, this will unblock the runtime enable function which will be explained and implemented in a follow up patch.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
44b87bb0 |
| 06-Apr-2022 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdkfd: add raise exception event function
Exception events can be generated from interrupts or queue activitity.
The raise event function will save exception status of a queue, device or proce
drm/amdkfd: add raise exception event function
Exception events can be generated from interrupts or queue activitity.
The raise event function will save exception status of a queue, device or process then notify the debugger of the status change by writing to a debugger polled file descriptor that the debugger provides during debug attach.
For memory violation exceptions, extra exception data will be saved.
The debugger will be able to query the saved exception states by query operation that will be provided by follow up patches.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
69a8c3ae |
| 01-Sep-2022 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdkfd: apply trap workaround for gfx11
Due to a HW bug, waves in only half the shader arrays can enter trap.
When starting a debug session, relocate all waves to the first shader array of each
drm/amdkfd: apply trap workaround for gfx11
Due to a HW bug, waves in only half the shader arrays can enter trap.
When starting a debug session, relocate all waves to the first shader array of each shader engine and mask off the 2nd shader array as unavailable.
When ending a debug session, re-enable the 2nd shader array per shader engine.
User CU masking per queue cannot be guaranteed to remain functional if requested during debugging (e.g. user cu mask requests only 2nd shader array as an available resource leading to zero HW resources available) nor can runtime be alerted of any of these changes during execution.
Make user CU masking and debugging mutual exclusive with respect to availability.
If the debugger tries to attach to a process with a user cu masked queue, return the runtime status as enabled but busy.
If the debugger tries to attach and fails to reallocate queue waves to the first shader array of each shader engine, return the runtime status as enabled but with an error.
In addition, like any other mutli-process debug supported devices, disable trap temporary setup per-process to avoid performance impact from setup overhead.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
0de4ec9a |
| 04-Apr-2022 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdgpu: prepare map process for multi-process debug devices
Unlike single process debug devices, multi-process debug devices allow debug mode setting per-VMID (non-device-global).
Because the H
drm/amdgpu: prepare map process for multi-process debug devices
Unlike single process debug devices, multi-process debug devices allow debug mode setting per-VMID (non-device-global).
Because the HWS manages PASID-VMID mapping, the new MAP_PROCESS API allows the KFD to forward the required SPI debug register write requests.
To request a new debug mode setting change, the KFD must be able to preempt all queues then remap all queues with these new setting requests for MAP_PROCESS to take effect.
Note that by default, trap enablement in non-debug mode must be disabled for performance reasons for multi-process debug devices due to setup overhead in FW.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
7cee6a68 |
| 23-Mar-2023 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdgpu: add configurable grace period for unmap queues
The HWS schedule allows a grace period for wave completion prior to preemption for better performance by avoiding CWSR on waves that can po
drm/amdgpu: add configurable grace period for unmap queues
The HWS schedule allows a grace period for wave completion prior to preemption for better performance by avoiding CWSR on waves that can potentially complete quickly. The debugger, on the other hand, will want to inspect wave status immediately after it actively triggers preemption (a suspend function to be provided).
To minimize latency between preemption and debugger wave inspection, allow immediate preemption by setting the grace period to 0.
Note that setting the preepmtion grace period to 0 will result in an infinite grace period being set due to a CP FW bug so set it to 1 for now.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
0ab2d753 |
| 25-Mar-2022 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdkfd: prepare per-process debug enable and disable
The ROCm debugger will attach to a process to debug by PTRACE and will expect the KFD to prepare a process for the target PID, whether the ta
drm/amdkfd: prepare per-process debug enable and disable
The ROCm debugger will attach to a process to debug by PTRACE and will expect the KFD to prepare a process for the target PID, whether the target PID has opened the KFD device or not.
This patch is to explicity handle this requirement. Further HW mode setting and runtime coordination requirements will be handled in following patches.
In the case where the target process has not opened the KFD device, a new KFD process must be created for the target PID. The debugger as well as the target process for this case will have not acquired any VMs so handle process restoration to correctly account for this.
To coordinate with HSA runtime, the debugger must be aware of the target process' runtime enablement status and will copy the runtime status information into the debugged KFD process for later query.
On enablement, the debugger will subscribe to a set of exceptions where each exception events will notify the debugger through a pollable FIFO file descriptor that the debugger provides to the KFD to manage.
Finally on process termination of either the debugger or the target, debugging must be disabled if it has not been done so.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
315e29ec |
| 20-Mar-2023 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Move local_mem_info to kfd_node
We need to track memory usage on a per partition basis. To do that, store the local memory information in KFD node instead of kfd device.
v2: squash in f
drm/amdkfd: Move local_mem_info to kfd_node
We need to track memory usage on a per partition basis. To do that, store the local memory information in KFD node instead of kfd device.
v2: squash in fix ("amdkfd: Use mem_id to access mem_partition info")
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|