#
4edb0a84 |
| 28-Apr-2024 |
Dan Carpenter <dan.carpenter@linaro.org> |
drm/amdgpu: Fix signedness bug in sdma_v4_0_process_trap_irq()
commit 6769a23697f17f9bf9365ca8ed62fe37e361a05a upstream.
The "instance" variable needs to be signed for the error handling to work.
drm/amdgpu: Fix signedness bug in sdma_v4_0_process_trap_irq()
commit 6769a23697f17f9bf9365ca8ed62fe37e361a05a upstream.
The "instance" variable needs to be signed for the error handling to work.
Fixes: 8b2faf1a4f3b ("drm/amdgpu: add error handle to avoid out-of-bounds") Reviewed-by: Bob Zhou <bob.zhou@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Siddh Raman Pant <siddh.raman.pant@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
5b0a3dc3 |
| 23-Apr-2024 |
Bob Zhou <bob.zhou@amd.com> |
drm/amdgpu: add error handle to avoid out-of-bounds
commit 8b2faf1a4f3b6c748c0da36cda865a226534d520 upstream.
if the sdma_v4_0_irq_id_to_seq return -EINVAL, the process should be stop to avoid out-
drm/amdgpu: add error handle to avoid out-of-bounds
commit 8b2faf1a4f3b6c748c0da36cda865a226534d520 upstream.
if the sdma_v4_0_irq_id_to_seq return -EINVAL, the process should be stop to avoid out-of-bounds read, so directly return -EINVAL.
Signed-off-by: Bob Zhou <bob.zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
e5df16d9 |
| 07-Jun-2023 |
Alex Deucher <alexander.deucher@amd.com> |
drm/amdgpu/sdma4: set align mask to 255
The wptr needs to be incremented at at least 64 dword intervals, use 256 to align with windows. This should fix potential hangs with unaligned updates.
Revi
drm/amdgpu/sdma4: set align mask to 255
The wptr needs to be incremented at at least 64 dword intervals, use 256 to align with windows. This should fix potential hangs with unaligned updates.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
61c31b8b |
| 12-May-2023 |
Guchun Chen <guchun.chen@amd.com> |
drm/amdgpu/sdma: set sched.ready status after ring/IB test in sdma
sched.ready is nothing with ring initialization, it needs to set to be true after ring/IB test in amdgpu_ring_test_helper to tell t
drm/amdgpu/sdma: set sched.ready status after ring/IB test in sdma
sched.ready is nothing with ring initialization, it needs to set to be true after ring/IB test in amdgpu_ring_test_helper to tell the ring is ready for submission.
Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
0ee20b86 |
| 28-Feb-2022 |
Le Ma <le.ma@amd.com> |
drm/amdgpu: assign the doorbell index in 1st page to sdma page queue
Previously for vega10, the sdma_doorbell_range is only enough for sdma gfx queue, thus the index on second doorbell page is alloc
drm/amdgpu: assign the doorbell index in 1st page to sdma page queue
Previously for vega10, the sdma_doorbell_range is only enough for sdma gfx queue, thus the index on second doorbell page is allocated for sdma page queue. From vega20, the sdma_doorbell_range on 1st page is enlarged. Therefore, just leverage these index instead of allocation on 2nd page.
v2: change "(x << 1) + 2" to "(x + 1) << 1" for readability and add comments.
Signed-off-by: Le Ma <le.ma@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
f4caf584 |
| 14-Sep-2022 |
Hawking Zhang <Hawking.Zhang@amd.com> |
drm/amdgpu: introduce vmhub definition for multi-partition cases (v3)
v1: Each partition has its own gfxhub or mmhub. adjust the num of MAX_VMHUBS and the GFXHUB/MMHUB layout (Le)
v2: re-design the
drm/amdgpu: introduce vmhub definition for multi-partition cases (v3)
v1: Each partition has its own gfxhub or mmhub. adjust the num of MAX_VMHUBS and the GFXHUB/MMHUB layout (Le)
v2: re-design the AMDGPU_GFXHUB/AMDGPU_MMHUB layout (Le)
v3: apply the gfxhub/mmhub layout to new IPs (Hawking)
v4: fix up gmc11 (Alex)
v5: rebase (Alex)
Signed-off-by: Le Ma <le.ma@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
eb4f0178 |
| 06-May-2023 |
Guchun Chen <guchun.chen@amd.com> |
drm/amdgpu: disable sdma ecc irq only when sdma RAS is enabled in suspend
sdma_v4_0_ip is shared on a few asics, but in sdma_v4_0_hw_fini, driver unconditionally disables ecc_irq which is only enabl
drm/amdgpu: disable sdma ecc irq only when sdma RAS is enabled in suspend
sdma_v4_0_ip is shared on a few asics, but in sdma_v4_0_hw_fini, driver unconditionally disables ecc_irq which is only enabled on those asics enabling sdma ecc. This will introduce a warning in suspend cycle on those chips with sdma ip v4.0, while without sdma ecc. So this patch correct this.
[ 7283.166354] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu] [ 7283.167001] RSP: 0018:ffff9a5fc3967d08 EFLAGS: 00010246 [ 7283.167019] RAX: ffff98d88afd3770 RBX: 0000000000000001 RCX: 0000000000000000 [ 7283.167023] RDX: 0000000000000000 RSI: ffff98d89da30390 RDI: ffff98d89da20000 [ 7283.167025] RBP: ffff98d89da20000 R08: 0000000000036838 R09: 0000000000000006 [ 7283.167028] R10: ffffd5764243c008 R11: 0000000000000000 R12: ffff98d89da30390 [ 7283.167030] R13: ffff98d89da38978 R14: ffffffff999ae15a R15: ffff98d880130105 [ 7283.167032] FS: 0000000000000000(0000) GS:ffff98d996f00000(0000) knlGS:0000000000000000 [ 7283.167036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7283.167039] CR2: 00000000f7a9d178 CR3: 00000001c42ea000 CR4: 00000000003506e0 [ 7283.167041] Call Trace: [ 7283.167046] <TASK> [ 7283.167048] sdma_v4_0_hw_fini+0x38/0xa0 [amdgpu] [ 7283.167704] amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu] [ 7283.168296] amdgpu_device_suspend+0x103/0x180 [amdgpu] [ 7283.168875] amdgpu_pmops_freeze+0x21/0x60 [amdgpu] [ 7283.169464] pci_pm_freeze+0x54/0xc0
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
0530553b |
| 19-May-2022 |
Le Ma <le.ma@amd.com> |
drm/amdgpu: move vmhub out of amdgpu_ring_funcs (v4)
It looks better to place this field in ring structure. Also drop the repeated ring funcs definitions if there's no difference except for vmhub fi
drm/amdgpu: move vmhub out of amdgpu_ring_funcs (v4)
It looks better to place this field in ring structure. Also drop the repeated ring funcs definitions if there's no difference except for vmhub field.
v2: rename the field to vm_hub like others (Le) v3: apply the changes to new ip blocks (Hawking) v4: fix vcn sw ring (Alex)
Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
5e08e9c7 |
| 06-Apr-2023 |
lyndonli <Lyndon.Li@amd.com> |
drm/amdgpu: Fix sdma v4 sw fini error
Fix sdma v4 sw fini error for sdma 4.2.2 to solve the following general protection fault
[ +0.108196] general protection fault, probably for non-canonical add
drm/amdgpu: Fix sdma v4 sw fini error
Fix sdma v4 sw fini error for sdma 4.2.2 to solve the following general protection fault
[ +0.108196] general protection fault, probably for non-canonical address 0xd5e5a4ae79d24a32: 0000 [#1] PREEMPT SMP PTI [ +0.000018] RIP: 0010:free_fw_priv+0xd/0x70 [ +0.000022] Call Trace: [ +0.000012] <TASK> [ +0.000011] release_firmware+0x55/0x80 [ +0.000021] amdgpu_ucode_release+0x11/0x20 [amdgpu] [ +0.000415] amdgpu_sdma_destroy_inst_ctx+0x4f/0x90 [amdgpu] [ +0.000360] sdma_v4_0_sw_fini+0xce/0x110 [amdgpu]
Signed-off-by: lyndonli <Lyndon.Li@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
2cfb737b |
| 17-Jan-2023 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: Optimize sdma ras block initialization code for sdma v4_0
Optimize sdma ras block initialization code for sdma v4_0.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zh
drm/amdgpu: Optimize sdma ras block initialization code for sdma v4_0
Optimize sdma ras block initialization code for sdma v4_0.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
1336b4e7 |
| 28-Dec-2022 |
Mario Limonciello <mario.limonciello@amd.com> |
drm/amd: Convert SDMA to use `amdgpu_ucode_ip_version_decode`
Simplifies the code so that all SDMA versions will get the firmware name from `amdgpu_ucode_ip_version_decode`.
v2: squash in fix from
drm/amd: Convert SDMA to use `amdgpu_ucode_ip_version_decode`
Simplifies the code so that all SDMA versions will get the firmware name from `amdgpu_ucode_ip_version_decode`.
v2: squash in fix from Srinivasan
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
b35a2a12 |
| 30-Nov-2022 |
Prike Liang <Prike.Liang@amd.com> |
drm/amdgpu/sdma_v4_0: turn off SDMA ring buffer in the s2idle suspend
In the SDMA s0ix save process requires to turn off SDMA ring buffer for avoiding the SDMA in-flight request, otherwise will suff
drm/amdgpu/sdma_v4_0: turn off SDMA ring buffer in the s2idle suspend
In the SDMA s0ix save process requires to turn off SDMA ring buffer for avoiding the SDMA in-flight request, otherwise will suffer from SDMA page fault which causes by page request from in-flight SDMA ring accessing at SDMA restore phase.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2248 Cc: stable@vger.kernel.org # 6.0,5.15+ Fixes: f8f4e2a51834 ("drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix.") Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
50b0e4d4 |
| 19-Oct-2022 |
Alex Deucher <alexander.deucher@amd.com> |
drm/amdgpu: fix sdma doorbell init ordering on APUs
Commit 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") uncovered a bug in amdgpu that required a reorder
drm/amdgpu: fix sdma doorbell init ordering on APUs
Commit 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") uncovered a bug in amdgpu that required a reordering of the driver init sequence to avoid accessing a special register on the GPU before it was properly set up leading to an PCI AER error. This reordering uncovered a different hw programming ordering dependency in some APUs where the SDMA doorbells need to be programmed before the GFX doorbells. To fix this, move the SDMA doorbell programming back into the soc15 common code, but use the actual doorbell range values directly rather than the values stored in the ring structure since those will not be initialized at this point.
This is a partial revert, but with the doorbell assignment fixed so the proper doorbell index is set before it's used.
Fixes: e3163bc8ffdfdb ("drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: skhan@linuxfoundation.org Cc: stable@vger.kernel.org
show more ...
|
#
a98cec22 |
| 06-Oct-2022 |
Alex Deucher <alexander.deucher@amd.com> |
drm/amdgpu: fix SDMA suspend/resume on SR-IOV
Update all SDMA versions that support SR-IOV to properly tear down the ttm buffer functions on suspend.
Tested-by: Bokun Zhang <Bokun.Zhang@amd.com> Si
drm/amdgpu: fix SDMA suspend/resume on SR-IOV
Update all SDMA versions that support SR-IOV to properly tear down the ttm buffer functions on suspend.
Tested-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
571c0536 |
| 06-Oct-2022 |
Alex Deucher <alexander.deucher@amd.com> |
drm/amdgpu: switch sdma buffer function tear down to a helper
Switch all of the SDMA implementations to use the helper to tear down the ttm buffer manager.
Tested-by: Bokun Zhang <Bokun.Zhang@amd.c
drm/amdgpu: switch sdma buffer function tear down to a helper
Switch all of the SDMA implementations to use the helper to tear down the ttm buffer manager.
Tested-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
baf28cc1 |
| 28-Sep-2022 |
Likun Gao <Likun.Gao@amd.com> |
drm/amdgpu: fix sdma v4 init microcode error
Fix init SDMA microcode error for sdma v4, which caused by mistake when rearch sdma init microcode function (coding 4.2.2 to 4.2.0).
Signed-off-by: Liku
drm/amdgpu: fix sdma v4 init microcode error
Fix init SDMA microcode error for sdma v4, which caused by mistake when rearch sdma init microcode function (coding 4.2.2 to 4.2.0).
Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
a2d3b4b8 |
| 22-Sep-2022 |
Likun Gao <Likun.Gao@amd.com> |
drm/amdgpu/sdma4: use common function to init sdma fw
Use common function to init sdma v4 firmware ucode.
Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.
drm/amdgpu/sdma4: use common function to init sdma fw
Use common function to init sdma v4 firmware ucode.
Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
59c43748 |
| 09-Sep-2022 |
Alex Deucher <alexander.deucher@amd.com> |
drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega
This mirrors what we do for other asics and this way we are sure the sdma doorbell range is properly initialized.
There is a comm
drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega
This mirrors what we do for other asics and this way we are sure the sdma doorbell range is properly initialized.
There is a comment about the way doorbells on gfx9 work that requires that they are initialized for other IPs before GFX is initialized. However, the statement says that it applies to multimedia as well, but the VCN code currently initializes doorbells after GFX and there are no known issues there. In my testing at least I don't see any problems on SDMA.
This is a prerequisite for fixing the Unsupported Request error reported through AER during driver load.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together.
Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
91a95887 |
| 30-Aug-2022 |
ye xingchen <ye.xingchen@zte.com.cn> |
drm/amdgpu: Remove the unneeded result variable 'r'
Return the value sdma_v4_0_start() directly instead of storing it in another redundant variable.
Reported-by: Zeal Robot <zealci@zte.com.cn> Sign
drm/amdgpu: Remove the unneeded result variable 'r'
Return the value sdma_v4_0_start() directly instead of storing it in another redundant variable.
Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
3748424b |
| 19-Mar-2020 |
Jack Xiao <Jack.Xiao@amd.com> |
drm/amdgpu: use ring structure to access rptr/wptr v2
Use ring structure to access the cpu/gpu address of rptr/wptr.
v2: merge gfx10/sdma5/sdma5.2 patches
Signed-off-by: Jack Xiao <Jack.Xiao@amd.c
drm/amdgpu: use ring structure to access rptr/wptr v2
Use ring structure to access the cpu/gpu address of rptr/wptr.
v2: merge gfx10/sdma5/sdma5.2 patches
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
7dba6e83 |
| 25-Apr-2022 |
Haohui Mai <ricetons@gmail.com> |
drm/amdgpu/sdma: Fix incorrect calculations of the wptr of the doorbells
This patch fixes the issue where the driver miscomputes the 64-bit values of the wptr of the SDMA doorbell when initializing
drm/amdgpu/sdma: Fix incorrect calculations of the wptr of the doorbells
This patch fixes the issue where the driver miscomputes the 64-bit values of the wptr of the SDMA doorbell when initializing the hardware. SDMA engines v4 and later on have full 64-bit registers for wptr thus they should be set properly.
Older generation hardwares like CIK / SI have only 16 / 20 / 24bits for the WPTR, where the calls of lower_32_bits() will be removed in a following patch.
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Haohui Mai <ricetons@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
25faeddc |
| 25-Mar-2022 |
Evan Quan <evan.quan@amd.com> |
drm/amdgpu: expand cg_flags from u32 to u64
With this, we can support more CG flags.
Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Haw
drm/amdgpu: expand cg_flags from u32 to u64
With this, we can support more CG flags.
Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
8c0f11ff |
| 15-Mar-2022 |
Lang Yu <Lang.Yu@amd.com> |
drm/amdgpu: only allow secure submission on rings which support that
Only GFX ring, SDMA ring and VCN decode ring support secure submission at the moment.
Suggested-by: Christian König <christian.k
drm/amdgpu: only allow secure submission on rings which support that
Only GFX ring, SDMA ring and VCN decode ring support secure submission at the moment.
Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
80e0c2cb |
| 17-Feb-2022 |
yipechai <YiPeng.Chai@amd.com> |
drm/amdgpu: Remove redundant .ras_fini initialization in some ras blocks
1. Define amdgpu_ras_block_late_fini_default in amdgpu_ras.c as .ras_fini common function, which is called when .ras_fi
drm/amdgpu: Remove redundant .ras_fini initialization in some ras blocks
1. Define amdgpu_ras_block_late_fini_default in amdgpu_ras.c as .ras_fini common function, which is called when .ras_fini of ras block isn't initialized. 2. Remove the code of using amdgpu_ras_block_late_fini to initialize .ras_fini in ras blocks.
Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
149d7ba1 |
| 14-Feb-2022 |
yipechai <YiPeng.Chai@amd.com> |
drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in sdma ras block
Remove redundant calls of amdgpu_ras_block_late_fini in sdma ras block.
Signed-off-by: yipechai <YiPeng.Chai@amd.c
drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in sdma ras block
Remove redundant calls of amdgpu_ras_block_late_fini in sdma ras block.
Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|