Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4 |
|
#
cd378c37 |
| 01-Dec-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Use internal class when counting engine resets
[ Upstream commit 1f721a93a528268fa97875cff515d1fcb69f4f44 ]
Commit 503579448db9 ("drm/i915/gsc: Mark internal GSC engine with reserved uabi
drm/i915: Use internal class when counting engine resets
[ Upstream commit 1f721a93a528268fa97875cff515d1fcb69f4f44 ]
Commit 503579448db9 ("drm/i915/gsc: Mark internal GSC engine with reserved uabi class") made the GSC0 engine not have a valid uabi class and so broke the engine reset counting, which in turn was made class based in cb823ed9915b ("drm/i915/gt: Use intel_gt as the primary object for handling resets").
Despite the title and commit text of the latter is not mentioning it (and has left the storage array incorrectly sized), tracking by class, despite it adding aliasing in hypthotetical multi-tile systems, is handy for virtual engines which for instance do not have a valid engine->id.
Therefore we keep that but just change it to use the internal class which is always valid. We also add a helper to increment the count, which aligns with the existing getter.
What was broken without this fix were out of bounds reads every time a reset would happen on the GSC0 engine, or during selftests when storing and cross-checking the counts in igt_live_test_begin and igt_live_test_end.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 503579448db9 ("drm/i915/gsc: Mark internal GSC engine with reserved uabi class") [tursulin: fixed Fixes tag] Reported-by: Alan Previn Teres Alexis <alan.previn.teres.alexis@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231201122109.729006-2-tvrtko.ursulin@linux.intel.com (cherry picked from commit cf9cb028ac56696ff879af1154c4b2f0b12701fd) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36 |
|
#
4ae7eb92 |
| 27-Jun-2023 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: separate display info printing from the rest
Add new function intel_display_device_info_print() and print the display device info there instead of intel_device_info_print(). This also fixe
drm/i915: separate display info printing from the rest
Add new function intel_display_device_info_print() and print the display device info there instead of intel_device_info_print(). This also fixes the display runtime info printing to use the actual runtime info instead of the static defaults.
Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/30d4f93c58839bc9312b43423cd43bc0ef655a35.1687878757.git.jani.nikula@intel.com
show more ...
|
Revision tags: v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25 |
|
#
6197cff3 |
| 18-Apr-2023 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915: Dump error capture to kernel log
This is useful for getting debug information out in certain situations, such as failing kernel selftests and CI runs that don't log error captures. It is e
drm/i915: Dump error capture to kernel log
This is useful for getting debug information out in certain situations, such as failing kernel selftests and CI runs that don't log error captures. It is especially useful for things like retrieving GuC logs as GuC operation can't be tracked by adding printk or ftrace entries.
v2: Add CONFIG_DRM_I915_DEBUG_GEM wrapper (review feedback by Rodrigo).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230418181744.3251240-2-John.C.Harrison@Intel.com
show more ...
|
Revision tags: v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17 |
|
#
c8a76df6 |
| 11-Mar-2023 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915: Include timeline seqno in error capture
The seqno value actually written out to memory is no longer in the regular HWSP. Instead, it is now in its own private timeline buffer. Thus, it is
drm/i915: Include timeline seqno in error capture
The seqno value actually written out to memory is no longer in the regular HWSP. Instead, it is now in its own private timeline buffer. Thus, it is no longer visible in an error capture. So, explicitly read the value and include that in the capture.
v2: %d -> %u (Alan)
Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230311063714.570389-4-John.C.Harrison@Intel.com
show more ...
|
Revision tags: v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9 |
|
#
583ebae7 |
| 26-Jan-2023 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915/guc: Rename GuC register state capture node to be more obvious
The GuC specific register state entry in the error capture object was just called 'capture'. Although the companion 'node' ent
drm/i915/guc: Rename GuC register state capture node to be more obvious
The GuC specific register state entry in the error capture object was just called 'capture'. Although the companion 'node' entry was called 'guc_capture_node'. Rename the base entry to be 'guc_capture' instead so that it is a) more consistent and b) more obvious what it is.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-9-John.C.Harrison@Intel.com
show more ...
|
Revision tags: v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58 |
|
#
c5de70f6 |
| 27-Jul-2022 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915/guc: Record CTB info in error logs
When debugging GuC communication issues, it is useful to have the CTB info available. So add the state and buffer contents to the error capture log.
Also
drm/i915/guc: Record CTB info in error logs
When debugging GuC communication issues, it is useful to have the CTB info available. So add the state and buffer contents to the error capture log.
Also, add a sub-structure for the GuC specific error capture info as it is now becoming numerous.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-5-John.C.Harrison@Intel.com
show more ...
|
#
368d179a |
| 27-Jul-2022 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915/guc: Add GuC <-> kernel time stamp translation information
It is useful to be able to match GuC events to kernel events when looking at the GuC log. That requires being able to convert GuC
drm/i915/guc: Add GuC <-> kernel time stamp translation information
It is useful to be able to match GuC events to kernel events when looking at the GuC log. That requires being able to convert GuC timestamps to kernel time. So, when dumping error captures and/or GuC logs, include a stamp in both time zones plus the clock frequency.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-4-John.C.Harrison@Intel.com
show more ...
|
Revision tags: v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45 |
|
#
b729cfee |
| 01-Jun-2022 |
Stuart Summers <stuart.summers@intel.com> |
drm/i915: Add extra registers to GPU error dump
Our internal teams have identified a few additional engine registers that are worth inspecting in error state dumps during development & debug. Let's
drm/i915: Add extra registers to GPU error dump
Our internal teams have identified a few additional engine registers that are worth inspecting in error state dumps during development & debug. Let's capture and print them as part of our error dump.
For simplicity we'll just dump these registers on gen11 and beyond. Most of these registers have existed since earlier platforms (e.g., gen6 or gen7) but were initially introduced only for a subset of the platforms' engines; gen11 seems to be where they became available on all engines.
Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220601210646.615946-1-matthew.d.roper@intel.com
show more ...
|
Revision tags: v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33 |
|
#
bb6287cb |
| 01-Apr-2022 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Track context current active time
Track context active (on hardware) status together with the start timestamp.
This will be used to provide better granularity of context runtime reporting
drm/i915: Track context current active time
Track context active (on hardware) status together with the start timestamp.
This will be used to provide better granularity of context runtime reporting in conjunction with already tracked pphwsp accumulated runtime.
The latter is only updated on context save so does not give us visibility to any currently executing work.
As part of the patch the existing runtime tracking data is moved under the new ce->stats member and updated under the seqlock. This provides the ability to atomically read out accumulated plus active runtime.
v2: * Rename and make __intel_context_get_active_time unlocked.
v3: * Use GRAPHICS_VER.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v1 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-6-tvrtko.ursulin@linux.intel.com
show more ...
|
#
5efde05f |
| 30-Mar-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915/dmc: abstract GPU error state dump
Only intel_dmc.c should be accessing dmc details directly.
Need to add an i915_error_printf() stub for CONFIG_DRM_I915_CAPTURE_ERROR=n.
v2: Add the stub
drm/i915/dmc: abstract GPU error state dump
Only intel_dmc.c should be accessing dmc details directly.
Need to add an i915_error_printf() stub for CONFIG_DRM_I915_CAPTURE_ERROR=n.
v2: Add the stub (kernel test robot <lkp@intel.com>)
Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> # v1 Link: https://patchwork.freedesktop.org/patch/msgid/20220330113417.220964-1-jani.nikula@intel.com
show more ...
|
Revision tags: v5.15.32, v5.15.31 |
|
#
a0f1f7b4 |
| 21-Mar-2022 |
Alan Previn <alan.previn.teres.alexis@intel.com> |
drm/i915/guc: Print the GuC error capture output register list.
Print the GuC captured error state register list (string names and values) when gpu_coredump_state printout is invoked via the i915 de
drm/i915/guc: Print the GuC error capture output register list.
Print the GuC captured error state register list (string names and values) when gpu_coredump_state printout is invoked via the i915 debugfs for flushing the gpu error-state that was captured prior.
Since GuC could have reported multiple engine register dumps in a single notification event, parse the captured data (appearing as a stream of structures) to identify each dump as a different 'engine-capture-group-output'.
Finally, for each 'engine-capture-group-output' that is found, verify if the engine register dump corresponds to the engine_coredump content that was previously populated by the i915_gpu_coredump function. That function would have copied the context's vma's including the bacth buffer during the G2H-context-reset notification that occurred earlier. Perform this verification check by comparing guc_id, lrca and engine- instance obtained from the 'engine-capture-group-output' vs a copy of that same info taken during i915_gpu_coredump. If they match, then print those vma's as well (such as the batch buffers).
NOTE: the output format was verified using the gem_exec_capture IGT test.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-14-alan.previn.teres.alexis@intel.com
show more ...
|
#
a6f0f9cf |
| 21-Mar-2022 |
Alan Previn <alan.previn.teres.alexis@intel.com> |
drm/i915/guc: Plumb GuC-capture into gpu_coredump
Add a flags parameter through all of the coredump creation functions. Add a bitmask flag to indicate if the top level gpu_coredump event is triggere
drm/i915/guc: Plumb GuC-capture into gpu_coredump
Add a flags parameter through all of the coredump creation functions. Add a bitmask flag to indicate if the top level gpu_coredump event is triggered in response to a GuC context reset notification.
Using that flag, ensure all coredump functions that read or print mmio-register values related to work submission or command-streamer engines are skipped and replaced with a calls guc-capture module equivalent functions to retrieve or print the register dump.
While here, split out display related register reading and printing into its own function that is called agnostic to whether GuC had triggered the reset.
For now, introduce an empty printing function that can filled in on a subsequent patch just to handle formatting.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-13-alan.previn.teres.alexis@intel.com
show more ...
|
Revision tags: v5.17, v5.15.30, v5.15.29, v5.15.28 |
|
#
239bbb2f |
| 11-Mar-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/gt: Remove GEN12_SFC_DONE_MAX from register defs header
We shouldn't really be keeping track of how many SFC_DONE registers our platforms can have, but rather how many SFC hardware units th
drm/i915/gt: Remove GEN12_SFC_DONE_MAX from register defs header
We shouldn't really be keeping track of how many SFC_DONE registers our platforms can have, but rather how many SFC hardware units there can be (each SFC unit will have one corresponding SFC_DONE register). So drop the stray GEN12_SFC_DONE_MAX definition we had in the register definition file and replace it with an I915_MAX_SFC that follows the pattern we use for other hardware units. Note that our hardware has a 2:1:1 ratio of VD:VE:SFC, and as far as we know that pattern should carry forward to future platforms, so we'll define it as #VCS/2.
Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220311062835.163744-1-matthew.d.roper@intel.com
show more ...
|
Revision tags: v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23 |
|
#
f9bf77df |
| 10-Feb-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: move i915_reset_count()/i915_reset_engine_count() out of i915_drv.h
It doesn't help much, as i915_drv.h includes i915_gpu_error.h, but it's a step in the right direction.
Cc: Tvrtko Ursul
drm/i915: move i915_reset_count()/i915_reset_engine_count() out of i915_drv.h
It doesn't help much, as i915_drv.h includes i915_gpu_error.h, but it's a step in the right direction.
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/7af2f698a320c1efb0563f56a432c6d122d40b94.1644507885.git.jani.nikula@intel.com
show more ...
|
Revision tags: v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2 |
|
#
e45b98ba |
| 08-Nov-2021 |
Thomas Hellström <thomas.hellstrom@linux.intel.com> |
drm/i915: Avoid allocating a page array for the gpu coredump
The gpu coredump typically takes place in a dma_fence signalling critical path, and hence can't use GFP_KERNEL allocations, as that means
drm/i915: Avoid allocating a page array for the gpu coredump
The gpu coredump typically takes place in a dma_fence signalling critical path, and hence can't use GFP_KERNEL allocations, as that means we might hit deadlocks under memory pressure. However changing to __GFP_KSWAPD_RECLAIM which will be done in an upcoming patch will instead mean a lower chance of the allocation succeeding. In particular large contigous allocations like the coredump page vector. Remove the page vector in favor of a linked list of single pages. Use the page lru list head as the list link, as the page owner is allowed to do that.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211108174547.979714-2-thomas.hellstrom@linux.intel.com
show more ...
|
Revision tags: v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39 |
|
#
e52e4a31 |
| 19-May-2021 |
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> |
gpu: drm: replace occurrences of invalid character
There are some places at drm that ended receiving a REPLACEMENT CHARACTER U+fffd ('�'), probably because of some bad charset conversion.
Fix them
gpu: drm: replace occurrences of invalid character
There are some places at drm that ended receiving a REPLACEMENT CHARACTER U+fffd ('�'), probably because of some bad charset conversion.
Fix them by using what it seems to be the proper character.
Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/e606930c73029f16673849c57acac061dd923866.1621412009.git.mchehab+huawei@kernel.org
show more ...
|
Revision tags: v5.4.119, v5.10.36, v5.10.35 |
|
#
e7c46e43 |
| 05-May-2021 |
Ville Syrjälä <ville.syrjala@linux.intel.com> |
drm/i915: Nuke display error state
I doubt anyone has used the display error state since CS flips went the way of the dodo. Just nuke it.
It might be semi interesting to have something like this fo
drm/i915: Nuke display error state
I doubt anyone has used the display error state since CS flips went the way of the dodo. Just nuke it.
It might be semi interesting to have something like this for FIFO underruns and the like, but as it stands this wouldn't provide a sufficient amount of information. So would need an extensive rewrite anyway.
The lockless power well handling is also racy, so this could just be contributing noise to test results if we end up accessing something with the relevant power well already disabled.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210505191140.14215-1-ville.syrjala@linux.intel.com Acked-by: Jani Nikula <jani.nikula@intel.com>
show more ...
|
Revision tags: v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10 |
|
#
bda30024 |
| 04-Nov-2020 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Improve record of hung engines in error state
Between events which trigger engine and GPU resets and capturing the error state we lose information on which engine triggered the reset. Impr
drm/i915: Improve record of hung engines in error state
Between events which trigger engine and GPU resets and capturing the error state we lose information on which engine triggered the reset. Improve this by passing in the hung engine mask down to error capture.
Result is that the list of engines in user visible "GPU HANG: ecode <gen>:<engines>:<ecode>, <process>" is now a list of hanging and not just active engines. Most importantly the displayed process is now the one which was actually hung.
v2: * Stub prototype. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201104134743.916027-1-tvrtko.ursulin@linux.intel.com
show more ...
|
Revision tags: v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51 |
|
#
792592e7 |
| 07-Jul-2020 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: Move the engine mask to intel_gt_info
Since the engines belong to the GT, move the runtime-updated list of available engines to the intel_gt struct. The original mask has been renamed to i
drm/i915: Move the engine mask to intel_gt_info
Since the engines belong to the GT, move the runtime-updated list of available engines to the intel_gt struct. The original mask has been renamed to indicate it contains the maximum engine list that can be found on a matching device.
In preparation for other info being moved to the gt in follow up patches (sseu), introduce an intel_gt_info structure to group all gt-related runtime info.
v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com
show more ...
|
Revision tags: v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40 |
|
#
f1e79c7e |
| 07-May-2020 |
Gustavo A. R. Silva <gustavoars@kernel.org> |
drm/i915: Replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variabl
drm/i915: Replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99:
struct foo { int stuff; struct boo array[]; };
By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by this change:
"Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1]
sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues.
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200507185408.GA14561@embeddedor
show more ...
|
Revision tags: v5.4.39, v5.4.38, v5.4.37, v5.4.36 |
|
#
9669a507 |
| 24-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Drop rq->ring->vma peeking from error capture
We only hold the active spinlock while dumping the error state, and this does not prevent another thread from retiring the request -- as it is
drm/i915: Drop rq->ring->vma peeking from error capture
We only hold the active spinlock while dumping the error state, and this does not prevent another thread from retiring the request -- as it is quite possible that despite us capturing the current state, the GPU has completed the request. As such, it is dangerous to dereference state below the request as it may already be freed, and the simplest way to avoid the danger is not include it in the error state.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1788 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424191410.27570-1-chris@chris-wilson.co.uk
show more ...
|
Revision tags: v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21 |
|
#
1883a0a4 |
| 16-Feb-2020 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Track hw reported context runtime
GPU saves accumulated context runtime (in CS timestamp units) in PPHWSP which will be useful for us in cases when we are not able to track context busynes
drm/i915: Track hw reported context runtime
GPU saves accumulated context runtime (in CS timestamp units) in PPHWSP which will be useful for us in cases when we are not able to track context busyness ourselves (like with GuC). Keep a copy of this in struct intel_context from where it can be easily read even if the context is not pinned.
v2: (Chris) * Do not store pphwsp address in intel_context. * Log CS wrap-around. * Simplify calculation by relying on integer wraparound. v3: * Include total/avg in traces and error state for debugging
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200216133620.394962-1-chris@chris-wilson.co.uk
show more ...
|
Revision tags: v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16 |
|
#
70a76a9b |
| 28-Jan-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Hook up CS_MASTER_ERROR_INTERRUPT
Now that we have offline error capture and can reset an engine from inside an atomic context while also preserving the GPU state for post-mortem analys
drm/i915/gt: Hook up CS_MASTER_ERROR_INTERRUPT
Now that we have offline error capture and can reset an engine from inside an atomic context while also preserving the GPU state for post-mortem analysis, it is time to handle error interrupts thrown by the command parser.
This provides a much, much faster mechanism for us to detect known problems than using heartbeats/hangchecks, and also provides a mechanism for when those are disabled. However, it is limited to problems the HW can detect in the CS and so not a complete solution for detecting lockups.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200128204318.4182039-2-chris@chris-wilson.co.uk
show more ...
|
Revision tags: v5.5, v5.4.15 |
|
#
7e36505d |
| 24-Jan-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Stub out i915_gpu_coredump_put
i915_gpu_coreddump_put is currently only defined if CONFIG_DRM_I915_CAPTURE_ERROR is enabled, provide a stub otherwise.
Reported-by: Mike Lothian <mike@fire
drm/i915: Stub out i915_gpu_coredump_put
i915_gpu_coreddump_put is currently only defined if CONFIG_DRM_I915_CAPTURE_ERROR is enabled, provide a stub otherwise.
Reported-by: Mike Lothian <mike@fireburn.co.uk> Fixes: 742379c0c400 ("drm/i915: Start chopping up the GPU error capture") Fixes: 748317386afb ("drm/i915/execlists: Offline error capture") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mike Lothian <mike@fireburn.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200124192255.541355-1-chris@chris-wilson.co.uk
show more ...
|
Revision tags: v5.4.14, v5.4.13 |
|
#
04062c58 |
| 17-Jan-2020 |
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> |
drm/i915: Fix i915_error_state_store error defination
Since commit 742379c0c4001 ("drm/i915: Start chopping up the GPU error capture"), function 'i915_error_state_store' was defined and used with on
drm/i915: Fix i915_error_state_store error defination
Since commit 742379c0c4001 ("drm/i915: Start chopping up the GPU error capture"), function 'i915_error_state_store' was defined and used with only one parameter.
But if no 'CONFIG_DRM_I915_CAPTURE_ERROR', this function was defined with two parameter.
This may lead compile error. This patch fix it.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200117073436.6507-1-zhangxiaoxu5@huawei.com
show more ...
|