Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27 |
|
#
5aa857db |
| 27-Apr-2023 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
i915/pmu: Add support for total context runtime for GuC back-end
GPU accumulates the context runtime in a 32 bit counter - CTX_TIMESTAMP in the context image. This value is saved/restored on context
i915/pmu: Add support for total context runtime for GuC back-end
GPU accumulates the context runtime in a 32 bit counter - CTX_TIMESTAMP in the context image. This value is saved/restored on context switches. KMD accumulates these values into a 64 bit counter taking care of any overflows as needed. This count provides the basis for client specific busyness in the fdinfo interface.
KMD accumulation happens just before the context is unpinned and when context switches out. This works for execlist back-end since execlist scheduling has visibility into context switches. With GuC mode, KMD does not have visibility into context switches and this counter is accumulated only when context is unpinned. Context is unpinned once the context scheduling is successfully disabled. Disabling context scheduling is an asynchronous operation. Also if a context is servicing frequent requests, scheduling may never be disabled on it.
For GuC mode, since updates to the context runtime may be delayed, add hooks to update the context runtime in a worker thread as well as when a user queries for it.
Limitation: - If a context is never switched out or runs for a long period of time, the runtime value of CTX_TIMESTAMP may never be updated, so the counter value may be unreliable. This patch does not support such cases. Such support must be available from the GuC FW and it is WIP.
This patch is an extract from previous work authored by John/Umesh here - https://patchwork.freedesktop.org/patch/496441/?series=105085&rev=4
v2: (Ashutosh) - Drop COPS_RUNTIME_ACTIVE_TOTAL - s/guc_context_update_clks/__guc_context_update_stats - Pin context before accessing in guc_timestamp_ping - In guc_context_unpin, use spinlock to serialize access to runtime stats
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Co-developed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230427224705.2785566-2-umesh.nerlige.ramappa@intel.com
show more ...
|
Revision tags: v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9 |
|
#
86d8ddc7 |
| 26-Jan-2023 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915: Fix request ref counting during error capture & debugfs dump
When GuC support was added to error capture, the reference counting around the request object was broken. Fix it up.
The conte
drm/i915: Fix request ref counting during error capture & debugfs dump
When GuC support was added to error capture, the reference counting around the request object was broken. Fix it up.
The context based search manages the spinlocking around the search internally. So it needs to grab the reference count internally as well. The execlist only request based search relies on external locking, so it needs an external reference count but within the spinlock not outside it.
The only other caller of the context based search is the code for dumping engine state to debugfs. That code wasn't previously getting an explicit reference at all as it does everything while holding the execlist specific spinlock. So, that needs updaing as well as that spinlock doesn't help when using GuC submission. Rather than trying to conditionally get/put depending on submission model, just change it to always do the get/put.
v2: Explicitly document adding an extra blank line in some dense code (Andy Shevchenko). Fix multiple potential null pointer derefs in case of no request found (some spotted by Tvrtko, but there was more!). Also fix a leaked request in case of !started and another in __guc_reset_context now that intel_context_find_active_request is actually reference counting the returned request. v3: Add a _get suffix to intel_context_find_active_request now that it grabs a reference (Daniele). v4: Split the intel_guc_find_hung_context change to a separate patch and rename intel_context_find_active_request_get to intel_context_get_active_request (Tvrtko). v5: s/locking/reference counting/ in commit message (Tvrtko)
Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Fixes: 573ba126aef3 ("drm/i915/guc: Capture error state on context reset") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com (cherry picked from commit 3700e353781e27f1bc7222f51f2cc36cbeb9b4ec) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
show more ...
|
#
3700e353 |
| 26-Jan-2023 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915: Fix request ref counting during error capture & debugfs dump
When GuC support was added to error capture, the reference counting around the request object was broken. Fix it up.
The conte
drm/i915: Fix request ref counting during error capture & debugfs dump
When GuC support was added to error capture, the reference counting around the request object was broken. Fix it up.
The context based search manages the spinlocking around the search internally. So it needs to grab the reference count internally as well. The execlist only request based search relies on external locking, so it needs an external reference count but within the spinlock not outside it.
The only other caller of the context based search is the code for dumping engine state to debugfs. That code wasn't previously getting an explicit reference at all as it does everything while holding the execlist specific spinlock. So, that needs updaing as well as that spinlock doesn't help when using GuC submission. Rather than trying to conditionally get/put depending on submission model, just change it to always do the get/put.
v2: Explicitly document adding an extra blank line in some dense code (Andy Shevchenko). Fix multiple potential null pointer derefs in case of no request found (some spotted by Tvrtko, but there was more!). Also fix a leaked request in case of !started and another in __guc_reset_context now that intel_context_find_active_request is actually reference counting the returned request. v3: Add a _get suffix to intel_context_find_active_request now that it grabs a reference (Daniele). v4: Split the intel_guc_find_hung_context change to a separate patch and rename intel_context_find_active_request_get to intel_context_get_active_request (Tvrtko). v5: s/locking/reference counting/ in commit message (Tvrtko)
Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Fixes: 573ba126aef3 ("drm/i915/guc: Capture error state on context reset") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com
show more ...
|
Revision tags: v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72 |
|
#
70234728 |
| 03-Oct-2022 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915/guc: Fix revocation of non-persistent contexts
Patch which added graceful exit for non-persistent contexts missed the fact it is not enough to set the exiting flag on a context and let the
drm/i915/guc: Fix revocation of non-persistent contexts
Patch which added graceful exit for non-persistent contexts missed the fact it is not enough to set the exiting flag on a context and let the backend handle it from there.
GuC backend cannot handle it because it runs independently in the firmware and driver might not see the requests ever again. Patch also missed the fact some usages of intel_context_is_banned in the GuC backend needed replacing with newly introduced intel_context_is_schedulable.
Fix the first issue by calling into backend revoke when we know this is the last chance to do it. Fix the second issue by replacing intel_context_is_banned with intel_context_is_schedulable, which should always be safe since latter is a superset of the former.
v2: * Just call ce->ops->revoke unconditionally. (Andrzej)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 45c64ecf97ee ("drm/i915: Improve user experience and driver robustness under SIGINT or similar") Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: <stable@vger.kernel.org> # v6.0+ Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221003121630.694249-1-tvrtko.ursulin@linux.intel.com (cherry picked from commit 0add082cebac8555ee3972ba768ae5c01db7a498) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
show more ...
|
#
0add082c |
| 03-Oct-2022 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915/guc: Fix revocation of non-persistent contexts
Patch which added graceful exit for non-persistent contexts missed the fact it is not enough to set the exiting flag on a context and let the
drm/i915/guc: Fix revocation of non-persistent contexts
Patch which added graceful exit for non-persistent contexts missed the fact it is not enough to set the exiting flag on a context and let the backend handle it from there.
GuC backend cannot handle it because it runs independently in the firmware and driver might not see the requests ever again. Patch also missed the fact some usages of intel_context_is_banned in the GuC backend needed replacing with newly introduced intel_context_is_schedulable.
Fix the first issue by calling into backend revoke when we know this is the last chance to do it. Fix the second issue by replacing intel_context_is_banned with intel_context_is_schedulable, which should always be safe since latter is a superset of the former.
v2: * Just call ce->ops->revoke unconditionally. (Andrzej)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 45c64ecf97ee ("drm/i915: Improve user experience and driver robustness under SIGINT or similar") Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: <stable@vger.kernel.org> # v6.0+ Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221003121630.694249-1-tvrtko.ursulin@linux.intel.com
show more ...
|
Revision tags: v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44 |
|
#
45c64ecf |
| 27-May-2022 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Improve user experience and driver robustness under SIGINT or similar
We have long standing customer complaints that pressing Ctrl-C (or to the effect of) causes engine resets with otherwi
drm/i915: Improve user experience and driver robustness under SIGINT or similar
We have long standing customer complaints that pressing Ctrl-C (or to the effect of) causes engine resets with otherwise well behaving programs.
Not only is logging engine resets during normal operation not desirable since it creates support incidents, but more fundamentally we should avoid going the engine reset path when we can since any engine reset introduces a chance of harming an innocent context.
Reason for this undesirable behaviour is that the driver currently does not distinguish between banned contexts and non-persistent contexts which have been closed.
To fix this we add the distinction between the two reasons for revoking contexts, which then allows the strict timeout only be applied to banned, while innocent contexts (well behaving) can preempt cleanly and exit without triggering the engine reset path.
Note that the added context exiting category applies both to closed non- persistent context, and any exiting context when hangcheck has been disabled by the user.
At the same time we rename the backend operation from 'ban' to 'revoke' which more accurately describes the actual semantics. (There is no ban at the backend level since banning is a concept driven by the scheduling frontend. Backends are simply able to revoke a running context so that is the more appropriate name chosen.)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220527072452.2225610-1-tvrtko.ursulin@linux.intel.com
show more ...
|
Revision tags: v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33 |
|
#
bb6287cb |
| 01-Apr-2022 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Track context current active time
Track context active (on hardware) status together with the start timestamp.
This will be used to provide better granularity of context runtime reporting
drm/i915: Track context current active time
Track context active (on hardware) status together with the start timestamp.
This will be used to provide better granularity of context runtime reporting in conjunction with already tracked pphwsp accumulated runtime.
The latter is only updated on context save so does not give us visibility to any currently executing work.
As part of the patch the existing runtime tracking data is moved under the new ce->stats member and updated under the seqlock. This provides the ability to atomically read out accumulated plus active runtime.
v2: * Rename and make __intel_context_get_active_time unlocked.
v3: * Use GRAPHICS_VER.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v1 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-6-tvrtko.ursulin@linux.intel.com
show more ...
|
Revision tags: v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26 |
|
#
d1249022 |
| 01-Mar-2022 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915/guc: Better name for context id limit
The LRC descriptor pool is going away. So, stop using it as the limit for how many context ids are available. Instead, size the pool according to the n
drm/i915/guc: Better name for context id limit
The LRC descriptor pool is going away. So, stop using it as the limit for how many context ids are available. Instead, size the pool according to the number of contexts allowed. Note that this is just a naming change, the actual limit is identical in value.
While at it, also update a kzalloc(sizeof()*count) to be a kcalloc(count,size).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220302003357.4188363-4-John.C.Harrison@Intel.com
show more ...
|
Revision tags: v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16 |
|
#
a88afcfa |
| 22-Dec-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/execlists: Weak parallel submission support for execlists
A weak implementation of parallel submission (multi-bb execbuf IOCTL) for execlists. Doing as little as possible to support this in
drm/i915/execlists: Weak parallel submission support for execlists
A weak implementation of parallel submission (multi-bb execbuf IOCTL) for execlists. Doing as little as possible to support this interface for execlists - basically just passing submit fences between each request generated and virtual engines are not allowed. This is on par with what is there for the existing (hopefully soon deprecated) bonding interface.
We perma-pin these execlists contexts to align with GuC implementation.
v2: (John Harrison) - Drop siblings array as num_siblings must be 1 v3: (John Harrison) - Drop single submission v4: (John Harrison) - Actually drop single submission - Use IS_ERR check on return value from intel_context_create - Set last request to NULL on unpin
Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211222223532.28698-1-matthew.brost@intel.com
show more ...
|
Revision tags: v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3 |
|
#
44505168 |
| 16-Nov-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Drop stealing of bits from i915_sw_fence function pointer
Rather than stealing bits from i915_sw_fence function pointer use separate fields for function pointer and flags. If using two dif
drm/i915: Drop stealing of bits from i915_sw_fence function pointer
Rather than stealing bits from i915_sw_fence function pointer use separate fields for function pointer and flags. If using two different fields, the 4 byte alignment for the i915_sw_fence function pointer can also be dropped.
v2: (CI) - Set new function field rather than flags in __i915_sw_fence_init v3: (Tvrtko) - Remove BUG_ON(!fence->flags) in reinit as that will now blow up - Only define fence->flags if CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is defined v4: - Rebase, resend for CI
Signed-off-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211116194929.10211-1-matthew.brost@intel.com
show more ...
|
#
e6e1a304 |
| 17-Nov-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: vma is always backed by an object.
vma->obj and vma->resv are now never NULL, and some checks can be removed.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed
drm/i915: vma is always backed by an object.
vma->obj and vma->resv are now never NULL, and some checks can be removed.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211117142024.1043017-4-matthew.auld@intel.com
show more ...
|
#
10ceccb8 |
| 17-Nov-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: move the pre_pin earlier
In intel_context_do_pin_ww, when calling into the pre_pin hook(which is passed the ww context) it could in theory return -EDEADLK(which is very likely with debug k
drm/i915: move the pre_pin earlier
In intel_context_do_pin_ww, when calling into the pre_pin hook(which is passed the ww context) it could in theory return -EDEADLK(which is very likely with debug kernels), once we start adding more ww locking in there, like in the next patch. If so then we need to be mindful of having to restart the do_pin at this point.
If this is the kernel_context, or some other early in-kernel context where we have yet to setup the default_state, then we always inhibit the context restore, and instead rely on the delayed active_release to set the CONTEXT_VALID_BIT for us(if we even care), which should indicate that we have context switched away, and that our newly saved context state should now be valid. However, since we currently grab the active reference before the potential ww dance, we can end up setting the CONTEXT_VALID_BIT much too early, if we need to backoff, and then upon re-trying the do_pin, we could potentially cause the hardware to incorrectly load some garbage context state when later context switching to that context, but at the very least this will trigger the GEM_BUG_ON() in __engine_unpark. For now let's just move any ww dance stuff prior to arming the active reference.
For normal user contexts this shouldn't be a concern, since we should already have the default_state ready when initialising the lrc state, and so there should be no concern with active_release somehow prematurely setting the CONTEXT_VALID_BIT.
v2(Thomas): - Also re-order the onion unwind
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211117142024.1043017-1-matthew.auld@intel.com
show more ...
|
Revision tags: v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13 |
|
#
5851387a |
| 14-Oct-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Implement no mid batch preemption for multi-lrc
For some users of multi-lrc, e.g. split frame, it isn't safe to preempt mid BB. To safely enable preemption at the BB boundary, a handsh
drm/i915/guc: Implement no mid batch preemption for multi-lrc
For some users of multi-lrc, e.g. split frame, it isn't safe to preempt mid BB. To safely enable preemption at the BB boundary, a handshake between parent and child is needed, syncing the set of BBs at the beginning and end of each batch. This is implemented via custom emit_bb_start & emit_fini_breadcrumb functions and enabled by default if a context is configured by set parallel extension.
Lastly, this patch updates the process descriptor to the correct size as the memory used in the handshake is directly after the process descriptor.
v2: (John Harrison) - Fix a few comments wording - Add struture for parent page layout v3: (John Harrison) - A structure for sync semaphore - Use offsetof to calc address - Update commit message v4: (John Harrison) - Fix typos in comment explaining memory map of scratch page
Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-20-matthew.brost@intel.com
show more ...
|
#
872758db |
| 14-Oct-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Implement multi-lrc reset
Update context and full GPU reset to work with multi-lrc. The idea is parent context tracks all the active requests inflight for itself and its children. The
drm/i915/guc: Implement multi-lrc reset
Update context and full GPU reset to work with multi-lrc. The idea is parent context tracks all the active requests inflight for itself and its children. The parent context owns the reset replaying / canceling requests as needed.
v2: (John Harrison) - Simply loop in find active request - Add comments to find ative request / reset loop v3: (John Harrison) - s/its'/its/g - Fix comment when searching for active request - Reorder if state in __guc_reset_context v4: (Kernel test robot) - Delete unused is_multi_lrc function
Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-15-matthew.brost@intel.com
show more ...
|
#
3897df4c |
| 14-Oct-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Introduce context parent-child relationship
Introduce context parent-child relationship. Once this relationship is created all pinning / unpinning operations are directed to the parent
drm/i915/guc: Introduce context parent-child relationship
Introduce context parent-child relationship. Once this relationship is created all pinning / unpinning operations are directed to the parent context. The parent context is responsible for pinning all of its children and itself.
This is a precursor to the full GuC multi-lrc implementation but aligns to how GuC mutli-lrc interface is defined - a single H2G is used register / deregister all of the contexts simultaneously.
Subsequent patches in the series will implement the pinning / unpinning operations for parent / child contexts.
v2: (Daniel Vetter) - Add kernel doc, add wrapper to access parent to ensure safety v3: (John Harrison) - Fix comment explaing GEM_BUG_ON in to_parent() - Make variable names generic (non-GuC specific) v4: (John Harrison) - s/its'/its/g
Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-8-matthew.brost@intel.com
show more ...
|
#
f61eae18 |
| 14-Oct-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Take engine PM when a context is pinned with GuC submission
Taking a PM reference to prevent intel_gt_wait_for_idle from short circuiting while any user context has scheduling enabled.
drm/i915/guc: Take engine PM when a context is pinned with GuC submission
Taking a PM reference to prevent intel_gt_wait_for_idle from short circuiting while any user context has scheduling enabled. Returning GT idle when it is not can cause all sorts of issues throughout the stack.
v2: (Daniel Vetter) - Add might_lock annotations to pin / unpin function v3: (CI) - Drop intel_engine_pm_might_put from unpin path as an async put is used v4: (John Harrison) - Make intel_engine_pm_might_get/put work with GuC virtual engines - Update commit message v5: - Update commit message again
Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-4-matthew.brost@intel.com
show more ...
|
#
1a52faed |
| 14-Oct-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Take GT PM ref when deregistering context
Taking a PM reference to prevent intel_gt_wait_for_idle from short circuiting while a deregister context H2G is in flight. To do this must iss
drm/i915/guc: Take GT PM ref when deregistering context
Taking a PM reference to prevent intel_gt_wait_for_idle from short circuiting while a deregister context H2G is in flight. To do this must issue the deregister H2G from a worker as context can be destroyed from an atomic context and taking GT PM ref blows up. Previously we took a runtime PM from this atomic context which worked but will stop working once runtime pm autosuspend in enabled.
So this patch is two fold, stop intel_gt_wait_for_idle from short circuting and fix runtime pm autosuspend.
v2: (John Harrison) - Split structure changes out in different patch (Tvrtko) - Don't drop lock in deregister_destroyed_contexts v3: (John Harrison) - Flush destroyed contexts before destroying context reg pool
Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-3-matthew.brost@intel.com
show more ...
|
Revision tags: v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7 |
|
#
b0179f0d |
| 21-Sep-2021 |
Hugh Dickins <hughd@google.com> |
drm/i915: fix blank screen booting crashes
5.15-rc1 crashes with blank screen when booting up on two ThinkPads using i915. Bisections converge convincingly, but arrive at different and surprising "
drm/i915: fix blank screen booting crashes
5.15-rc1 crashes with blank screen when booting up on two ThinkPads using i915. Bisections converge convincingly, but arrive at different and surprising "culprits", none of them the actual culprit.
netconsole (with init_netconsole() hacked to call i915_init() when logging has started, instead of by module_init()) tells the story:
kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245! with RSI: ffffffff814d408b pointing to sw_fence_dummy_notify(). I've been building with CONFIG_CC_OPTIMIZE_FOR_SIZE=y, and that function needs to be 4-byte aligned.
v2: (Jani Nikula) - Change BUG_ON to WARN_ON v3: (Jani / Tvrtko) - Short circuit __i915_sw_fence_init on WARN_ON v4: (Lucas) - Break WARN_ON changes out in a different patch
Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation") Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210922015039.26411-1-matthew.brost@intel.com
show more ...
|
#
d576b31b |
| 24-Sep-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: remember to call i915_sw_fence_fini
Seems to fix some object-debug splat which appeared while debugging something unrelated.
v2: s/guc_blocked/guc_state.blocked/
Signed-off-by: Matthew A
drm/i915: remember to call i915_sw_fence_fini
Seems to fix some object-debug splat which appeared while debugging something unrelated.
v2: s/guc_blocked/guc_state.blocked/
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210924144646.4096402-1-matthew.auld@intel.com
show more ...
|
Revision tags: v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64 |
|
#
af5bc9f2 |
| 09-Sep-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Drop guc_active move everything into guc_state
Now that we have locking hierarchy of sched_engine->lock -> ce->guc_state everything from guc_active can be moved into guc_state and prot
drm/i915/guc: Drop guc_active move everything into guc_state
Now that we have locking hierarchy of sched_engine->lock -> ce->guc_state everything from guc_active can be moved into guc_state and protected the guc_state.lock.
Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-23-matthew.brost@intel.com
show more ...
|
#
3cb3e343 |
| 09-Sep-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Move fields protected by guc->contexts_lock into sub structure
To make ownership of locking clear move fields (guc_id, guc_id_ref, guc_id_link) to sub structure guc_id in intel_context
drm/i915/guc: Move fields protected by guc->contexts_lock into sub structure
To make ownership of locking clear move fields (guc_id, guc_id_ref, guc_id_link) to sub structure guc_id in intel_context.
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-22-matthew.brost@intel.com
show more ...
|
#
52d66c06 |
| 09-Sep-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Move guc_blocked fence to struct guc_state
Move guc_blocked fence to struct guc_state as the lock which protects the fence lives there.
s/ce->guc_blocked/ce->guc_state.blocked/g
v2:
drm/i915/guc: Move guc_blocked fence to struct guc_state
Move guc_blocked fence to struct guc_state as the lock which protects the fence lives there.
s/ce->guc_blocked/ce->guc_state.blocked/g
v2: (Daniele) - s/blocked_fence/blocked/g
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-17-matthew.brost@intel.com
show more ...
|
#
e02083f0 |
| 11-Oct-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: remember to call i915_sw_fence_fini
Seems to fix some object-debug splat which appeared while debugging something unrelated.
v2: s/guc_blocked/guc_state.blocked/
Signed-off-by: Matthew A
drm/i915: remember to call i915_sw_fence_fini
Seems to fix some object-debug splat which appeared while debugging something unrelated.
v2: s/guc_blocked/guc_state.blocked/
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation") Link: https://patchwork.freedesktop.org/patch/msgid/20210924144646.4096402-1-matthew.auld@intel.com (cherry picked from commit d576b31bdece7b5034047cbe21170e948198d32f) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
show more ...
|
#
cdc1e6e2 |
| 02-Oct-2021 |
Hugh Dickins <hughd@google.com> |
drm/i915: fix blank screen booting crashes
5.15-rc1 crashes with blank screen when booting up on two ThinkPads using i915. Bisections converge convincingly, but arrive at different and suprising "c
drm/i915: fix blank screen booting crashes
5.15-rc1 crashes with blank screen when booting up on two ThinkPads using i915. Bisections converge convincingly, but arrive at different and suprising "culprits", none of them the actual culprit.
netconsole (with init_netconsole() hacked to call i915_init() when logging has started, instead of by module_init()) tells the story:
kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245! with RSI: ffffffff814d408b pointing to sw_fence_dummy_notify(). I've been building with CONFIG_CC_OPTIMIZE_FOR_SIZE=y, and that function needs to be 4-byte aligned.
Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation") Signed-off-by: Hugh Dickins <hughd@google.com> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60 |
|
#
2dcec7d3 |
| 27-Jul-2021 |
Daniel Vetter <daniel.vetter@ffwll.ch> |
drm/i915: move intel_context slab to direct module init/exit
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over.
I'm doing this split up into
drm/i915: move intel_context slab to direct module init/exit
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over.
I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_ce to just a slab_ce.
v2: Make slab static (Jason, 0day)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-4-daniel.vetter@ffwll.ch
show more ...
|