History log of /openbmc/linux/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c (Results 126 – 150 of 373)
Revision tags: v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57
# da1ea128 06-Aug-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Free the fence after a fence-chain lookup failure

If dma_fence_chain_find_seqno() reports an error, it does so in its
preamble before it disposes of the input fence. On handling the
error, we need to drop the reference to the fence.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2292
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 13149e8bafc4 ("drm/i915: add syncobj timeline support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200806161056.17593-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
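
A minimal sketch of the error path described above, assuming a
hypothetical wrapper get_timeline_fence(); only the dma_fence_put() on
the failure branch is the point of the fix:

    #include <linux/err.h>
    #include <linux/dma-fence-chain.h>
    #include <drm/drm_syncobj.h>

    static struct dma_fence *get_timeline_fence(struct drm_syncobj *syncobj,
                                                u64 point)
    {
            struct dma_fence *fence;
            int err;

            fence = drm_syncobj_fence_get(syncobj);
            if (!fence)
                    return ERR_PTR(-EINVAL);

            err = dma_fence_chain_find_seqno(&fence, point);
            if (err) {
                    /* The lookup failed in its preamble, before taking
                     * ownership of our fence, so we must drop the
                     * reference ourselves. */
                    dma_fence_put(fence);
                    return ERR_PTR(err);
            }

            return fence;
    }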


Revision tags: v5.4.56, v5.8, v5.7.12, v5.4.55
# 5d934137 31-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Export a preallocate variant of i915_active_acquire()

Sometimes we have to be very careful not to allocate underneath a mutex
(or spinlock) and yet still want to track activity. Enter
i915_active_acquire_for_context(). This raises the activity counter on
i915_active prior to use and ensures that the fence-tree contains a slot
for the context.

v2: Refactor active_lookup() so it can be called again before/after
locking to resolve contention. Since we protect the rbtree until we
idle, we can do a lockfree lookup, with the caveat that if another
thread performs a concurrent insertion, the rotations from the insert
may cause us to not find our target. A second pass holding the treelock
will find the target if it exists, or the place to perform our
insertion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-3-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
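
A hedged sketch of the v2 two-pass lookup; the node layout and names
are illustrative, not the actual i915_active internals:

    #include <linux/rbtree.h>
    #include <linux/spinlock.h>

    struct active_node {
            struct rb_node node;
            u64 idx;
    };

    static struct active_node *__lookup(struct rb_root *tree, u64 idx)
    {
            struct rb_node *rb = tree->rb_node;

            while (rb) {
                    struct active_node *it =
                            rb_entry(rb, struct active_node, node);

                    if (it->idx == idx)
                            return it;
                    rb = idx < it->idx ? rb->rb_left : rb->rb_right;
            }
            return NULL;
    }

    static struct active_node *active_lookup(struct rb_root *tree,
                                             spinlock_t *lock, u64 idx)
    {
            struct active_node *node;

            /* First pass, lockless: a concurrent insertion may rotate
             * the tree and hide the target, so a hit is conclusive but
             * a miss is not. */
            node = __lookup(tree, idx);
            if (node)
                    return node;

            /* Second pass under the treelock is authoritative; a miss
             * here also identifies where the insertion would go. */
            spin_lock(lock);
            node = __lookup(tree, idx);
            spin_unlock(lock);
            return node;
    }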


Revision tags: v5.7.11, v5.4.54
# 27a5dcfe 28-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Remove disordered per-file request list for throttling

I915_GEM_THROTTLE dates back to the time before contexts, when there was
just a single engine, and therefore a single timeline and request list
globally. That request list was in execution/retirement order, and so
walking it to find a particular aged request made sense and could be
split per file.

That is no more. We now have many timelines within a file, as many as the
user wants to construct (essentially per-engine, per-context). Each of
those run independently and so make the single list futile. Remove the
disordered list, and iterate over all the timelines to find a request to
wait on in each to satisfy the criteria that the CPU is no more than 20ms
ahead of its oldest request.

It should go without saying that the I915_GEM_THROTTLE ioctl is no
longer used as the primary means of throttling, so it makes sense to push
the complication into the ioctl where it only impacts upon its few
irregular users, rather than the execbuf/retire where everybody has to
pay the cost. Fortunately, the few users do not create vast amounts of
contexts, so the loops over contexts/engines should be concise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728152010.30701-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
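
A simplified sketch of the 20ms criterion; field and flag names follow
the i915 request API, but the surrounding walk over the file's
contexts, engines and timelines is elided:

    /* Find the most recent request emitted more than 20ms ago; waiting
     * on it bounds how far ahead of the GPU this client's CPU runs. */
    const unsigned long recent = jiffies - msecs_to_jiffies(20);
    struct i915_request *target = NULL, *rq;

    list_for_each_entry(rq, &timeline->requests, link) {
            if (time_after(rq->emitted_jiffies, recent))
                    break;          /* younger than 20ms; stop here */
            target = rq;
    }
    if (target)
            i915_request_wait(target, I915_WAIT_INTERRUPTIBLE,
                              MAX_SCHEDULE_TIMEOUT);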


# 13149e8b 04-Aug-2020 Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915: add syncobj timeline support

Introduces new parameters to execbuf so that we can specify syncobj
handles as well as timeline points.

v2: Reuse i915_user_extension_fn

v3: Check that the chained extension is only present once (Chris)

v4: Check that dma_fence_chain_find_seqno returns a non-NULL fence (Lionel)

v5: Use BIT_ULL (Chris)

v6: Fix issue with already signaled timeline points,
dma_fence_chain_find_seqno() setting fence to NULL (Chris)

v7: Report ENOENT with invalid syncobj handle (Lionel)

v8: Check for out of order timeline point insertion (Chris)

v9: After explanations on
https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html
drop the ordering check from v8 (Lionel)

v10: Set first extension enum item to 1 (Jason)

v11: Rebase

v12: Allow multiple extension nodes of timeline syncobj (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Co-authored-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v11)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804085954.350343-3-lionel.g.landwerlin@intel.com
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
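
For reference, the shape of the extension this adds, roughly as it
appears in the i915_drm.h uapi (comments summarize the semantics):

    struct drm_i915_gem_execbuffer_ext_timeline_fences {
            struct i915_user_extension base; /* extension chain header */
            __u64 fence_count;  /* entries in both arrays below */
            __u64 handles_ptr;  /* struct drm_i915_gem_exec_fence array */
            __u64 values_ptr;   /* u64 timeline points; 0 means binary */
    };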


# cda9edd0 04-Aug-2020 Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915: introduce a mechanism to extend execbuf2

We're planning to use this for a couple of new features where we need
to provide additional parameters to execbuf.

v2: Check for invalid flags in execbuffer2 (Lionel)

v3: Rename I915_EXEC_EXT -> I915_EXEC_USE_EXTENSIONS (Chris)

v4: Rebase
Move array fence parsing in i915_gem_do_execbuffer()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804085954.350343-2-lionel.g.landwerlin@intel.com
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
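
The generic header that makes the chaining work, roughly as defined in
i915_drm.h; with I915_EXEC_USE_EXTENSIONS set, execbuffer2's
cliprects_ptr is reinterpreted as a pointer to the first extension:

    struct i915_user_extension {
            __u64 next_extension; /* user pointer; 0 ends the chain */
            __u32 name;           /* which extension this node carries */
            __u32 flags;          /* must be zero */
            __u32 rsvd[4];        /* must be zero */
    };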


Revision tags: v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51
# 792592e7 07-Jul-2020 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: Move the engine mask to intel_gt_info

Since the engines belong to the GT, move the runtime-updated list of
available engines to the intel_gt struct. The original mask has been
renamed to indicate it contains the maximum engine list that can be
found on a matching device.

In preparation for other info being moved to the gt in follow up patches
(sseu), introduce an intel_gt_info structure to group all gt-related
runtime info.

v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com
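
A sketch of the grouping this introduces (fields illustrative; sseu
and other runtime info follow in the later patches mentioned above):

    struct intel_gt_info {
            /* Engines actually present on this GT, updated at runtime;
             * the per-platform maximum lives on in device info as
             * platform_engine_mask. */
            intel_engine_mask_t engine_mask;
    };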


# f7ce8639 02-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Split the context's obj:vma lut into its own mutex

Rather than reuse the common ctx->mutex for locking the execbuffer LUT,
split it into its own lock to avoid being taken [as part of ctx->mutex]
at inappropriate times. In particular to avoid the inversion from taking
the timeline->mutex for the whole execbuf submission in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200703004306.11117-1-chris@chris-wilson.co.uk
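
The effect of the split, sketched with illustrative member names:

    struct i915_gem_context {
            struct mutex mutex;     /* broad context state */
            struct mutex lut_mutex; /* guards only the obj:vma lut, so
                                     * lut lookups no longer nest inside
                                     * ctx->mutex (or timeline->mutex) */
    };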


# 096a42dd 01-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Move obj->lut_list under its own lock

The obj->lut_list is traversed when the object is closed as the file
table is destroyed during process termination. As this occurs before we
kill any outstanding context, if the closure is blocked due to some bug
or another, we fail to shoot down any in-flight operations, potentially
leaving the GPU spinning forever. As we only need to guard
the list against concurrent closures and insertions, the hold is short
and merits being treated as a simple spinlock.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701084439.17025-1-chris@chris-wilson.co.uk
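
A sketch of the short hold this describes (lock and entry names
assumed): closure walks the list under a plain spinlock, so nothing
can park on it indefinitely:

    struct i915_lut_handle *lut, *ln;

    spin_lock(&obj->lut_lock);
    list_for_each_entry_safe(lut, ln, &obj->lut_list, obj_link) {
            list_del(&lut->obj_link);
            /* ... tear down the handle; no allocations or waits ... */
    }
    spin_unlock(&obj->lut_lock);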


Revision tags: v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1
# d7466a5a 04-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Mark the buffer pool as active for the cmdparser

If the execbuf is interrupted after building the cmdparser pipeline, and
before we commit to submitting the request to HW, we would attempt to
clean up the cmdparser early. While we held active references to the vma
being parsed and constructed, we did not hold an active reference for
the buffer pool itself. The result was that an interrupted execbuf could
still have run the cmdparser pipeline, but since the buffer pool was
idle, its target vma could have been recycled.

Note this problem only occurs if the cmdparser is running async due to
pipelined waits on busy fences, and the execbuf is interrupted.

Fixes: 686c7c35abc2 ("drm/i915/gem: Asynchronous cmdparser")
Fixes: 16e87459673a ("drm/i915/gt: Move the batch buffer pool from the engine to the gt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk
(cherry picked from commit 57a78ca4eceab1ecb0299fba8a10211289329889)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 7ac2d253 05-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Delete unused code

Unused as of commit 9e0f9464e2ab ("drm/i915/gem: Async GPU relocations
only"), but left behind.

>> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:933:21: error: unused function 'unmask_page' [-Werror,-Wunused-function]
static inline void *unmask_page(unsigned long p)
^
>> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:938:28: error: unused function 'unmask_flags' [-Werror,-Wunused-function]
static inline unsigned int unmask_flags(unsigned long p)
^
>> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:945:33: error: unused function 'cache_to_ggtt' [-Werror,-Wunused-function]
static inline struct i915_ggtt *cache_to_ggtt(struct reloc_cache *cache)

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200605200357.13069-1-chris@chris-wilson.co.uk


# 9e0f9464 04-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Async GPU relocations only

Reduce the 3 relocation paths down to the single path that accommodates
all. The primary motivation for this is to guard the relocations with a
natural fence (derived from the i915_request used to write the
relocation from the GPU).

The tradeoff in using async gpu relocations is that it increases latency
over using direct CPU relocations, for the cases where the target is
idle and accessible by the CPU. The benefit is greatly reduced lock
contention and improved concurrency by pipelining.

Note that forcing the async gpu relocations does reveal a few issues
they have. Firstly, they are visible as writes to gem_busy, causing
some buffers to be marked as written by the GPU even though userspace
only reads. Secondly, in combination with the cmdparser, they can
cause priority inversions: the work is put into a common workqueue,
losing our priority information, and so is executed in FIFO order from
the worker, denying us the opportunity to reorder the requests
afterwards.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604211457.19696-1-chris@chris-wilson.co.uk
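
The heart of a GPU relocation, sketched in its gen8+ form (simplified
from the driver's reloc_gpu machinery; cs, addr and value are
stand-ins):

    /* Patch the relocation from the GPU itself; the write is naturally
     * fenced by the i915_request that carries these commands. */
    *cs++ = MI_STORE_DWORD_IMM_GEN4;
    *cs++ = lower_32_bits(addr);  /* location of the relocation entry */
    *cs++ = upper_32_bits(addr);
    *cs++ = value;                /* the target's presumed address */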


# 57a78ca4 04-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Mark the buffer pool as active for the cmdparser

If the execbuf is interrupted after building the cmdparser pipeline, and
before we commit to submitting the request to HW, we would attempt to
clean up the cmdparser early. While we held active references to the vma
being parsed and constructed, we did not hold an active reference for
the buffer pool itself. The result was that an interrupted execbuf could
still have run the cmdparser pipeline, but since the buffer pool was
idle, its target vma could have been recycled.

Note this problem only occurs if the cmdparser is running async due to
pipelined waits on busy fences, and the execbuf is interrupted.

Fixes: 686c7c35abc2 ("drm/i915/gem: Asynchronous cmdparser")
Fixes: 16e87459673a ("drm/i915/gt: Move the batch buffer pool from the engine to the gt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk


Revision tags: v5.4.44
# 5a833995 02-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop i915_request.i915 backpointer

We infrequently use the direct i915 backpointer from the i915_request,
so do we really need to waste the space in the struct for it? 8 bytes
from the most frequently allocated struct vs 3 bytes and some pointer
chasing in using rq->engine->i915?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602220953.21178-1-chris@chris-wilson.co.uk


Revision tags: v5.7, v5.4.43
# ea97c4ca 25-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Suppress some random warnings

Leave the error propagation in place, but limit the warnings to only
show up in CI if the unlikely errors are hit.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200525141957.3061-2-chris@chris-wilson.co.uk


Revision tags: v5.4.42, v5.4.41, v5.4.40, v5.4.39
# 6db20e27 04-May-2020 Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>

drm/i915/gem: Prefer drm_WARN* over WARN*

struct drm_device specific drm_WARN* macros include device information
in the backtrace, so we know what device the warnings originate from.

Prefer drm_WARN* over WARN* at places where struct drm_device pointer
can be extracted.

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504181600.18503-6-pankaj.laxminarayan.bharadiya@intel.com
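
For example (condition illustrative):

    /* Before: no clue which of several devices warned. */
    WARN_ON(size > obj->base.size);
    /* After: the drm_device is identified in the backtrace. */
    drm_WARN_ON(&i915->drm, size > obj->base.size);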


# 18e4af04 13-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop no-semaphore boosting

Now that we have fast timeslicing on semaphores, we no longer need to
prioritise non-semaphore work as we will yield any work blocked on a
semaphore to the next in the queue. Previously with no timeslicing,
blocking on the semaphore caused extremely bad scheduling with multiple
clients utilising multiple rings. Now, there is no impact and we can
remove the complication.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200513173504.28322-1-chris@chris-wilson.co.uk


# 889333c7 13-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Remove redundant exec_fence

Since there can only be one of in_fence/exec_fence, just use the single
in_fence local.

v2: Consolidate lookup

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200513180937.28992-1-chris@chris-wilson.co.uk


# eec39e44 07-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove wait priority boosting

Upon waiting a request (when asked), we gave that request a small
priority boost, not enough for it to cause preemption, but enough for it
to be scheduled next before all equals. We also used that bit to give
new clients a small priority boost, similar to FQ_CODEL, such that we
favoured short interactive tasks ahead of long running streams.

However, this is causing lots of complications with timeslicing where we
both want to honour the boost and yet ignore it. Those complications
cause unexpected user behaviour (tasks not being timesliced and run
concurrently as expected), and the easiest way to resolve that is to
remove the boost. Hopefully, we can find a compromise again if we need
to, but in theory timeslicing itself and future more advanced schedulers
should give us the interactivity boost we seek.

Testcase: igt/gem_exec_schedule/lateslice
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507152338.7452-3-chris@chris-wilson.co.uk


# e3d29130 04-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Implement legacy MI_STORE_DATA_IMM

The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but
left them writing to a physical address. The notes suggest that the
primary reason would be so that the writes were cache coherent, as the
CPU cache uses physical tagging. As such we did not implement the
legacy variant of MI_STORE_DATA_IMM and so left all the relocations
synchronous -- but with a small function to convert from the vma address
into the physical address, we can implement asynchronous relocs on these
older arches, fixing up a few tests that require them.

In order to be able to test the legacy paths, refactor the gpu
relocations so that we can hook them up to a selftest.

v2: Use an array of offsets not enum labels for the selftest
v3: Refactor the common igt_hexdump()

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk
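
A sketch of the legacy form; vma_phys_addr() stands for the small
vma-to-physical helper the commit describes:

    /* Pre-gen4 (and i915g-class) hardware takes a physical address,
     * which is what keeps the write coherent with the physically
     * tagged CPU cache. */
    *cs++ = MI_STORE_DWORD_IMM;          /* no MI_MEM_VIRTUAL bit */
    *cs++ = vma_phys_addr(vma, offset);
    *cs++ = value;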


# f5b62bdb 04-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Specify address type for chained reloc batches

It is required that a chained batch be in the same address domain as its
parent, and on earlier gens that domain must also be specified in the
command, as it is not inferred from the chaining until gen6.

Fixes: 964a9b0f611e ("drm/i915/gem: Use chained reloc batches")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504125149.4396-1-chris@chris-wilson.co.uk
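
The fix, sketched in a simplified pre-gen8 form (gen and batch_addr
are stand-ins):

    /* The chained jump must stay in the parent's address domain, and
     * before gen6 that domain has to be spelled out in the command. */
    *cs++ = MI_BATCH_BUFFER_START | (gen >= 6 ? 0 : MI_BATCH_GTT);
    *cs++ = batch_addr;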


Revision tags: v5.4.38, v5.4.37
# 6f576d62 01-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Try an alternate engine for relocations

If at first we don't succeed, try try again.

Not all engines may support the MI ops we need to perform asynchronous
relocation patching, and so we end up falling back to a synchronous
operation that has a liability of blocking. However, Tvrtko pointed out
we don't need to use the same engine to perform the relocations as we
are planning to execute the execbuf on, and so if we switch over to a
working engine, we can perform the relocation asynchronously. The user
execbuf will be queued after the relocations by virtue of fencing.

This patch creates a new context per execbuf requiring asynchronous
relocations on an unusable engine. This is perhaps a bit excessive and
can be ameliorated by a small context cache, but for the moment we only
need it for working around a little used engine on Sandybridge, and only
if relocations are actually required to an active batch buffer.

Now we just need to teach the relocation code to handle physical
addressing for gen2/3, and we should then have universal support!

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Testcase: igt/gem_exec_reloc/basic-spin # snb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-3-chris@chris-wilson.co.uk


# 0e97fbb0 01-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Use a single chained reloc batch for a single execbuf

As we can now keep chaining together a relocation batch to process any
number of relocations, we can keep building that relocation batch for
all of the target vma. This avoids emitting a new request into the
ring for each target, which would consume precious ring space and risk
a stall.

v2: Propagate the failure from submitting the relocation batch.

Testcase: igt/gem_exec_reloc/basic-wide-active
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-2-chris@chris-wilson.co.uk


# 964a9b0f 01-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Use chained reloc batches

The ring is a precious resource: we anticipate to only use a few hundred
bytes for a request, and only try to reserve that before we start. If we
go beyond our guess in building the request, then instead of waiting at
the start of execbuf before we hold any locks or other resources, we
may trigger a wait inside a critical region. One example is in using gpu
relocations, where currently we emit a new MI_BB_START from the ring
every time we overflow a page of relocation entries. However, instead of
inserting the command into the precious ring, we can chain the next page of
relocation entries as MI_BB_START from the end of the previous.

v2: Delay the emit_bb_start until after all the chained vma
synchronisation is complete. Since the buffer pool batches are idle, this
_should_ be a no-op, but one day we may have some fancy async GPU bindings
for new vma!

v3: Use pool/batch consistently; once we start thinking in terms of the
batch vma, use batch->obj.
v4: Explain the magic number 4.

Tvrtko spotted that we lose propagation of the error for failing to
submit the relocation request; that's easier to fix up in the next
patch.

Testcase: igt/gem_exec_reloc/basic-many-active
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-1-chris@chris-wilson.co.uk
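
The chaining, sketched (cache layout and the next_batch_addr/next_page
names are assumed; the reserved tail is the "magic number 4" that v4
explains):

    /* Keep four dwords spare at the end of each page of relocation
     * entries so there is always room to chain onwards. */
    if (cache->rq_size + len > PAGE_SIZE / sizeof(u32) - 4) {
            u32 *cs = cache->rq_cmd + cache->rq_size;

            /* Jump into a fresh page instead of emitting a whole new
             * request from the precious ring. */
            *cs++ = MI_BATCH_BUFFER_START; /* + address bits per gen */
            *cs++ = lower_32_bits(next_batch_addr);
            cache->rq_cmd = next_page;
            cache->rq_size = 0;
    }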


Revision tags: v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31
# b44f6873 03-Apr-2020 Christophe Leroy <christophe.leroy@c-s.fr>

drm/i915/gem: Replace user_access_begin by user_write_access_begin

When i915_gem_execbuffer2_ioctl() is using user_access_begin(),
that's only to perform unsafe_put_user() so use
user_write_access_begin() in order to only open write access.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ebf1250b6d4f351469fb339e5399d8b92aa8a1c1.1585898438.git.christophe.leroy@c-s.fr
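
The narrowed form, roughly as it reads in i915_gem_execbuffer2_ioctl()
after this change (error handling trimmed):

    if (!user_write_access_begin(user_exec_list,
                                 count * sizeof(*user_exec_list)))
            goto end;

    for (i = 0; i < args->buffer_count; i++) {
            /* unsafe_put_user() is only legal inside the begin/end
             * pair; the label is its fault target. */
            unsafe_put_user(exec2_list[i].offset,
                            &user_exec_list[i].offset,
                            end_user);
    }
    end_user:
    user_write_access_end();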


# 16e87459 30-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Move the batch buffer pool from the engine to the gt

Since the introduction of 'soft-rc6', we aim to park the device quickly
and that results in frequent idling of the whole device. Currently upon
idling we free the batch buffer pool, and so this renders the cache
ineffective for many workloads. If we want to have an effective cache of
recently allocated buffers available for reuse, we need to decouple that
cache from the engine powermanagement and make it timer based. As there
is no reason then to keep it within the engine (where it once made
retirement order easier to track), we can move it up the hierarchy to the
owner of the memory allocations.

v2: Hook up to debugfs/drop_caches to clear the cache on demand.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk
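
A sketch of the timer-based scheme (struct and helper names assumed):
buffers age out from a delayed worker instead of dying with engine
parking:

    static void pool_free_work(struct work_struct *wrk)
    {
            struct gt_buffer_pool *pool =
                    container_of(wrk, typeof(*pool), work.work);

            /* Free everything not reused within the last second... */
            if (pool_free_older_than(pool, HZ))
                    /* ...re-arming for as long as the cache is live. */
                    schedule_delayed_work(&pool->work,
                                          round_jiffies_up_relative(HZ));
    }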

