==========================================
I915 VM_BIND feature design and use cases
==========================================

VM_BIND feature
================
The DRM_I915_GEM_VM_BIND/UNBIND ioctls allow the user mode driver (UMD) to
bind/unbind GEM buffer objects (BOs), or sections of a BO, at specified GPU
virtual addresses in a specified address space (VM). These mappings (also
referred to as persistent mappings) persist across multiple GPU submissions
(execbuf calls) issued by the UMD, without the user having to provide a list
of all required mappings during each submission (as required by the older
execbuf mode).

The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
signaling the completion of the bind/unbind operation.

The VM_BIND feature is advertised to the user via I915_PARAM_VM_BIND_VERSION.
The user has to opt in to the VM_BIND mode of binding for an address space (VM)
at VM creation time via the I915_VM_CREATE_FLAGS_USE_VM_BIND extension.

VM_BIND/UNBIND ioctl calls executed concurrently on different CPU threads are
not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be done
asynchronously when a valid out fence is specified.

VM_BIND features include:

* Multiple Virtual Address (VA) mappings can map to the same physical pages
  of an object (aliasing).
* A VA mapping can map to a partial section of the BO (partial binding).
* Support for capture of persistent mappings in the dump upon GPU error.
* Support for userptr gem objects (no special uapi is required for this).
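As an illustration of aliasing and partial binding, the following is a minimal
user space sketch of a per-VM mapping table. It is a model only and not the
i915 uapi: struct vm_mapping, struct vm_model, vm_model_bind() and
vm_model_lookup() are hypothetical names, and rejecting overlapping VA ranges
is an assumption made to keep the model simple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one persistent mapping; not the i915 uapi. */
struct vm_mapping {
	uint64_t va;        /* GPU virtual address chosen by the UMD */
	uint64_t length;    /* size of the mapped range in bytes */
	uint32_t bo_handle; /* GEM buffer object backing the range */
	uint64_t bo_offset; /* offset into the BO (non-zero for a partial bind) */
};

#define VM_MAX_MAPPINGS 16

struct vm_model {
	struct vm_mapping maps[VM_MAX_MAPPINGS];
	size_t count;
};

/*
 * Record a mapping. Empty ranges, a full table and VA ranges overlapping an
 * existing mapping are rejected (the overlap rule is an assumption).
 */
static bool vm_model_bind(struct vm_model *vm, struct vm_mapping m)
{
	assert(vm != NULL);
	if (vm->count == VM_MAX_MAPPINGS || m.length == 0)
		return false;
	for (size_t i = 0; i < vm->count; i++) {
		const struct vm_mapping *o = &vm->maps[i];

		if (m.va < o->va + o->length && o->va < m.va + m.length)
			return false;
	}
	vm->maps[vm->count++] = m;
	return true;
}

/* Translate a GPU VA to (BO handle, BO offset); false if the VA is unmapped. */
static bool vm_model_lookup(const struct vm_model *vm, uint64_t va,
			    uint32_t *bo_handle, uint64_t *bo_offset)
{
	assert(vm != NULL);
	for (size_t i = 0; i < vm->count; i++) {
		const struct vm_mapping *m = &vm->maps[i];

		if (va >= m->va && va - m->va < m->length) {
			*bo_handle = m->bo_handle;
			*bo_offset = m->bo_offset + (va - m->va);
			return true;
		}
	}
	return false;
}
```

Binding VA 0x10000 to offset 0 of a BO and VA 0x40000 to offset 0x1000 of the
same BO makes both 0x11000 and 0x40000 resolve to the same BO page (aliasing),
and the second mapping starts part-way into the BO (partial binding). The
table is filled once and then consulted by every later submission, which is
the sense in which the mappings are persistent.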

TLB flush consideration
------------------------
The i915 driver flushes the TLB for each submission and when an object's
pages are released. The VM_BIND/UNBIND operation will not do any additional
TLB flush. Any VM_BIND mapping added will be in the working set for subsequent
submissions on that VM and will not be in the working set for currently running
batches (that would require additional TLB flushes, which is not supported).

Execbuf ioctl in VM_BIND mode
-------------------------------
A VM in VM_BIND mode will not support the older execbuf mode of binding.
The execbuf ioctl handling in VM_BIND mode differs significantly from the
older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
Hence, a new execbuf3 ioctl has been added to support VM_BIND mode (See
struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
execlist and hence does not support implicit sync. It is expected that the
work below will be able to support the requirements of object dependency
setting in all use cases:

"dma-buf: Add an API for exporting sync files"
(https://lwn.net/Articles/859290/)

The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
works with the execbuf3 ioctl for submission. All BOs mapped on that VM
(through VM_BIND calls) at the time of the execbuf3 call are deemed required
for that submission.

The execbuf3 ioctl specifies the batch addresses directly, instead of as
object handles as in the execbuf2 ioctl. The execbuf3 ioctl will also not
support many of the older features like in/out/submit fences, fence array,
default gem context and many more (See struct drm_i915_gem_execbuffer3).

In VM_BIND mode, VA allocation is completely managed by the user instead of
the i915 driver. Hence, VA assignment and eviction are not applicable in
VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
use the i915_vma active reference tracking. It will instead use the dma-resv
object (See `VM_BIND dma_resv usage`_).

So, a lot of the existing code supporting the execbuf2 ioctl, like relocations,
VA evictions, vma lookup table, implicit sync, vma active reference tracking
etc., is not applicable to the execbuf3 ioctl. Hence, all execbuf3 specific
handling should be in a separate file, and only functionality common to these
ioctls should be shared code where possible.

VM_PRIVATE objects
-------------------
By default, BOs can be mapped on multiple VMs and can also be dma-buf
exported. Hence these BOs are referred to as Shared BOs.
During each execbuf submission, the request fence must be added to the
dma-resv fence list of all shared BOs mapped on the VM.

The VM_BIND feature introduces an optimization where the user can create a BO
which is private to a specified VM via the I915_GEM_CREATE_EXT_VM_PRIVATE flag
during BO creation. Unlike Shared BOs, these VM private BOs can only be mapped
on the VM they are private to and can't be dma-buf exported.
All private BOs of a VM share the dma-resv object. Hence, during each execbuf
submission, they need only one dma-resv fence list updated. Thus, the fast
path (where required mappings are already bound) submission latency is O(1)
w.r.t. the number of VM private BOs.
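The cost difference can be made concrete with a small model. The sketch below
only counts how many fence lists one submission has to touch; struct
resv_model, struct bo_model, struct pvm_model and the two functions are
hypothetical stand-ins, not kernel structures or i915 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dma-resv object: it only counts fences. */
struct resv_model {
	unsigned int fence_count;
};

/* Hypothetical BO: points at its own resv (shared) or the VM's (private). */
struct bo_model {
	struct resv_model *resv;
};

struct pvm_model {
	struct resv_model private_resv; /* shared by all VM private BOs */
	struct bo_model **shared_bos;   /* shared BOs currently mapped on the VM */
	size_t shared_count;
};

/* A VM private BO does not get a resv of its own; it borrows the VM's. */
static struct bo_model pvm_create_private_bo(struct pvm_model *vm)
{
	struct bo_model bo = { .resv = &vm->private_resv };

	return bo;
}

/* Add the request fence for one submission; returns the fence lists touched. */
static size_t pvm_submit_add_fences(struct pvm_model *vm)
{
	size_t touched = 0;

	assert(vm != NULL);

	/* One update covers every VM private BO, however many exist. */
	vm->private_resv.fence_count++;
	touched++;

	/* Each shared BO has its own resv and needs its own update. */
	for (size_t i = 0; i < vm->shared_count; i++) {
		vm->shared_bos[i]->resv->fence_count++;
		touched++;
	}
	return touched;
}
```

With 100 private BOs and 2 shared BOs on a VM, pvm_submit_add_fences() touches
3 fence lists in this model: the count grows with the number of shared BOs but
not with the number of private ones.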

VM_BIND locking hierarchy
-------------------------
The locking design here supports the older (execlist based) execbuf mode, the
newer VM_BIND mode, the VM_BIND mode with GPU page faults and possible future
system allocator support (See `Shared Virtual Memory (SVM) support`_).
The older execbuf mode and the newer VM_BIND mode without page faults manage
residency of backing storage using dma_fence. The VM_BIND mode with page faults
and the system allocator support do not use any dma_fence at all.

VM_BIND locking order is as below.

1) Lock-A: A vm_bind mutex will protect vm_bind lists. This lock is taken in
   vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing the
   mapping.

   In the future, when GPU page faults are supported, we can potentially use a
   rwsem instead, so that multiple page fault handlers can take the read side
   lock to look up the mapping and hence can run in parallel.
   The older execbuf mode of binding does not need this lock.

2) Lock-B: The object's dma-resv lock will protect i915_vma state and needs to
   be held while binding/unbinding a vma in the async worker and while updating
   the dma-resv fence list of an object. Note that private BOs of a VM will all
   share a dma-resv object.

   The future system allocator support will use the HMM prescribed locking
   instead.

3) Lock-C: Spinlock(s) to protect some of the VM's lists like the list of
   invalidated vmas (due to eviction and userptr invalidation) etc.

When GPU page faults are supported, the execbuf path does not take any of these
locks. There we will simply smash the new batch buffer address into the ring and
then tell the scheduler to run that. The lock taking only happens from the page
fault handler, where we take lock-A in read mode, whichever lock-B we need to
find the backing storage (dma_resv lock for gem objects, and hmm/core mm for
system allocator) and some additional locks (lock-D) for taking care of page
table races. Page fault mode should never need to manipulate the vm lists,
so it won't ever need lock-C.

VM_BIND LRU handling
---------------------
We need to ensure VM_BIND mapped objects are properly LRU tagged to avoid
performance degradation. We will also need support for bulk LRU movement of
VM_BIND objects to avoid additional latencies in the execbuf path.

The page table pages are similar to VM_BIND mapped objects (See
`Evictable page table allocations`_), are maintained per VM and need to
be pinned in memory when the VM is made active (i.e., upon an execbuf call with
that VM). So, bulk LRU movement of page table pages is also needed.

VM_BIND dma_resv usage
-----------------------
Fences need to be added to all VM_BIND mapped objects. During each execbuf
submission, they are added with DMA_RESV_USAGE_BOOKKEEP usage to prevent
over sync (See enum dma_resv_usage). One can override it with either
DMA_RESV_USAGE_READ or DMA_RESV_USAGE_WRITE usage during explicit object
dependency setting.

Note that the DRM_I915_GEM_WAIT and DRM_I915_GEM_BUSY ioctls do not check for
DMA_RESV_USAGE_BOOKKEEP usage and hence should not be used for end of batch
check. Instead, the execbuf3 out fence should be used for end of batch check
(See struct drm_i915_gem_execbuffer3).
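The effect of the usage level can be sketched with a small user space model.
It assumes the query rule described for enum dma_resv_usage, where the usage
levels are ordered KERNEL < WRITE < READ < BOOKKEEP and a query at one level
also covers the levels below it. The model_* names are hypothetical; only the
ordering is taken from the kernel enum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the ordering of the kernel's enum dma_resv_usage. */
enum model_resv_usage {
	MODEL_USAGE_KERNEL,
	MODEL_USAGE_WRITE,
	MODEL_USAGE_READ,
	MODEL_USAGE_BOOKKEEP,
};

/* Hypothetical fence: just a usage level and a signaled flag. */
struct model_fence {
	enum model_resv_usage usage;
	bool signaled;
};

/* Hypothetical stand-in for the fence list of a dma-resv object. */
struct model_resv {
	const struct model_fence *fences;
	size_t count;
};

/*
 * True if every fence at or below the queried usage level has signaled.
 * Fences with a higher usage level than the query are ignored.
 */
static bool model_resv_test_signaled(const struct model_resv *resv,
				     enum model_resv_usage usage)
{
	assert(resv != NULL);
	for (size_t i = 0; i < resv->count; i++) {
		const struct model_fence *f = &resv->fences[i];

		if (f->usage <= usage && !f->signaled)
			return false;
	}
	return true;
}
```

For an object whose only fence is an unsignaled BOOKKEEP fence from a
submission, a query at READ level (standing in here for what a check that
ignores BOOKKEEP usage would see) reports the object as idle, while a query at
BOOKKEEP level reports it as still busy.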

Also, in VM_BIND mode, use dma-resv apis for determining object activeness
(See dma_resv_test_signaled() and dma_resv_wait_timeout()) and do not use the
older i915_vma active reference tracking, which is deprecated. This should be
easier to get working with the current TTM backend.

Mesa use case
--------------
VM_BIND can potentially reduce the CPU overhead in Mesa (both Vulkan and Iris),
hence improving performance of CPU-bound applications. It also allows us to
implement Vulkan's Sparse Resources. With increasing GPU hardware performance,
reducing CPU overhead becomes more impactful.


Other VM_BIND use cases
========================

Long running Compute contexts
------------------------------
Usage of dma-fence expects that they complete in a reasonable amount of time.
Compute, on the other hand, can be long running. Hence it is appropriate for
compute to use user/memory fences (See `User/Memory Fence`_), and dma-fence
usage must be limited to in-kernel consumption only.

Where GPU page faults are not available, the kernel driver, upon buffer
invalidation, will initiate a suspend (preemption) of the long running context,
finish the invalidation, revalidate the BO and then resume the compute context.
This is done by having a per-context preempt fence which is enabled when
someone tries to wait on it and triggers the context preemption.

User/Memory Fence
~~~~~~~~~~~~~~~~~~
A user/memory fence is an <address, value> pair. To signal the user fence, the
specified value is written at the specified virtual address and the waiting
process is woken up. A user fence can be signaled either by the GPU or by a
kernel async worker (like upon bind completion). The user can wait on a user
fence with a new user fence wait ioctl.
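The <address, value> semantics can be sketched in user space with C11 atomics.
This is a model under stated assumptions, not the proposed uapi: struct
user_fence and the user_fence_*() helpers are hypothetical, the fence is
treated as signaled when the memory location equals the value, and the wait is
a bounded poll where a real wait would sleep in the kernel through the wait
ioctl.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user fence: signaled once *addr holds value. */
struct user_fence {
	_Atomic uint64_t *addr;
	uint64_t value;
};

/* Signal side: what the GPU or a kernel async worker would do. */
static void user_fence_signal(const struct user_fence *f)
{
	assert(f != NULL && f->addr != NULL);
	/* Release, so work done before signaling is visible to the waiter. */
	atomic_store_explicit(f->addr, f->value, memory_order_release);
}

/* Equality is the assumed signaled condition in this model. */
static bool user_fence_is_signaled(const struct user_fence *f)
{
	assert(f != NULL && f->addr != NULL);
	return atomic_load_explicit(f->addr, memory_order_acquire) == f->value;
}

/* Poll up to max_polls times; returns true if the fence signaled in time. */
static bool user_fence_wait(const struct user_fence *f, unsigned long max_polls)
{
	while (max_polls--) {
		if (user_fence_is_signaled(f))
			return true;
	}
	return false;
}
```

A waiter that polls before the value has been written times out, and the same
wait succeeds once user_fence_signal() has stored the value. No dma-fence is
involved at any point, which is what makes this kind of fence suitable for
long running contexts.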

Here is some prior work on this:
https://patchwork.freedesktop.org/patch/349417/

Low Latency Submission
~~~~~~~~~~~~~~~~~~~~~~~
Allows the compute UMD to directly submit GPU jobs instead of going through the
execbuf ioctl. This is made possible by VM_BIND not being synchronized against
execbuf. VM_BIND allows bind/unbind of mappings required for the directly
submitted jobs.

Debugger
---------
With the debug event interface, a user space process (debugger) is able to keep
track of and act upon resources created by another process (debugged) and
attached to the GPU via the vm_bind interface.

GPU page faults
----------------
GPU page faults, when supported (in the future), will only be supported in the
VM_BIND mode. While both the older execbuf mode and the newer VM_BIND mode of
binding will require using dma-fence to ensure residency, the GPU page faults
mode, when supported, will not use any dma-fence, as residency is purely
managed by installing and removing/invalidating page table entries.

Page level hints settings
--------------------------
VM_BIND allows hints to be set per mapping instead of per BO. Possible hints
include placement and atomicity. Sub-BO level placement hints will be even more
relevant with upcoming GPU on-demand page fault support.

Page level Cache/CLOS settings
-------------------------------
VM_BIND allows cache/CLOS settings per mapping instead of per BO.

Evictable page table allocations
---------------------------------
Make pagetable allocations evictable and manage them similarly to VM_BIND
mapped objects. Page table pages are similar to persistent mappings of a
VM (the differences are that the page table pages will not have an i915_vma
structure and, after swapping pages back in, the parent page link needs to be
updated).

Shared Virtual Memory (SVM) support
------------------------------------
The VM_BIND interface can be used to map system memory directly (without the
gem BO abstraction) using the HMM interface. SVM is only supported with GPU
page faults enabled.

VM_BIND UAPI
=============

.. kernel-doc:: Documentation/gpu/rfc/i915_vm_bind.h