==========================
Xe – Merge Acceptance Plan
==========================
Xe is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).

This document aims to establish a merge plan for the Xe driver, by writing down
clear pre-merge goals, in order to avoid unnecessary delays.

Xe – Overview
=============
The main motivation of Xe is to have a fresh base to work from that is
unencumbered by older platforms, whilst also taking the opportunity to
rearchitect our driver to increase sharing across the drm subsystem, both
leveraging and allowing us to contribute more towards other shared components
like TTM and drm/scheduler.

This is also an opportunity to start from the beginning with a clean uAPI that is
extensible by design and already aligned with modern userspace needs. For this
reason, the memory model is solely based on GPU Virtual Address space bind/unbind
(‘VM_BIND’) of GEM buffer objects (BOs), and execution only supports explicit
synchronization. With mappings persistent across executions, userspace does not
need to provide a list of all required mappings during each submission.

The new driver leverages a lot from i915. As for display, the intent is to share
the display code with the i915 driver so that there is maximum reuse there.

As for the power management area, the goal is to have much-simplified support
for the system suspend states (S-states), PCI device suspend states (D-states),
GPU/Render suspend states (R-states) and frequency management. It should leverage
as much as possible the existing PCI-subsystem infrastructure (pm and runtime_pm)
and underlying firmware components such as PCODE and GuC for the power states and
frequency decisions.

Repository:

https://gitlab.freedesktop.org/drm/xe/kernel (branch drm-xe-next)

Xe – Platforms
==============
Currently, Xe is already functional and has experimental support for multiple
platforms starting from Tiger Lake, with initial support in userspace implemented
in Mesa (for Iris and Anv, our OpenGL and Vulkan drivers), as well as in NEO
(for OpenCL and Level0).

During a transition period, platforms will be supported by both Xe and i915.
However, the force_probe mechanism present in both drivers will allow only one
official and by-default probe at a given time.

For instance, in order to probe a DG2 whose PCI ID is 0x5690 with Xe instead of
i915, the following set of parameters needs to be used:

```
i915.force_probe=!5690 xe.force_probe=5690
```

In both drivers, the ‘.require_force_probe’ protection forces the user to use the
force_probe parameter while the driver is under development. This protection is
only removed when support for the platform and its uAPI are stable, and that
stability needs to be demonstrated by CI results.

In order to avoid user space regressions, i915 will continue to support all the
current platforms that are already out of this protection. Xe support for these
platforms will remain forever experimental and dependent on the usage of
force_probe.

When the time comes for Xe, the protection will be lifted on Xe and kept in i915.

The Xe driver will be protected with both a STAGING Kconfig and force_probe.
Changes in the uAPI are expected while the driver is behind these protections.
STAGING will be removed when the driver uAPI gets to a mature state where we can
guarantee the ‘no regression’ rule. Then force_probe will be lifted, but only for
future platforms that will be productized with the Xe driver rather than with
i915.
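For reference, and purely as an illustrative sketch rather than the actual i915
or Xe implementation, a force_probe style knob is essentially just a string
module parameter that the probe path consults before binding to a device
(all names below are hypothetical):

```
/*
 * Illustrative sketch only, not the actual i915/Xe code: a force_probe
 * style knob is a string module parameter holding a comma-separated list
 * of PCI IDs, optionally prefixed with '!' to block probing of that ID.
 */
#include <linux/module.h>
#include <linux/moduleparam.h>

static char *force_probe_ids = "";
module_param_named(force_probe, force_probe_ids, charp, 0400);
MODULE_PARM_DESC(force_probe,
		 "Force probe the driver for specified devices. "
		 "Use \"<pci-id>\" to enable, \"!<pci-id>\" to block probing.");
```

The probe path then matches the device PCI ID against this list, honouring the
‘!’ prefix shown in the DG2 example above, and refuses to bind when
‘.require_force_probe’ is set and the ID is not listed.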
Xe – Pre-Merge Goals
====================

Drm_scheduler
-------------
Xe primarily uses firmware-based scheduling (GuC FW). However, it will use
drm_scheduler as the scheduler ‘frontend’ for userspace submission in order to
resolve syncobj and dma-buf implicit sync dependencies. drm_scheduler is not yet
prepared to handle the 1-to-1 relationship between drm_gpu_scheduler and
drm_sched_entity that this model requires.

Deeper changes to drm_scheduler should *not* be required to get Xe accepted, but
some consensus needs to be reached between Xe and other community drivers that
could also benefit from this work of coupling FW based/assisted submission, such
as Arm’s new Mali GPU driver, and others.

As a key measurable result, the patch series introducing Xe itself shall not
depend on any other patch touching drm_scheduler itself that has not yet been
merged through drm-misc. This, by itself, already implies reaching an agreement
on a uniform 1-to-1 relationship implementation and usage across drivers.

GPU VA
------
Two main goals of Xe come together here:

1) Have an uAPI that aligns with modern UMD needs.

2) Early upstream engagement.

Red Hat engineers working on Nouveau proposed a new DRM feature to handle keeping
track of GPU virtual address mappings. This is still not merged upstream, but it
aligns very well with our goals and with our VM_BIND. The engagement with
upstream and the port of Xe towards GPUVA is already ongoing.

As a key measurable result, Xe needs to be aligned with GPU VA and working in
our tree. Missing Nouveau patches should *not* block Xe, and any needed GPUVA
related patch should be independent and present on dri-devel or acked by
maintainers to go along with the first Xe pull request towards drm-next.

DRM_VM_BIND
-----------
Nouveau and Xe are both implementing ‘VM_BIND’ and new ‘Exec’ uAPIs in order to
fulfill the needs of the modern uAPI. The Xe merge should *not* be blocked on the
development of a common new DRM infrastructure. However, the Xe team needs to
engage with the community to explore the options of a common API.

As a key measurable result, DRM_VM_BIND needs to be documented in this file
below, or this entire block deleted if the consensus is for independent,
per-driver vm_bind ioctls.

Although having a common DRM level IOCTL for VM_BIND is not a requirement to get
Xe merged, it is mandatory to enforce the overall locking scheme for all major
structs and lists (so vm and vma). So, a consensus is needed, and possibly some
common helpers. If helpers are needed, they should also be documented in this
document.

ASYNC VM_BIND
-------------
Although having a common DRM level IOCTL for VM_BIND is not a requirement to get
Xe merged, it is mandatory to have a consensus with other drivers and Mesa.
It needs to be clear how to handle async VM_BIND and interactions with userspace
memory fences, ideally with helper support so people don't get it wrong in all
possible ways.

As a key measurable result, the benefits of ASYNC VM_BIND and a discussion of
various flavors, error handling and a sample API should be documented here or in
a separate document pointed to by this document.
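To make the discussion above more concrete, the following is a purely
hypothetical sketch; none of these names or fields are the actual Xe uAPI, which
is deliberately deferred to the ‘Xe – uAPI high level overview’ section below.
It only illustrates the kind of information an asynchronous VM_BIND request has
to carry: the map or unmap operation itself, plus the explicit sync objects that
order it against GPU work:

```
/*
 * Hypothetical sketch only, not the actual Xe uAPI: the shape of the data
 * an async VM_BIND ioctl would need to carry.
 */
#include <linux/types.h>

struct sketch_vm_bind {
	__u32 vm_id;		/* address space to operate on */
#define SKETCH_VM_BIND_OP_MAP	0
#define SKETCH_VM_BIND_OP_UNMAP	1
	__u32 op;		/* map or unmap */
	__u32 bo_handle;	/* GEM BO handle; 0 for NULL/userptr binds */
	__u32 flags;		/* e.g. an ASYNC flag selecting deferred bind */
	__u64 bo_offset;	/* offset into the BO (or userptr address) */
	__u64 range;		/* size of the mapping in bytes */
	__u64 gpu_addr;		/* GPU virtual address to (un)map at */
	__u64 syncs;		/* userspace pointer to an array of sync objects */
	__u32 num_syncs;	/* in-fences to wait for, out-fences to signal */
	__u32 pad;
};
```

Whether such a structure ends up as a per-driver ioctl or behind a common
DRM-level helper is exactly the consensus this section asks for.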
Userptr integration and vm_bind
-------------------------------
Different drivers implement different ways of dealing with execution of userptr.
With multiple drivers currently introducing support for VM_BIND, the goal is to
reach a DRM-wide consensus on the best way to support it. To some extent this is
already being addressed by GPUVA itself, where the userptr will likely be a GPUVA
with a NULL GEM, with VM_BIND called directly on the userptr. However, the rules
around that still need to be defined, including the usage of mmu_notifiers,
locking and other aspects.

The goal of this task is to document those basic rules.

The documentation *needs* to first live in this document (API section below) and
can then be moved to another, more specific document, either at the Xe level or
at the DRM level.

Documentation should include:

 * The userptr part of the VM_BIND api.

 * Locking, including the page-faulting case.

 * O(1) complexity under VM_BIND.

Some parts of userptr, like mmu_notifiers, should become GPUVA or DRM helpers
when the second driver supporting VM_BIND+userptr appears. Details are to be
defined when the time comes.

Long running compute: minimal data structure/scaffolding
--------------------------------------------------------
The generic scheduler code needs to include the handling of endless compute
contexts, with the minimal scaffolding for preempt-ctx fences (probably on the
drm_sched_entity), and to make sure drm_scheduler can cope with the lack of a job
completion fence.

The goal is to achieve a consensus ahead of the Xe initial pull request, ideally
with this minimal drm/scheduler work, if needed, merged to drm-misc in a way that
any drm driver, including Xe, could reuse and add its own individual needs on top
in a later stage. However, this should not block the initial merge.

This is a non-blocking item, since shipping the driver without support for
long-running compute is not a showstopper.

Display integration with i915
-----------------------------
In order to share the display code with the i915 driver so that there is maximum
reuse, the i915/display/ code is built twice, once for i915.ko and once for
xe.ko. Currently, the i915/display code in the Xe tree is polluted with many
'ifdefs' depending on the build target. The goal is to refactor both the Xe and
i915/display code simultaneously in order to get a clean result before they land
upstream, so that display can already be part of the initial pull request towards
drm-next.

However, display code should not gate the acceptance of Xe upstream. Xe patches
will be refactored in a way that display code can be removed, if needed, from the
first pull request of Xe towards drm-next. The expectation is that when both
drivers are part of drm-tip, the introduction of cleaner patches will be easier
and faster.

Drm_exec
--------
The helper that handles dma_resv locking for a large number of buffers is being
removed in the drm_exec series proposed in
https://patchwork.freedesktop.org/patch/524376/
If that happens, Xe needs to change and incorporate the changes in the driver.
The goal is to engage with the community to understand whether the best approach
is to move that to the drivers that are using it, or to keep the helpers in place
waiting for Xe to get merged.

This item ties into the GPUVA, VM_BIND, and even long-running compute support.

As a key measurable result, we need to have a community consensus documented in
this document, and the Xe driver prepared for the changes, if necessary.

Dev_coredump
------------

Xe needs to align with other drivers on the way that error states are dumped,
avoiding a Xe-only error_state solution. The goal is to use the devcoredump
infrastructure to report error states, since it provides a standardized way of
doing so by exposing a virtual and temporary /sys/class/devcoredump device.

As the key measurable result, the Xe driver needs to provide GPU snapshots
captured at hang time through devcoredump, but without depending on any core
modification of the devcoredump infrastructure itself.

Later, when we are in-tree, the goal is to collaborate on possible improvements
to the devcoredump infrastructure, such as multiple file support for better
organization of the dumps, snapshot support, extra dmesg printing, and whatever
else may make sense and help the overall infrastructure.
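As a minimal sketch of the direction (the xe_devcoredump() name, the buffer size
and the snapshot contents are all hypothetical, not the actual Xe dump code),
reporting such a snapshot through the existing devcoredump core boils down to
handing it a vmalloc'ed buffer:

```
/*
 * Illustrative sketch only: dev_coredumpv() is the existing devcoredump
 * entry point; it takes ownership of the vmalloc'ed buffer and exposes it
 * through a temporary /sys/class/devcoredump/devcdN device until the dump
 * is consumed or times out.
 */
#include <linux/devcoredump.h>
#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/sizes.h>
#include <linux/vmalloc.h>

static void xe_devcoredump(struct device *dev)
{
	const size_t len = SZ_64K;	/* generous upper bound for the dump */
	char *buf;

	buf = vmalloc(len);
	if (!buf)
		return;

	/*
	 * Capture/format the GPU state at hang time. A real driver would
	 * walk engines, contexts, ring state, etc.; this is a placeholder.
	 */
	scnprintf(buf, len, "**** Xe Device Coredump ****\nreason: GPU hang\n");

	/* Hand the buffer over; devcoredump frees it (vfree) once consumed. */
	dev_coredumpv(dev, buf, len, GFP_KERNEL);
}
```

The actual snapshot contents and formatting are entirely up to the driver; the
key point of the goal above is that the transport is the standard devcoredump
device rather than a Xe-only error_state file.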
Xe – uAPI high level overview
=============================

...Warning: To be done in follow-up patches after/when/where the main consensus
on the various items above is individually reached.