Intel Graphics Device (IGD) assignment with vfio-pci
====================================================

IGD has two different modes for assignment using vfio-pci:

1) Universal Pass-Through (UPT) mode:

   In this mode the IGD device is added as a *secondary* (i.e. non-primary)
   graphics device in combination with an emulated primary graphics device.
   This mode *requires* guest driver support to remove the external
   dependencies generally associated with IGD (see below).  Those guest
   drivers only support this mode for Broadwell and newer IGD, according to
   Intel.  Additionally, this mode by default, and as officially supported
   by Intel, does not support direct video output.  The intention is to use
   this mode either to provide hardware acceleration to the emulated graphics
   or to use this mode in combination with guest-based remote access software,
   for example VNC (see below for optional output support).  This mode
   theoretically has no device specific handling dependencies on vfio-pci or
   the VM firmware.

2) "Legacy" mode:

   In this mode the IGD device is intended to be the primary and exclusive
   graphics device in the VM[1].  As such, QEMU does not facilitate any sort
   of remote graphics to the VM in this mode.  A connected physical monitor
   is the intended output device for IGD.
   This mode includes several requirements and restrictions:

    * IGD must be given address 02.0 on the PCI root bus in the VM
    * The host kernel must support vfio extensions for IGD (v4.6)
    * vfio VGA support very likely needs to be enabled in the host kernel
    * The VM firmware must support specific fw_cfg enablers for IGD
    * The VM machine type must support a PCI host bridge at 00.0 (standard)
    * The VM machine type must provide or allow to be created a special
      ISA/LPC bridge device (vfio-pci-igd-lpc-bridge) on the root bus at
      PCI address 1f.0.
    * The IGD device must have a VGA ROM, either provided via the romfile
      option or loaded automatically through vfio (standard).  rombar=0
      will disable legacy mode support.
    * Hotplug of the IGD device is not supported.
    * The IGD device must be a SandyBridge or newer model device.

For either mode, depending on the host kernel, the i915 driver in the host
may generate faults and errors upon re-binding to an IGD device after it
has been assigned to a VM.  It's therefore generally recommended to prevent
such driver binding unless the host driver is known to work well for this.
There are numerous ways to do this: i915 can be blacklisted on the host,
the driver_override option can be used to ensure that only vfio-pci can bind
to the device on the host[2], virsh nodedev-detach can be used to bind the
device to vfio drivers and then managed='no' set in the VM xml to prevent
re-binding to i915, etc.  Also note that IGD is typically the primary
graphics in the host, and special options may be required beyond simply
blacklisting i915 or using pci-stub/vfio-pci to take ownership of IGD as a
PCI class device.  Lower level drivers exist that may still claim the device.
It may therefore be necessary to use kernel boot options video=vesafb:off or
video=efifb:off (depending on host BIOS/UEFI), or these can be combined into
a catch-all, video=vesafb:off,efifb:off.  Error messages such as:

    Failed to mmap 0000:00:02.0 BAR <>. Performance may be slow

are a good indicator that such a problem exists.  The host files /proc/iomem
and /proc/ioports are often useful for identifying drivers that consume
ranges of the device and thereby cause such conflicts.

Additionally, IGD devices are known to generate small numbers of DMAR faults
when initially assigned.  It is believed that this is simply the IGD attempting
to access the reserved GTT space after reset, which it no longer has access to
when accessed from userspace.
So long as the DMAR faults are small in number and, most importantly, not
ongoing, these are not an indication of an error.

Furthermore, analog VGA output (as opposed to digital outputs like HDMI,
DVI, or DisplayPort) may be unsupported in some use cases.  In the author's
experience, even DP to VGA adapters can be troublesome while adapters between
digital formats work well.

Usage
=====
The intention is for IGD assignment to be transparent for users and thus for
management tools like libvirt.  To make use of legacy mode, simply remove all
other graphics options and use "-nographic" and either "-vga none" or
"-nodefaults", along with adding the device using vfio-pci:

    -device vfio-pci,host=00:02.0,id=hostdev0,bus=pci.0,addr=0x2

For UPT mode, retain the default emulated graphics and simply add the vfio-pci
device using any bus address other than 02.0.  libvirt will default to
assigning the device a UPT compatible address, while legacy mode users will
need to manually edit the XML if using a tool like virt-manager where the VM
device address is not expressly specified.

An experimental vfio-pci option also exists to enable OpRegion, and thus
external monitor support, for UPT mode.  This can be enabled by adding
"x-igd-opregion=on" to the vfio-pci device options for the IGD device.
As with legacy mode, this requires the host to support features introduced in
the v4.6 kernel.  If Intel chooses to embrace this support, the option may
be made non-experimental in the future, opening it to libvirt support.

Developer ABI
=============
Legacy mode IGD support imposes two fw_cfg requirements on the VM firmware:

1) "etc/igd-opregion"

   This fw_cfg file exposes the OpRegion for the IGD device.  A reserved
   region should be created below 4GB (recommended 4KB alignment), sized
   sufficiently for the fw_cfg file size, and the content of this file copied
   to it.  The dword based address of this reserved memory region must also
   be written to the ASLS register at offset 0xFC on the IGD device.  It is
   recommended that firmware should make use of this fw_cfg entry for any
   PCI class VGA device with Intel vendor ID.  Behavior with multiple such
   devices within a VM is undefined.

2) "etc/igd-bdsm-size"

   This fw_cfg file contains an 8-byte, little endian integer indicating
   the size of the reserved memory region required for IGD stolen memory.
   Firmware must allocate a reserved memory region of this size below 4GB,
   with the required 1MB alignment.
   Additionally, the base address of this reserved region must be written
   to the dword BDSM register in PCI config space of the IGD device at
   offset 0x5C.  As this support is related to running the IGD ROM, which
   has other dependencies on the device appearing at guest address 00:02.0,
   it's expected that this fw_cfg file is only relevant to a single PCI
   class VGA device with Intel vendor ID, appearing at PCI bus address
   00:02.0.

Footnotes
=========
[1] Nothing precludes adding additional emulated or assigned graphics devices
    as non-primary, other than the combination typically not working.  I only
    intend to set user expectations; others are welcome to find working
    combinations or fix whatever issues prevent this from working in the common
    case.
[2] # echo "vfio-pci" > /sys/bus/pci/devices/0000:00:02.0/driver_override
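
As an illustration of the two fw_cfg requirements in the Developer ABI
section, the address arithmetic can be sketched as follows.  This is only a
rough sketch, not how any particular firmware implements it: the helper names
(parse_bdsm_size, place_below), the 32MB stolen memory size, the 8KB OpRegion
size and the choice of the 4GB boundary as the top of the search range are all
assumptions made for the example.  Real firmware would use its own memory
allocator and avoid ranges already in use below 4GB.

```python
FOUR_GB = 1 << 32
MB = 1 << 20
KB4 = 1 << 12


def parse_bdsm_size(blob: bytes) -> int:
    """Decode the 8-byte little-endian size from "etc/igd-bdsm-size"."""
    if len(blob) != 8:
        raise ValueError("expected exactly 8 bytes")
    return int.from_bytes(blob, "little")


def place_below(top: int, size: int, alignment: int) -> int:
    """Return the highest aligned base such that base + size <= top.

    alignment must be a power of two.
    """
    if size <= 0 or size > top:
        raise ValueError("size does not fit below top")
    return (top - size) & ~(alignment - 1)


# Hypothetical inputs: a 32MB stolen memory size and an 8KB OpRegion file.
bdsm_size = parse_bdsm_size((32 * MB).to_bytes(8, "little"))

# Stolen memory: below 4GB, 1MB aligned.
bdsm_base = place_below(FOUR_GB, bdsm_size, MB)

# OpRegion: below the stolen memory region, 4KB aligned.
opregion_base = place_below(bdsm_base, 8 * 1024, KB4)

print(hex(bdsm_base))      # value to write to the BDSM register (0x5C)
print(hex(opregion_base))  # value to write to the ASLS register (0xFC)
```

Both results fit in a dword because the regions are placed below 4GB, which
is what allows them to be written to the 32-bit BDSM and ASLS registers.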