ebbf7c60 | 15-Nov-2024 |
Cédric Le Goater <clg@redhat.com> |
vfio/container: Fix container object destruction
When commit 96b7af4388b3 introduced a .instance_finalize() handler, it did not take into account that the container was not necessarily inserted into the container list of the address space. Hence, if the container object is destroyed, by calling object_unref() for example, before vfio_address_space_insert() is called, QEMU may crash when removing the container from the list as done in vfio_container_instance_finalize(). This was seen with an SEV-SNP guest for which discarding of RAM fails.
To resolve this issue, use the safe version of QLIST_REMOVE().
Cc: Zhenzhong Duan <zhenzhong.duan@intel.com>
Cc: Eric Auger <eric.auger@redhat.com>
Fixes: 96b7af4388b3 ("vfio/container: Move vfio_container_destroy() to an instance_finalize() handler")
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
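The failure mode described above can be illustrated with a small sketch. The list type and helpers below are simplified stand-ins written for this example (`struct container`, `list_insert_head()` and `list_safe_remove()` are hypothetical names, not the macros from QEMU's `qemu/queue.h`). The assumption is that a "safe" removal skips the unlink when the element's back-pointer is NULL, which is the state of a zero-initialised element that was never inserted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified BSD-queue style list element. */
struct container {
    int id;
    struct container *next;
    struct container **prev;   /* address of the pointer that points at us */
};

struct container_list {
    struct container *first;
};

static void list_insert_head(struct container_list *l, struct container *c)
{
    assert(c->prev == NULL);    /* must not already be on a list */
    c->next = l->first;
    if (l->first) {
        l->first->prev = &c->next;
    }
    l->first = c;
    c->prev = &l->first;
}

/*
 * Safe removal: an element that was never inserted (prev == NULL) is left
 * alone instead of dereferencing a NULL back-pointer.
 * Returns 1 if the element was unlinked, 0 if it was not on a list.
 */
static int list_safe_remove(struct container *c)
{
    if (c->prev == NULL) {
        return 0;
    }
    if (c->next) {
        c->next->prev = c->prev;
    }
    *c->prev = c->next;
    c->next = NULL;
    c->prev = NULL;             /* makes a second removal a no-op too */
    return 1;
}
```

With an unconditional removal, the `*c->prev = c->next` store would dereference a NULL pointer for a container destroyed before insertion, which matches the kind of crash the commit message describes.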
|
66650fd0 | 08-Nov-2024 |
Corvin Köhne <c.koehne@beckhoff.com> |
vfio/igd: fix calculation of graphics stolen memory
When copying the calculation of the stolen memory size for Intel's integrated graphics device of gen 9 and later from the Linux kernel [1], we missed subtracting 0xf0 from the graphics mode select value for values above 0xf0. This leads to QEMU reporting a very large size of the graphics stolen memory area. That's just a waste of memory. Additionally, the guest firmware might be unable to allocate such a large buffer.
[1] https://github.com/torvalds/linux/blob/7c626ce4bae1ac14f60076d00eafe71af30450ba/arch/x86/kernel/early-quirks.c#L455-L460
Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Fixes: 871922416683 ("vfio/igd: correctly calculate stolen memory size for gen 9 and later")
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
[ clg: Changed commit subject ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
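A minimal sketch of the corrected decoding, modelled on the Linux early-quirks code cited as [1]. The function name and the `MiB` macro are chosen for this example and are not taken from the QEMU source; the encoding assumed here is 32 MiB steps below 0xf0 and 4 MiB steps, starting at 4 MiB, from 0xf0 upwards.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/*
 * Decode the graphics mode select (GMS) value into a stolen memory size
 * in bytes. Hypothetical helper name; encoding assumed from reference [1].
 */
static uint64_t gen9_stolen_size(uint32_t gms)
{
    assert(gms <= 0xff);        /* GMS is treated as an 8-bit field here */

    if (gms < 0xf0) {
        /* 0x00..0xef: 32 MiB increments starting at 0 MiB */
        return gms * 32 * MiB;
    }
    /* 0xf0 and above: 4 MiB increments starting at 4 MiB */
    return (uint64_t)(gms - 0xf0 + 1) * 4 * MiB;
}
```

Under this encoding a value of 0xf0 decodes to 4 MiB. If the 0xf0 offset were simply omitted from the second branch, the same value would decode to 964 MiB, which illustrates the "very large size" the message refers to.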
|
00b519c0 | 22-Oct-2024 |
Alex Williamson <alex.williamson@redhat.com> |
vfio/helpers: Align mmaps
Thanks to work by Peter Xu, support is introduced in Linux v6.12 to allow pfnmap insertions at PMD and PUD levels of the page table. This means that provided a properly aligned mmap, the vfio driver is able to map MMIO at significantly larger intervals than PAGE_SIZE. For example on x86_64 (the only architecture currently supporting huge pfnmaps for PUD), rather than 4KiB mappings, we can map device MMIO using 2MiB and even 1GiB page table entries.
Typically mmap will already provide PMD aligned mappings, so devices with moderately sized MMIO ranges, even GPUs with standard 256MiB BARs, will already take advantage of this support. However in order to better support devices exposing multi-GiB MMIO, such as 3D accelerators or GPUs with resizable BARs enabled, we need to manually align the mmap.
There doesn't seem to be a way for userspace to easily learn about PMD and PUD mapping level sizes, therefore this takes the simple approach to align the mapping to the power-of-two size of the region, up to 1GiB, which is currently the maximum alignment we care about.
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
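The alignment approach described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the helper names are hypothetical, anonymous `PROT_NONE` memory stands in for the device region, and `size` is assumed to be a multiple of the page size. The idea is to over-reserve address space by the alignment, pick the aligned window inside the reservation, and release the unused head and tail.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define GiB (1024ULL * 1024ULL * 1024ULL)

/* Round v (>= 1) up to the next power of two. */
static uint64_t pow2_ceil(uint64_t v)
{
    uint64_t p = 1;
    while (p < v) {
        p <<= 1;
    }
    return p;
}

/* Alignment policy from the message: power-of-two size, capped at 1 GiB. */
static uint64_t mmap_alignment(uint64_t size)
{
    uint64_t align = pow2_ceil(size);
    return align > GiB ? GiB : align;
}

/*
 * Reserve an 'align'-aligned window of 'size' bytes of address space.
 * Returns NULL on failure. 'size' is assumed to be page-size aligned.
 */
static void *reserve_aligned(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    uint8_t *base = mmap(NULL, size + align, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return NULL;
    }

    uintptr_t start = ((uintptr_t)base + align - 1) & ~((uintptr_t)align - 1);
    size_t head = start - (uintptr_t)base;   /* unused bytes before window */
    size_t tail = align - head;              /* unused bytes after window  */

    if (head) {
        munmap(base, head);
    }
    munmap((void *)(start + size), tail);
    return (void *)start;
}
```

A caller would then typically map the device region over the returned window, for example with `mmap(addr, size, prot, MAP_SHARED | MAP_FIXED, fd, offset)`, so that the final mapping inherits the alignment.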
|
49915c0d | 22-Oct-2024 |
Alex Williamson <alex.williamson@redhat.com> |
vfio/helpers: Refactor vfio_region_mmap() error handling
Move error handling code to the end of the function so that it can more easily be shared by new mmap failure conditions. No functional change intended.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
|
fa4e20de | 20-Oct-2024 |
Avihai Horon <avihaih@nvidia.com> |
vfio/migration: Change trace formats from hex to decimal
Data sizes in VFIO migration trace events are printed in hex format while in migration core trace events they are printed in decimal format.
This inconsistency makes it less readable when using both trace event types. Hence, change the data sizes print format to decimal in VFIO migration trace events.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
|
87192241 | 28-Aug-2024 |
Corvin Köhne <corvin.koehne@gmail.com> |
vfio/igd: correctly calculate stolen memory size for gen 9 and later
We have to update the calculation of the stolen memory size because we've seen devices using values of 0xf0 and above for the graphics mode select field. The new calculation was taken from the Linux kernel [1].
[1] https://github.com/torvalds/linux/blob/7c626ce4bae1ac14f60076d00eafe71af30450ba/arch/x86/kernel/early-quirks.c#L455-L460
Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
|
971ca22f | 28-Aug-2024 |
Corvin Köhne <corvin.koehne@gmail.com> |
vfio/igd: don't set stolen memory size to zero
The stolen memory is required for the GOP (EFI) driver and the Windows driver. While the GOP driver seems to work with any stolen memory size, the Windows driver will crash if the size doesn't match the size allocated by the host BIOS. For that reason, it doesn't make sense to overwrite the stolen memory size. It's true that this wastes some VM memory. In the worst case, the stolen memory can take up more than a GB. However, that's uncommon. Additionally, it's likely that a bunch of RAM is assigned to VMs making use of GPU passthrough.
Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
|
9c86b9fb | 28-Aug-2024 |
Corvin Köhne <corvin.koehne@gmail.com> |
vfio/igd: add ID's for ElkhartLake and TigerLake
ElkhartLake and TigerLake devices were tested in legacy mode with Linux and Windows VMs. Both are working properly. It's likely that other Intel GPUs of gen 11 and 12, such as IceLake devices, are working too. However, we're only adding known good devices for now.
Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
|
11b5ce95 | 28-Aug-2024 |
Corvin Köhne <corvin.koehne@gmail.com> |
vfio/igd: add new bar0 quirk to emulate BDSM mirror
The BDSM register is mirrored into MMIO space at least for gen 11 and later devices. Unfortunately, the Windows driver reads the register value from MMIO space instead of PCI config space for those devices [1]. Therefore, we either have to keep a 1:1 mapping for the host and guest address or we have to emulate the MMIO register too. Using the igd in legacy mode is already hard due to its many constraints. Keeping a 1:1 mapping may not work in all cases and makes it even harder to use. An MMIO emulation has to trap the whole MMIO page. This makes accesses to this page slower compared to using second level address translation. Nevertheless, it doesn't have any constraints and I haven't noticed any performance degradation yet, making it a better solution.
[1] https://github.com/projectacrn/acrn-hypervisor/blob/5c351bee0f6ae46250eefc07f44b4a31e770f3cf/devicemodel/hw/pci/passthrough.c#L650-L653
Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
|
7bafcd17 | 28-Aug-2024 |
Corvin Köhne <corvin.koehne@gmail.com> |
vfio/igd: use new BDSM register location and size for gen 11 and later
Intel changed the location and size of the BDSM register for gen 11 devices and later. We have to adjust our emulation for these devices to properly support them.
Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
|
abd9dda9 | 28-Aug-2024 |
Corvin Köhne <corvin.koehne@gmail.com> |
vfio/igd: support legacy mode for all known generations
We're soon going to add support for legacy mode to ElkhartLake and TigerLake devices. Those are gen 11 and 12 devices. At the moment, all devices identified by our igd_gen function do support legacy mode. This won't change when adding our new devices of gen 11 and 12. Therefore, it makes more sense to accept legacy mode for all known devices instead of maintaining a long list of known good generations. If we add a new generation to igd_gen which doesn't support legacy mode for some reason, it'll be easy to advance the check to reject legacy mode for this specific generation.
Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
|
e433f208 | 28-Aug-2024 |
Corvin Köhne <corvin.koehne@gmail.com> |
vfio/igd: return an invalid generation for unknown devices
Intel changes its specification quite often, e.g. the location and size of the BDSM register has changed for gen 11 devices and later. This causes our emulation to fail on those devices. So, it's impossible for us to use a suitable default value for unknown devices. Instead of returning a random generation value and hoping that everything works fine, we should verify that different devices are working and add them to our list of known devices.
Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
|
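The policy described in the two commits above (return an invalid generation for unknown devices, and accept legacy mode for every known generation) can be sketched as below. Everything here is illustrative: the device IDs are placeholders rather than real IGD device IDs, and `igd_gen_sketch()` and `legacy_mode_allowed()` are hypothetical names, not the functions in the QEMU source.

```c
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL table: placeholder IDs, not real IGD device IDs. */
struct igd_known_device {
    uint16_t device_id;
    int gen;
};

static const struct igd_known_device known_devices[] = {
    { 0x1111, 9 },
    { 0x2222, 11 },
    { 0x3333, 12 },
};

/* Return the generation of a known device, or -1 for anything unknown. */
static int igd_gen_sketch(uint16_t device_id)
{
    size_t n = sizeof(known_devices) / sizeof(known_devices[0]);

    for (size_t i = 0; i < n; i++) {
        if (known_devices[i].device_id == device_id) {
            return known_devices[i].gen;
        }
    }
    return -1;  /* no default guess: unknown devices are rejected */
}

/* Legacy mode is accepted for every known generation. */
static int legacy_mode_allowed(uint16_t device_id)
{
    return igd_gen_sketch(device_id) != -1;
}
```

With this shape, rejecting legacy mode for one specific future generation would only need an extra comparison in `legacy_mode_allowed()` rather than a list of known good generations.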
30b91677 | 22-Jul-2024 |
Joao Martins <joao.m.martins@oracle.com> |
vfio/common: Allow disabling device dirty page tracking
The property 'x-pre-copy-dirty-page-tracking' allows disabling the whole tracking of VF pre-copy phase of dirty page tracking, though it means that it will only be used at the start of the switchover phase.
Add an option that disables the VF dirty page tracking and falls back to container-based dirty page tracking. This also allows using IOMMU dirty tracking even on VFs with their own dirty tracker scheme.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
|
f48b4724 | 22-Jul-2024 |
Joao Martins <joao.m.martins@oracle.com> |
vfio/migration: Don't block migration device dirty tracking is unsupported
By default VFIO migration is set to auto, which will support live migration if the migration capability is set *and* also dirty page tracking is supported.
One can force-enable migration without dirty page tracking via enable-migration=on, but that option is generally left for testing purposes.
Now that IOMMU dirty tracking is available, it can be used to compensate for the lack of VF dirty page tracking. This minimizes the VF requirements for migration and thus enables migration by default for those devices too.
While at it change the error messages to mention IOMMU dirty tracking as well.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
[ clg: - spelling in commit log ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
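The decision described in this commit can be summarised with a small sketch. The enum, the function name and the parameters are hypothetical and only model the rule stated in the message; the treatment of a device without the migration capability is an assumption of this example.

```c
#include <stdbool.h>

/* Hypothetical model of the enable-migration property values. */
enum migration_setting {
    MIGRATION_AUTO,
    MIGRATION_ON,
    MIGRATION_OFF,
};

/*
 * Decide whether migration is allowed for a device.
 * - OFF never allows it.
 * - ON forces it even without dirty tracking (mainly for testing).
 * - AUTO requires some form of dirty tracking; after this change IOMMU
 *   dirty tracking is assumed to count as well as device dirty tracking.
 */
static bool migration_allowed(enum migration_setting setting,
                              bool has_migration_cap,
                              bool device_dirty_tracking,
                              bool iommu_dirty_tracking)
{
    if (!has_migration_cap || setting == MIGRATION_OFF) {
        return false;
    }
    if (setting == MIGRATION_ON) {
        return true;
    }
    return device_dirty_tracking || iommu_dirty_tracking;
}
```

In this model, a VF without its own dirty tracker is still migratable by default as long as the IOMMU can track dirty pages.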
7c30710b | 22-Jul-2024 |
Joao Martins <joao.m.martins@oracle.com> |
vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap support
ioctl(iommufd, IOMMU_HWPT_GET_DIRTY_BITMAP, arg) is the UAPI that fetches the bitmap that tells what was dirty in an IOVA range.
A single bitmap is allocated and used across all the hwpts sharing an IOAS, which is then used in log_sync() to set the QEMU global bitmaps.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
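As a rough illustration of the bitmap that such a query fills in, the sketch below sizes a dirty bitmap for an IOVA range and tests a single page. It assumes one bit per page, stored in 64-bit words with bit 0 of word 0 describing the first page of the range; the helper names are hypothetical and the real layout should be checked against the iommufd UAPI documentation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed for a dirty bitmap covering 'length' bytes of IOVA space
 * at 'page_size' granularity, rounded up to whole 64-bit words.
 */
static uint64_t dirty_bitmap_bytes(uint64_t length, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uint64_t pages = (length + page_size - 1) / page_size;
    uint64_t words = (pages + 63) / 64;
    return words * sizeof(uint64_t);
}

/* Return 1 if the page containing 'iova' is marked dirty, 0 otherwise. */
static int page_is_dirty(const uint64_t *bitmap, uint64_t iova_base,
                         uint64_t page_size, uint64_t iova)
{
    assert(iova >= iova_base);

    uint64_t bit = (iova - iova_base) / page_size;
    return (int)((bitmap[bit / 64] >> (bit % 64)) & 1);
}
```

Sharing one such bitmap across all hwpts of an IOAS, as the message describes, means each query can be accumulated into the same buffer before it is folded into the global bitmaps.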
|