PCI SR/IOV EMULATION SUPPORT
============================

Description
===========
SR/IOV (Single Root I/O Virtualization) is an optional extended capability
of a PCI Express device. It allows a single physical function (PF) to
appear as multiple virtual functions (VFs), mainly to eliminate software
overhead in I/O from virtual machines.

QEMU now implements the basic common functionality to enable an emulated
device to support SR/IOV.

Implementation
==============
Implementing emulation of an SR/IOV capable device typically consists of
implementing support for two types of device classes: the "normal" physical
device (PF) and the virtual device (VF). From QEMU's perspective, the VFs
are just like other devices, except that some of their properties are
derived from the PF.

A virtual function differs from a physical function in that the BAR
space for all VFs is defined by the BAR registers in the PF's SR/IOV
capability. All VFs have the same BARs and BAR sizes.

The address of an access to one of these virtual BARs is then computed as

   <VF BAR start> + <VF number> * <BAR sz> + <offset>

From our emulation perspective this means that there is a separate call for
setting up a BAR for a VF.
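
The address formula above can be sketched in C as follows. This is only an
illustration of the arithmetic, not code from QEMU: the function names
vf_bar_addr() and vf_bar_owner() are hypothetical, and the VF number is
assumed to be counted from 0.

```c
#include <stdint.h>

/*
 * Illustrative only: address of `offset` inside the BAR of VF `vf_num`.
 * Assumes vf_num counts from 0 and that every VF has the same bar_size,
 * as described in the text above.
 */
static uint64_t vf_bar_addr(uint64_t vf_bar_start, uint32_t vf_num,
                            uint64_t bar_size, uint64_t offset)
{
    return vf_bar_start + (uint64_t)vf_num * bar_size + offset;
}

/*
 * Inverse direction: which VF (0-based) owns an address that lies inside
 * the VF BAR window. The caller must ensure addr >= vf_bar_start and
 * bar_size != 0.
 */
static uint32_t vf_bar_owner(uint64_t vf_bar_start, uint64_t bar_size,
                             uint64_t addr)
{
    return (uint32_t)((addr - vf_bar_start) / bar_size);
}
```

For example, with a VF BAR window starting at 0xfe000000 and a BAR size of
0x4000, offset 0x10 in the BAR of VF number 2 is at 0xfe008010.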

1) To enable SR/IOV support in the PF, it must be a PCI Express device, so
   you need to add a PCI Express capability to the normal PCI
   capability list. You might also want to add an ARI (Alternative
   Routing-ID Interpretation) capability to indicate that your device
   supports functions beyond its "own" function space (0-7). This is
   necessary to support more than 8 functions in total, or
   if functions extend beyond function number 7 because they are placed at
   an offset > 1 or have stride > 1.

   ...
   #include "hw/pci/pcie.h"
   #include "hw/pci/pcie_sriov.h"

   pci_your_pf_dev_realize( ... )
   {
      ...
      int ret = pcie_endpoint_cap_init(d, 0x70);
      ...
      pcie_ari_init(d, 0x100);
      ...

      /* Add and initialize the SR/IOV capability */
      pcie_sriov_pf_init(d, 0x200, "your_virtual_dev",
                         vf_devid, initial_vfs, total_vfs,
                         fun_offset, stride);

      /* Set up individual VF BARs (parameters as for normal BARs) */
      pcie_sriov_pf_init_vf_bar( ... );
      ...
   }

   For cleanup, you simply call:

      pcie_sriov_pf_exit(device);

   which will delete all the virtual functions and associated resources.

2) Similarly, in the implementation of the virtual function, you need to
   make it a PCI Express device and add a similar set of capabilities,
   except for the SR/IOV capability. Then you need to set up the VF BARs as
   subregions of the PF's SR/IOV VF BARs by calling
   pcie_sriov_vf_register_bar() instead of the normal pci_register_bar()
   call:

   pci_your_vf_dev_realize( ... )
   {
      ...
      int ret = pcie_endpoint_cap_init(d, 0x60);
      ...
      pcie_ari_init(d, 0x100);
      ...
      memory_region_init(mr, ... );
      pcie_sriov_vf_register_bar(d, bar_nr, mr);
      ...
   }

Testing on Linux guest
======================
Testing is easiest if your device driver supports sysfs-based SR/IOV
enabling. Support for this was added in kernel v3.8, so not all drivers
support it yet.

To enable 4 VFs for a device at 01:00.0:

    modprobe yourdriver
    echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

You should now see 4 VFs with lspci.
To turn SR/IOV off again (the standard requires you to turn it off before
you can enable another VF count, and the emulation enforces this):

    echo 0 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

Older drivers typically provide a max_vfs module parameter
to enable SR/IOV at load time:

    modprobe yourdriver max_vfs=4

To disable the VFs again in that case, you simply have to unload the driver:

    rmmod yourdriver
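
When checking the lspci output during testing, it can help to know where
each VF is expected to appear. The sketch below assumes the usual SR/IOV
rule that a VF's routing ID is the PF's routing ID plus the first-VF offset
plus the VF index times the stride, with the fun_offset and stride values
passed to pcie_sriov_pf_init(). The helper names vf_routing_id() and
format_bdf() are hypothetical and not part of QEMU; the VF index is counted
from 0 here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative only: routing ID (bus << 8 | devfn) of the VF with the
 * given 0-based index, derived from the PF's routing ID, the first-VF
 * offset and the VF stride.
 */
static uint16_t vf_routing_id(uint16_t pf_rid, uint16_t first_vf_offset,
                              uint16_t vf_stride, uint16_t vf_index)
{
    return (uint16_t)(pf_rid + first_vf_offset + vf_index * vf_stride);
}

/*
 * Format a routing ID the way lspci prints it: bus:device.function,
 * using the classic 8-bit bus / 5-bit device / 3-bit function split.
 */
static void format_bdf(uint16_t rid, char *buf, size_t len)
{
    snprintf(buf, len, "%02x:%02x.%x",
             (unsigned)(rid >> 8),
             (unsigned)((rid >> 3) & 0x1f),
             (unsigned)(rid & 0x7));
}
```

With a PF at 01:00.0 (routing ID 0x0100), an offset of 1 and a stride of 1,
the fourth VF (index 3) gets routing ID 0x0104, which lspci shows as
01:00.4. With a larger offset or stride the VFs land beyond function 7,
which is the case where the ARI capability from step 1 is needed.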