.. SPDX-License-Identifier: GPL-2.0+

=================================================================
Linux Base Driver for Intel(R) Ethernet Adaptive Virtual Function
=================================================================

Intel Ethernet Adaptive Virtual Function Linux driver.
Copyright(c) 2013-2018 Intel Corporation.

Contents
========

- Overview
- Identifying Your Adapter
- Additional Features and Configurations
- Known Issues/Troubleshooting
- Support

Overview
========

This file describes the iavf Linux Base Driver. This driver was formerly
called i40evf.

The iavf driver supports the virtual function devices listed below and
can only be activated on kernels running the i40e or newer Physical Function
(PF) driver compiled with CONFIG_PCI_IOV. The iavf driver requires
CONFIG_PCI_MSI to be enabled.

The guest OS loading the iavf driver must support MSI-X interrupts.

Identifying Your Adapter
========================

The driver in this kernel is compatible with devices based on the following:

  * Intel(R) XL710 X710 Virtual Function
  * Intel(R) X722 Virtual Function
  * Intel(R) XXV710 Virtual Function
  * Intel(R) Ethernet Adaptive Virtual Function

For the best performance, make sure the latest NVM/FW is installed on your
device.

For information on how to identify your adapter, and for the latest NVM/FW
images and Intel network drivers, refer to the Intel Support website:
https://www.intel.com/support


Additional Features and Configurations
======================================

Viewing Link Messages
---------------------
Link messages will not be displayed to the console if the distribution is
restricting system messages. In order to see network driver link messages on
your console, set the console log level to eight by entering the following::

  # dmesg -n 8

NOTE:
  This setting is not saved across reboots.

ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The latest ethtool
version is required for this functionality. Download it at:
https://www.kernel.org/pub/software/network/ethtool/

Setting VLAN Tag Stripping
--------------------------
If you have applications that require Virtual Functions (VFs) to receive
packets with VLAN tags, you can disable VLAN tag stripping for the VF. The
Physical Function (PF) processes requests issued from the VF to enable or
disable VLAN tag stripping. Note that if the PF has assigned a VLAN to a VF,
then requests from that VF to set VLAN tag stripping will be ignored.

To enable/disable VLAN tag stripping for a VF, issue the following command
from inside the VM in which you are running the VF::

  # ethtool -K <if_name> rxvlan on/off

or alternatively::

  # ethtool --offload <if_name> rxvlan on/off

Adaptive Virtual Function
-------------------------
Adaptive Virtual Function (AVF) allows the virtual function driver, or VF, to
adapt to changing feature sets of the physical function driver (PF) with which
it is associated.
This allows system administrators to update a PF without
having to update all the VFs associated with it. All AVFs have a single common
device ID and branding string.

AVFs have a minimum set of features known as "base mode," but may provide
additional features depending on what features are available in the PF with
which the AVF is associated. The following are base mode features:

- 4 Queue Pairs (QP) and associated Configuration Status Registers (CSRs)
  for Tx/Rx
- i40e descriptors and ring format
- Descriptor write-back completion
- 1 control queue, with i40e descriptors, CSRs and ring format
- 5 MSI-X interrupt vectors and corresponding i40e CSRs
- 1 Interrupt Throttle Rate (ITR) index
- 1 Virtual Station Interface (VSI) per VF
- 1 Traffic Class (TC), TC0
- Receive Side Scaling (RSS) with 64 entry indirection table and key,
  configured through the PF
- 1 unicast MAC address reserved per VF
- 16 MAC address filters for each VF
- Stateless offloads - non-tunneled checksums
- AVF device ID
- HW mailbox is used for VF to PF communications (including on Windows)

IEEE 802.1ad (QinQ) Support
---------------------------
The IEEE 802.1ad standard, informally known as QinQ, allows for multiple VLAN
IDs within a single Ethernet frame.
VLAN IDs are sometimes referred to as
"tags," and multiple VLAN IDs are thus referred to as a "tag stack." Tag stacks
allow L2 tunneling and the ability to segregate traffic within a particular
VLAN ID, among other uses.

The following are examples of how to configure 802.1ad (QinQ)::

  # ip link add link eth0 eth0.24 type vlan proto 802.1ad id 24
  # ip link add link eth0.24 eth0.24.371 type vlan proto 802.1Q id 371

Where "24" and "371" are example VLAN IDs.

NOTES:
  Receive checksum offloads, cloud filters, and VLAN acceleration are not
  supported for 802.1ad (QinQ) packets.

Application Device Queues (ADq)
-------------------------------
Application Device Queues (ADq) allows you to dedicate one or more queues to a
specific application. This can reduce latency for the specified application,
and allow Tx traffic to be rate limited per application. Follow the steps below
to set ADq.

Requirements:

- The sch_mqprio, act_mirred and cls_flower modules must be loaded
- The latest version of iproute2
- If another driver (for example, DPDK) has set cloud filters, you cannot
  enable ADq
- Depending on the underlying PF device, ADq cannot be enabled when the
  following features are enabled:

  + Data Center Bridging (DCB)
  + Multiple Functions per Port (MFP)
  + Sideband Filters

1. Create traffic classes (TCs). A maximum of 8 TCs can be created per
   interface. The shaper bw_rlimit parameter is optional.

Example: Sets up two tcs, tc0 and tc1, with 16 queues each and max tx rate set
to 1Gbit for tc0 and 3Gbit for tc1.

::

  # tc qdisc add dev <interface> root mqprio num_tc 2 map 0 0 0 0 1 1 1 1 \
    queues 16@0 16@16 hw 1 mode channel shaper bw_rlimit min_rate 1Gbit 2Gbit \
    max_rate 1Gbit 3Gbit

map: priority mapping for up to 16 priorities to tcs (e.g. map 0 0 0 0 1 1 1 1
sets priorities 0-3 to use tc0 and 4-7 to use tc1)

queues: for each tc, <num queues>@<offset> (e.g. queues 16@0 16@16 assigns
16 queues to tc0 at offset 0 and 16 queues to tc1 at offset 16. Max total
number of queues for all tcs is 64 or number of cores, whichever is lower.)
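
As a worked illustration of the map and queues parameters, the example command
above yields the layout below. This only restates the example values and does
not describe additional driver behavior::

  priorities 0-3 -> tc0 -> 16 queues at offset 0  (queues 0-15)
  priorities 4-7 -> tc1 -> 16 queues at offset 16 (queues 16-31)

Assuming a standard iproute2 installation, the resulting qdisc configuration
can usually be inspected with the tc show command (the output format depends
on the iproute2 version)::

  # tc qdisc show dev <interface>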

hw 1 mode channel: ‘channel’ with ‘hw’ set to 1 is a new hardware
offload mode in mqprio that makes full use of the mqprio options, the
TCs, the queue configurations, and the QoS parameters.

shaper bw_rlimit: for each tc, sets minimum and maximum bandwidth rates.
Totals must be less than or equal to the port speed.

For example: min_rate 1Gbit 3Gbit. Verify the bandwidth limit using network
monitoring tools such as ``ifstat`` or ``sar -n DEV [interval] [number of samples]``.

NOTE:
  Setting up channels via ethtool (ethtool -L) is not supported when the
  TCs are configured using mqprio.

2. Enable HW TC offload on interface::

    # ethtool -K <interface> hw-tc-offload on

3. Apply TCs to ingress (RX) flow of interface::

    # tc qdisc add dev <interface> ingress

NOTES:
  - Run all tc commands from the iproute2 <pathtoiproute2>/tc/ directory
  - ADq is not compatible with cloud filters
  - Setting up channels via ethtool (ethtool -L) is not supported when the TCs
    are configured using mqprio
  - You must have the latest version of iproute2
  - NVM version 6.01 or later is required
  - ADq cannot be enabled when any of the following features are enabled: Data
    Center Bridging (DCB), Multiple Functions per Port (MFP), or Sideband Filters
  - If another driver (for example, DPDK) has set cloud filters, you cannot
    enable ADq
  - Tunnel filters are not supported in ADq. If encapsulated packets do arrive
    in non-tunnel mode, filtering will be done on the inner headers. For example,
    for VXLAN traffic in non-tunnel mode, PCTYPE is identified as a VXLAN
    encapsulated packet, outer headers are ignored. Therefore, inner headers are
    matched.
  - If a TC filter on a PF matches traffic over a VF (on the PF), that traffic
    will be routed to the appropriate queue of the PF, and will not be passed on
    to the VF. Such traffic will end up getting dropped higher up in the TCP/IP
    stack as it does not match PF address data.
  - If traffic matches multiple TC filters that point to different TCs, that
    traffic will be duplicated and sent to all matching TC queues. The hardware
    switch mirrors the packet to a VSI list when multiple filters are matched.


Known Issues/Troubleshooting
============================

Bonding fails with VFs bound to an Intel(R) Ethernet Controller 700 series device
---------------------------------------------------------------------------------
If you bind Virtual Functions (VFs) to an Intel(R) Ethernet Controller 700
series based device, the VF slaves may fail when they become the active slave.
If the MAC address of the VF is set by the PF (Physical Function) of the
device, when you add a slave, or change the active-backup slave, Linux bonding
tries to sync the backup slave's MAC address to the same MAC address as the
active slave. Linux bonding will fail at this point. This issue will not occur
if the VF's MAC address is not set by the PF.

Traffic Is Not Being Passed Between VM and Client
-------------------------------------------------
You may not be able to pass traffic between a client system and a
Virtual Machine (VM) running on a separate host if the Virtual Function
(VF, or Virtual NIC) is not in trusted mode and spoof checking is enabled
on the VF.
Note that this situation can occur in any combination of client,
host, and guest operating system. For information on how to set the VF to
trusted mode, refer to the section "VLAN Tag Packet Steering" in this
readme document. For information on setting spoof checking, refer to the
section "MAC and VLAN anti-spoofing feature" in this readme document.

Do not unload port driver if VF with active VM is bound to it
-------------------------------------------------------------
Do not unload a port's driver if a Virtual Function (VF) with an active Virtual
Machine (VM) is bound to it. Doing so will cause the port to appear to hang.
Once the VM shuts down, or otherwise releases the VF, the command will complete.

Using four traffic classes fails
--------------------------------
Do not try to reserve more than three traffic classes in the iavf driver. Doing
so will fail to set any traffic classes and will cause the driver to write
errors to stdout. Use a maximum of three queues to avoid this issue.

Multiple log error messages on iavf driver removal
--------------------------------------------------
If you have several VFs and you remove the iavf driver, several instances of
the following errors are written to the log::

  Unable to send opcode 2 to PF, err I40E_ERR_QUEUE_EMPTY, aq_err ok
  Unable to send the message to VF 2 aq_err 12
  ARQ Overflow Error detected

Virtual machine does not get link
---------------------------------
If the virtual machine has more than one virtual port assigned to it, and those
virtual ports are bound to different physical ports, you may not get link on
all of the virtual ports. The following command may work around the issue::

  # ethtool -r <PF>

Where <PF> is the PF interface in the host, for example: p5p1. You may need to
run the command more than once to get link on all virtual ports.

MAC address of Virtual Function changes unexpectedly
----------------------------------------------------
If a Virtual Function's MAC address is not assigned in the host, then the VF
(virtual function) driver will use a random MAC address. This random MAC
address may change each time the VF driver is reloaded. You can assign a static
MAC address in the host machine. This static MAC address will survive
a VF driver reload.
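
A minimal sketch of assigning a static MAC address from the host, assuming the
PF interface is managed with the iproute2 ``ip`` tool. The ``<PF>``,
``<vf_index>`` and ``<mac_address>`` placeholders are illustrative and must be
replaced with values for your system::

  # ip link set <PF> vf <vf_index> mac <mac_address>

The assigned address should then appear in the per-VF lines of::

  # ip link show <PF>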

Driver Buffer Overflow Fix
--------------------------
The fix to resolve CVE-2016-8105, referenced in Intel SA-00069
https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00069.html
is included in this and future versions of the driver.

Multiple Interfaces on Same Ethernet Broadcast Network
------------------------------------------------------
Due to the default ARP behavior on Linux, it is not possible to have one system
on two IP networks in the same Ethernet broadcast domain (non-partitioned
switch) behave as expected. All Ethernet interfaces will respond to IP traffic
for any IP address assigned to the system. This results in unbalanced receive
traffic.

If you have multiple interfaces in a server, turn on ARP filtering by
entering::

  # echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter

NOTE:
  This setting is not saved across reboots. The configuration change can be
  made permanent by adding the following line to the file /etc/sysctl.conf::

    net.ipv4.conf.all.arp_filter = 1

Another alternative is to install the interfaces in separate broadcast domains
(either in different switches or in a switch partitioned to VLANs).
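
Where the ``sysctl`` utility is available, the arp_filter setting described
above can also be inspected and applied through it. This is a sketch of an
equivalent approach, not an additional requirement::

  # sysctl net.ipv4.conf.all.arp_filter
  # sysctl -w net.ipv4.conf.all.arp_filter=1

After editing /etc/sysctl.conf, the file can be reloaded without a reboot::

  # sysctl -p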

Rx Page Allocation Errors
-------------------------
'Page allocation failure. order:0' errors may occur under stress.
This is caused by the way the Linux kernel reports this stressed condition.


Support
=======
For general information, go to the Intel support website at:
https://support.intel.com

If an issue is identified with the released source code on the supported kernel
with a supported adapter, email the specific information related to the issue
to intel-wired-lan@lists.osuosl.org.