.. SPDX-License-Identifier: GPL-2.0

======================
Hyper-V network driver
======================

Compatibility
=============

This driver is compatible with Windows Server 2012 R2, 2016 and
Windows 10.

Features
========

Checksum offload
----------------
  The netvsc driver supports checksum offload as long as the
  Hyper-V host version does. Windows Server 2016 and Azure
  support checksum offload for TCP and UDP for both IPv4 and
  IPv6. Windows Server 2012 only supports checksum offload for TCP.
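
  For example, the current offload settings can be inspected, and checksum
  offload toggled, with ethtool (eth0 is used here only as an example
  interface name)::

	ethtool -k eth0 | grep checksumming
	ethtool -K eth0 tx on rx on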

Receive Side Scaling
--------------------
  Hyper-V supports receive side scaling. For TCP & UDP, packets can
  be distributed among available queues based on IP address and port
  number.

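  For example, the number of receive queues (channels) and the RSS
  indirection table can be inspected with ethtool (eth0 is used here only
  as an example interface name)::

	ethtool -l eth0
	ethtool -x eth0
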
  For TCP & UDP, the hash level can be switched between L3 and L4 with
  an ethtool command, and TCP/UDP over IPv4 and IPv6 can be configured
  differently. The default hash level is L4. Currently, only the TX
  hash level can be switched from within the guest.

  On Azure, fragmented UDP packets have a high loss rate with L4
  hashing. Using L3 hashing is recommended in this case.

  For example, for UDP over IPv4 on eth0:

  To include UDP port numbers in hashing::

	ethtool -N eth0 rx-flow-hash udp4 sdfn

  To exclude UDP port numbers from hashing::

	ethtool -N eth0 rx-flow-hash udp4 sd

  To show the UDP hash level::

	ethtool -n eth0 rx-flow-hash udp4

Generic Receive Offload, aka GRO
--------------------------------
  The driver supports GRO, which is enabled by default. GRO coalesces
  similar packets and significantly reduces CPU usage under heavy Rx
  load.
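
  For example, GRO can be checked and, if desired, disabled with ethtool
  (assuming the interface is eth0)::

	ethtool -k eth0 | grep generic-receive-offload
	ethtool -K eth0 gro off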

Large Receive Offload (LRO), or Receive Side Coalescing (RSC)
-------------------------------------------------------------
  The driver supports the vSwitch LRO/RSC feature, which reduces per-packet
  processing overhead by coalescing multiple TCP segments when possible. The
  feature is enabled by default on VMs running on Windows Server 2019 and
  later. It may be changed with an ethtool command::

	ethtool -K eth0 lro on
	ethtool -K eth0 lro off
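
  The current LRO state is reported among the ethtool offload settings and
  can be checked with, for example::

	ethtool -k eth0 | grep large-receive-offload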

SR-IOV support
--------------
  Hyper-V supports SR-IOV as a hardware acceleration option. If SR-IOV
  is enabled in both the vSwitch and the guest configuration, then the
  Virtual Function (VF) device is passed to the guest as a PCI
  device. In this case, both a synthetic (netvsc) and a VF device are
  visible in the guest OS, and both NICs have the same MAC address.

  The VF is enslaved by the netvsc device.  The netvsc driver will transparently
  switch the data path to the VF when it is available and up.
  Network state (addresses, firewall, etc.) should be applied only to the
  netvsc device; in most cases the slave device should not be accessed
  directly.  The exceptions are when a special queue discipline or
  flow direction is desired; these should be applied directly to the
  VF slave device.
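
  For example, the enslaved VF can be identified by its matching MAC address
  with ip link, and a queue discipline can be attached directly to it
  (the VF interface name enP1s1 below is only illustrative)::

	ip -d link show
	tc qdisc add dev enP1s1 root fq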

Receive Buffer
--------------
  Packets are received into a receive area which is created when the device
  is probed. The receive area is broken into MTU-sized chunks, each of which
  may contain one or more packets. The number of receive sections may be
  changed via the ethtool Rx ring parameters.
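
  For example, the current and maximum ring parameters can be queried, and
  the number of receive sections increased, with ethtool (eth0 and the value
  1024 are only illustrative)::

	ethtool -g eth0
	ethtool -G eth0 rx 1024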

  There is a similar send buffer which is used to aggregate packets
  for sending.  The send area is broken into chunks, typically of 6144
  bytes, and each section may contain one or more packets. Small
  packets are usually transmitted by copying them into the send buffer. However,
  if the buffer is temporarily exhausted, or the packet to be transmitted is
  an LSO packet, the driver will provide the host with pointers to the data
  from the SKB. This attempts to achieve a balance between the overhead of
  the data copy and the impact of remapping VM memory to be accessible by the
  host.

XDP support
-----------
  XDP (eXpress Data Path) is a feature that runs eBPF bytecode at an early
  stage, when packets arrive at the NIC. The goal is to increase packet
  processing performance by reducing the overhead of SKB allocation and other
  upper network layers.

  hv_netvsc supports XDP in native mode, and transparently sets the XDP
  program on the associated VF NIC as well.

  Setting / unsetting an XDP program on the synthetic NIC (netvsc) propagates to
  the VF NIC automatically. Setting / unsetting an XDP program directly on the
  VF NIC is not recommended; it is not propagated to the synthetic NIC and may
  be overwritten by the setting on the synthetic NIC.
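
  For example, an XDP program can be attached to and detached from the
  synthetic NIC with iproute2 (xdp_prog.o is a placeholder for a compiled
  eBPF object file)::

	ip link set dev eth0 xdp obj xdp_prog.o sec xdp
	ip link set dev eth0 xdp off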

  An XDP program cannot run with LRO (RSC) enabled, so LRO needs to be disabled
  before running XDP::

	ethtool -K eth0 lro off

  The XDP_REDIRECT action is not yet supported.