.. SPDX-License-Identifier: GPL-2.0

===========================================
HOWTO for multiqueue network device support
===========================================

Section 1: Base driver requirements for implementing multiqueue support
=======================================================================

Intro: Kernel support for multiqueue devices
--------------------------------------------

Kernel support for multiqueue devices is always present.

Base drivers are required to use the alloc_etherdev_mq() or
alloc_netdev_mq() functions to allocate the subqueues for the device.  The
underlying kernel API takes care of allocating and freeing the subqueue
memory, and of recording in the netdev where the queues reside in memory.

The base driver will also need to manage the queues as it does the global
netdev->queue_lock today.  Therefore base drivers should use the
netif_{start|stop|wake}_subqueue() functions to manage each queue while the
device is still operational.  netdev->queue_lock is still used when the device
comes online or when it's completely shut down (unregister_netdev(), etc.).
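
The per-queue flow control described above can be illustrated with a small
userspace model.  This is only a sketch under stated assumptions, not kernel
code: ``struct model_dev`` and every ``model_*`` function are hypothetical
names invented for this example, and they merely mimic what a driver would do
with netif_stop_subqueue() and netif_wake_subqueue() when one Tx ring fills up
and later drains.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define MODEL_MAX_QUEUES 16

    /* Hypothetical userspace stand-in for a multiqueue device. */
    struct model_dev {
        unsigned int num_tx_queues;
        bool stopped[MODEL_MAX_QUEUES];           /* per-queue stopped flag */
        unsigned int free_desc[MODEL_MAX_QUEUES]; /* free Tx descriptors per ring */
    };

    /* Mimics netif_stop_subqueue(): only the given queue is stopped. */
    static void model_stop_subqueue(struct model_dev *dev, uint16_t q)
    {
        dev->stopped[q] = true;
    }

    /* Mimics netif_wake_subqueue(): only the given queue is restarted. */
    static void model_wake_subqueue(struct model_dev *dev, uint16_t q)
    {
        dev->stopped[q] = false;
    }

    /*
     * Transmit one packet on queue q.  Returns 0 on success, or -1 if the
     * queue index is invalid, the queue is stopped, or its ring is full.
     */
    static int model_xmit(struct model_dev *dev, uint16_t q)
    {
        if (q >= dev->num_tx_queues || dev->stopped[q] || dev->free_desc[q] == 0)
            return -1;

        dev->free_desc[q]--;

        /* Ring is now full: stop this queue, leave the others running. */
        if (dev->free_desc[q] == 0)
            model_stop_subqueue(dev, q);

        return 0;
    }

    /* Tx completion: reclaim n descriptors and wake the queue if it was stopped. */
    static void model_tx_complete(struct model_dev *dev, uint16_t q, unsigned int n)
    {
        dev->free_desc[q] += n;

        if (dev->stopped[q] && dev->free_desc[q] > 0)
            model_wake_subqueue(dev, q);
    }

In this model, filling the ring of queue 0 stops only queue 0; queue 1 keeps
accepting packets, which is the point of managing each subqueue separately.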


Section 2: Qdisc support for multiqueue devices
===============================================

Currently two qdiscs are optimized for multiqueue devices.  The first is the
default pfifo_fast qdisc.  This qdisc supports one qdisc per hardware queue.
A round-robin qdisc, sch_multiq, also supports multiple hardware queues.  The
qdisc is responsible for classifying the skbs and then directing them to
bands and queues based on the value in skb->queue_mapping.  Use this field in
the base driver to determine which queue to send the skb to.

sch_multiq has been added for hardware that wishes to avoid head-of-line
blocking.  It will cycle through the bands and verify that the hardware queue
associated with the band is not stopped prior to dequeuing a packet.
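
The round-robin behavior can be sketched as a userspace model.  This is an
illustration only, not the sch_multiq source: ``struct model_multiq`` and
``model_multiq_dequeue()`` are hypothetical names, packets are reduced to a
per-band counter, and a fixed band count of 4 is assumed.

.. code-block:: c

    #include <stdbool.h>

    #define MODEL_BANDS 4

    /* Hypothetical model of a multiq-style qdisc with one band per hardware queue. */
    struct model_multiq {
        unsigned int curband;              /* band that was served most recently */
        unsigned int backlog[MODEL_BANDS]; /* packets waiting in each band */
        bool hw_stopped[MODEL_BANDS];      /* is the matching hardware queue stopped? */
    };

    /*
     * Dequeue one packet.  Returns the band it came from, or -1 if every
     * band is either empty or backed by a stopped hardware queue.
     */
    static int model_multiq_dequeue(struct model_multiq *q)
    {
        for (unsigned int i = 0; i < MODEL_BANDS; i++) {
            /* Advance first, so consecutive calls rotate through the bands. */
            q->curband = (q->curband + 1) % MODEL_BANDS;

            /* A stopped hardware queue must not block the other bands. */
            if (q->hw_stopped[q->curband])
                continue;

            if (q->backlog[q->curband] == 0)
                continue;

            q->backlog[q->curband]--;
            return (int)q->curband;
        }

        return -1;
    }

Because a stopped band is skipped rather than waited on, packets queued for a
busy hardware queue do not delay packets destined for the other queues.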

On qdisc load, the number of bands is based on the number of queues on the
hardware.  Once the association is made, any skb with skb->queue_mapping set
will be queued to the band associated with the hardware queue.


Section 3: Brief howto using MULTIQ for multiqueue devices
==========================================================

The userspace command 'tc', part of the iproute2 package, is used to configure
qdiscs.  To add the MULTIQ qdisc to your network device, assuming the device
is called eth0, run the following command::

    # tc qdisc add dev eth0 root handle 1: multiq

The qdisc will allocate as many bands as the device reports queues, and then
bring the qdisc online.  Assuming eth0 has 4 Tx queues, the band mapping
would look like::

    band 0 => queue 0
    band 1 => queue 1
    band 2 => queue 2
    band 3 => queue 3

Traffic will begin flowing through each queue based on either the simple_tx_hash
function or based on netdev->select_queue() if you have it defined.
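
The choice between a hash-based default and a driver-provided hook can be
sketched in userspace as follows.  This does not reproduce the kernel's hash
or its selection code; all ``model_*`` names are hypothetical, the flow hash
is assumed to be computed elsewhere, the multiply-and-shift scaling is just
one common way to map a 32-bit hash onto a queue range, and the fallback to
queue 0 for an out-of-range hook result is an assumption of this example.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical hook type, standing in for netdev->select_queue(). */
    typedef uint16_t (*model_select_queue_t)(uint32_t flow_hash,
                                             uint16_t num_queues);

    /* Scale a 32-bit flow hash into [0, num_queues) by multiply-and-shift. */
    static uint16_t model_hash_to_queue(uint32_t flow_hash, uint16_t num_queues)
    {
        return (uint16_t)(((uint64_t)flow_hash * num_queues) >> 32);
    }

    /* Example hook: pin every flow to the last queue. */
    static uint16_t model_pin_to_last(uint32_t flow_hash, uint16_t num_queues)
    {
        (void)flow_hash;
        return (uint16_t)(num_queues - 1);
    }

    /* Use the hook when one is provided, otherwise fall back to the hash. */
    static uint16_t model_pick_tx_queue(uint32_t flow_hash, uint16_t num_queues,
                                        model_select_queue_t select_queue)
    {
        uint16_t q;

        if (num_queues <= 1)
            return 0;

        q = select_queue ? select_queue(flow_hash, num_queues)
                         : model_hash_to_queue(flow_hash, num_queues);

        /* Assumption of this sketch: an out-of-range result maps to queue 0. */
        return q < num_queues ? q : 0;
    }

With 4 queues and no hook, flows spread over queues 0 to 3 according to their
hash; passing ``model_pin_to_last`` as the hook sends every flow to queue 3.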

The behavior of tc filters remains the same.  However, a tc action, skbedit,
has been added.  To route all traffic to a specific host, for example
192.168.0.3, through a specific queue, you could use this action and
establish a filter such as::

    tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
        match ip dst 192.168.0.3 \
        action skbedit queue_mapping 3

:Author: Alexander Duyck <alexander.h.duyck@intel.com>
:Original Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>