.. SPDX-License-Identifier: GPL-2.0

=======================================
DSA switch configuration from userspace
=======================================

The DSA switch configuration is not yet integrated into the main userspace
network configuration suites and has to be performed manually.

.. _dsa-config-showcases:

Configuration showcases
-----------------------

Configuring a DSA switch requires a handful of commands. This document
presents some common configuration scenarios as showcases:

*single port*
  Every switch port acts as a separate configurable Ethernet port

*bridge*
  Every switch port is part of one configurable Ethernet bridge

*gateway*
  Every switch port except one upstream port is part of a configurable
  Ethernet bridge.
  The upstream port acts as a separate configurable Ethernet port.

All configurations are performed with tools from iproute2, which is available
at https://www.kernel.org/pub/linux/utils/net/iproute2/

Through DSA every port of a switch is handled like a normal Linux Ethernet
interface. The CPU port is the switch port connected to an Ethernet MAC chip.
The corresponding Linux Ethernet interface is called the master interface.
All other corresponding Linux interfaces are called slave interfaces.

The slave interfaces depend on the master interface being up in order for them
to send or receive traffic. Prior to kernel v5.12, the state of the master
interface had to be managed explicitly by the user. Starting with kernel v5.12,
the behavior is as follows:

- when a DSA slave interface is brought up, the master interface is
  automatically brought up.
- when the master interface is brought down, all DSA slave interfaces are
  automatically brought down.
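
A script that must work on both sides of the v5.12 boundary can decide at run
time whether the master interface has to be brought up explicitly. The
following is only a sketch: the helper name ``needs_manual_master_up`` is
hypothetical (it is not part of DSA or iproute2), and it assumes that the
kernel release string, as printed by ``uname -r``, starts with
``<major>.<minor>``.

  .. code-block:: sh

    # Hypothetical helper: succeed (exit status 0) if the kernel release
    # string passed in $1 is older than v5.12.
    needs_manual_master_up() {
        release=${1%%[!0-9.]*}    # drop any suffix such as "-rc1" or "+"
        major=${release%%.*}
        rest=${release#*.}
        minor=${rest%%.*}
        [ "$major" -lt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -lt 12 ]; }
    }

It could be used as ``needs_manual_master_up "$(uname -r)" && ip link set eth0
up`` before bringing up the slave interfaces. Bringing up the master
unconditionally is also harmless on newer kernels, which is the approach taken
by the showcases in this document.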

In this documentation the following Ethernet interfaces are used:

*eth0*
  the master interface

*eth1*
  another master interface

*lan1*
  a slave interface

*lan2*
  another slave interface

*lan3*
  a third slave interface

*wan*
  a slave interface dedicated to upstream traffic

Further Ethernet interfaces can be configured similarly.
The configured IPs and networks are:

*single port*
  * lan1: 192.0.2.1/30 (192.0.2.0 - 192.0.2.3)
  * lan2: 192.0.2.5/30 (192.0.2.4 - 192.0.2.7)
  * lan3: 192.0.2.9/30 (192.0.2.8 - 192.0.2.11)

*bridge*
  * br0: 192.0.2.129/25 (192.0.2.128 - 192.0.2.255)

*gateway*
  * br0: 192.0.2.129/25 (192.0.2.128 - 192.0.2.255)
  * wan: 192.0.2.1/30 (192.0.2.0 - 192.0.2.3)
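
The address ranges in parentheses follow from the prefix length: a /30 network
spans 4 addresses and a /25 network spans 128. As an illustration only, the
arithmetic can be reproduced in plain shell. The function name ``net_range``
is hypothetical, and the sketch assumes a well-formed IPv4 ``a.b.c.d/len``
argument and a shell with at least 64-bit arithmetic.

  .. code-block:: sh

    # Hypothetical helper: print the first and last address of the IPv4
    # network that contains the address/prefix given in $1.
    # The body runs in a subshell so that the IFS change stays local.
    net_range() (
        addr=${1%/*}
        len=${1#*/}
        IFS=.
        set -- $addr              # split the dotted quad into $1..$4
        ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
        mask=$(( (0xffffffff << (32 - $len)) & 0xffffffff ))
        first=$(( ip & mask ))
        last=$(( first | (mask ^ 0xffffffff) ))
        printf '%d.%d.%d.%d - %d.%d.%d.%d\n' \
            $(( (first >> 24) & 255 )) $(( (first >> 16) & 255 )) \
            $(( (first >> 8) & 255 )) $(( first & 255 )) \
            $(( (last >> 24) & 255 )) $(( (last >> 16) & 255 )) \
            $(( (last >> 8) & 255 )) $(( last & 255 ))
    )

For example, ``net_range 192.0.2.5/30`` prints ``192.0.2.4 - 192.0.2.7``, the
range listed for lan2 above.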

.. _dsa-tagged-configuration:

Configuration with tagging support
----------------------------------

Tagging-based configuration is the preferred mode and is supported by the
majority of DSA switches. These switches are capable of tagging incoming and
outgoing traffic without using a VLAN-based configuration.

*single port*
  .. code-block:: sh

    # configure each interface
    ip addr add 192.0.2.1/30 dev lan1
    ip addr add 192.0.2.5/30 dev lan2
    ip addr add 192.0.2.9/30 dev lan3

    # For kernels earlier than v5.12, the master interface needs to be
    # brought up manually before the slave ports.
    ip link set eth0 up

    # bring up the slave interfaces
    ip link set lan1 up
    ip link set lan2 up
    ip link set lan3 up

*bridge*
  .. code-block:: sh

    # For kernels earlier than v5.12, the master interface needs to be
    # brought up manually before the slave ports.
    ip link set eth0 up

    # bring up the slave interfaces
    ip link set lan1 up
    ip link set lan2 up
    ip link set lan3 up

    # create bridge
    ip link add name br0 type bridge

    # add ports to bridge
    ip link set dev lan1 master br0
    ip link set dev lan2 master br0
    ip link set dev lan3 master br0

    # configure the bridge
    ip addr add 192.0.2.129/25 dev br0

    # bring up the bridge
    ip link set dev br0 up

*gateway*
  .. code-block:: sh

    # For kernels earlier than v5.12, the master interface needs to be
    # brought up manually before the slave ports.
    ip link set eth0 up

    # bring up the slave interfaces
    ip link set wan up
    ip link set lan1 up
    ip link set lan2 up

    # configure the upstream port
    ip addr add 192.0.2.1/30 dev wan

    # create bridge
    ip link add name br0 type bridge

    # add ports to bridge
    ip link set dev lan1 master br0
    ip link set dev lan2 master br0

    # configure the bridge
    ip addr add 192.0.2.129/25 dev br0

    # bring up the bridge
    ip link set dev br0 up

.. _dsa-vlan-configuration:

Configuration without tagging support
-------------------------------------

A minority of switches are not capable of using a tagging protocol
(DSA_TAG_PROTO_NONE). These switches can be configured using a VLAN-based
configuration.

*single port*
  The configuration can only be set up via VLAN tagging and bridge setup.

  .. code-block:: sh

    # tag traffic on CPU port
    ip link add link eth0 name eth0.1 type vlan id 1
    ip link add link eth0 name eth0.2 type vlan id 2
    ip link add link eth0 name eth0.3 type vlan id 3

    # For kernels earlier than v5.12, the master interface needs to be
    # brought up manually before the slave ports.
    ip link set eth0 up
    ip link set eth0.1 up
    ip link set eth0.2 up
    ip link set eth0.3 up

    # bring up the slave interfaces
    ip link set lan1 up
    ip link set lan2 up
    ip link set lan3 up

    # create bridge
    ip link add name br0 type bridge

    # activate VLAN filtering
    ip link set dev br0 type bridge vlan_filtering 1

    # add ports to bridge
    ip link set dev lan1 master br0
    ip link set dev lan2 master br0
    ip link set dev lan3 master br0

    # tag traffic on ports
    bridge vlan add dev lan1 vid 1 pvid untagged
    bridge vlan add dev lan2 vid 2 pvid untagged
    bridge vlan add dev lan3 vid 3 pvid untagged

    # configure the VLANs
    ip addr add 192.0.2.1/30 dev eth0.1
    ip addr add 192.0.2.5/30 dev eth0.2
    ip addr add 192.0.2.9/30 dev eth0.3

    # bring up the bridge
    ip link set br0 up

*bridge*
  .. code-block:: sh

    # tag traffic on CPU port
    ip link add link eth0 name eth0.1 type vlan id 1

    # For kernels earlier than v5.12, the master interface needs to be
    # brought up manually before the slave ports.
    ip link set eth0 up
    ip link set eth0.1 up

    # bring up the slave interfaces
    ip link set lan1 up
    ip link set lan2 up
    ip link set lan3 up

    # create bridge
    ip link add name br0 type bridge

    # activate VLAN filtering
    ip link set dev br0 type bridge vlan_filtering 1

    # add ports to bridge
    ip link set dev lan1 master br0
    ip link set dev lan2 master br0
    ip link set dev lan3 master br0
    ip link set eth0.1 master br0

    # tag traffic on ports
    bridge vlan add dev lan1 vid 1 pvid untagged
    bridge vlan add dev lan2 vid 1 pvid untagged
    bridge vlan add dev lan3 vid 1 pvid untagged

    # configure the bridge
    ip addr add 192.0.2.129/25 dev br0

    # bring up the bridge
    ip link set dev br0 up

*gateway*
  .. code-block:: sh

    # tag traffic on CPU port
    ip link add link eth0 name eth0.1 type vlan id 1
    ip link add link eth0 name eth0.2 type vlan id 2

    # For kernels earlier than v5.12, the master interface needs to be
    # brought up manually before the slave ports.
    ip link set eth0 up
    ip link set eth0.1 up
    ip link set eth0.2 up

    # bring up the slave interfaces
    ip link set wan up
    ip link set lan1 up
    ip link set lan2 up

    # create bridge
    ip link add name br0 type bridge

    # activate VLAN filtering
    ip link set dev br0 type bridge vlan_filtering 1

    # add ports to bridge
    ip link set dev wan master br0
    ip link set eth0.1 master br0
    ip link set dev lan1 master br0
    ip link set dev lan2 master br0

    # tag traffic on ports
    bridge vlan add dev lan1 vid 1 pvid untagged
    bridge vlan add dev lan2 vid 1 pvid untagged
    bridge vlan add dev wan vid 2 pvid untagged

    # configure the VLANs
    ip addr add 192.0.2.1/30 dev eth0.2
    ip addr add 192.0.2.129/25 dev br0

    # bring up the bridge
    ip link set br0 up
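
In the *single port* showcase of this section, the same group of commands is
repeated for every user port with a different VLAN ID. For switches with more
ports, that repetition could be generated by a loop. The sketch below is
illustrative only: the function name ``print_single_port_vlans`` is
hypothetical, it only prints commands rather than running them, and it assumes
that the master interface is already up and that ``br0`` already exists with
``vlan_filtering`` enabled. IP addresses still have to be assigned separately.

  .. code-block:: sh

    # Hypothetical helper: $1 is the master interface, the remaining
    # arguments are user ports. Port N in the list gets VLAN ID N.
    # Nothing is executed; the commands are only printed for review.
    print_single_port_vlans() {
        master=$1
        shift
        vid=1
        for port in "$@"; do
            echo "ip link add link $master name $master.$vid type vlan id $vid"
            echo "ip link set $master.$vid up"
            echo "ip link set $port up"
            echo "ip link set dev $port master br0"
            echo "bridge vlan add dev $port vid $vid pvid untagged"
            vid=$((vid + 1))
        done
    }

Calling ``print_single_port_vlans eth0 lan1 lan2 lan3`` prints the per-port
commands of the showcase, grouped by port instead of by step.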

Forwarding database (FDB) management
------------------------------------

The existing DSA switches do not have the necessary hardware support to keep
the software FDB of the bridge in sync with the hardware tables, so the two
tables are managed separately (``bridge fdb show`` queries both, and depending
on whether the ``self`` or ``master`` flags are being used, a ``bridge fdb
add`` or ``bridge fdb del`` command acts upon entries from one or both tables).

Up until kernel v4.14, DSA only supported user space management of bridge FDB
entries using the bridge bypass operations (which do not update the software
FDB, just the hardware one) using the ``self`` flag (which is optional and can
be omitted).

  .. code-block:: sh

    bridge fdb add dev swp0 00:01:02:03:04:05 self static
    # or shorthand
    bridge fdb add dev swp0 00:01:02:03:04:05 static

Due to a bug, the bridge bypass FDB implementation provided by DSA did not
distinguish between ``static`` and ``local`` FDB entries (``static`` are meant
to be forwarded, while ``local`` are meant to be locally terminated, i.e. sent
to the host port). Instead, all FDB entries with the ``self`` flag (implicit or
explicit) are treated by DSA as ``static`` even if they are ``local``.

  .. code-block:: sh

    # This command:
    bridge fdb add dev swp0 00:01:02:03:04:05 static
    # behaves the same for DSA as this command:
    bridge fdb add dev swp0 00:01:02:03:04:05 local
    # or shorthand, because the 'local' flag is implicit if 'static' is not
    # specified, it also behaves the same as:
    bridge fdb add dev swp0 00:01:02:03:04:05

The last command is an incorrect way of adding a static bridge FDB entry to a
DSA switch using the bridge bypass operations, and works by mistake. Other
drivers will treat an FDB entry added by the same command as ``local`` and as
such, will not forward it, as opposed to DSA.

Between kernel v4.14 and v5.14, DSA supported two modes of adding a bridge FDB
entry to the switch in parallel: the bridge bypass discussed above, as well as
a new mode using the ``master`` flag, which installs FDB entries in the
software bridge too.

  .. code-block:: sh

    bridge fdb add dev swp0 00:01:02:03:04:05 master static

Since kernel v5.14, DSA has gained stronger integration with the bridge's
software FDB, and the support for its bridge bypass FDB implementation (using
the ``self`` flag) has been removed. This results in the following changes:

  .. code-block:: sh

    # This is the only valid way of adding an FDB entry that is supported,
    # compatible with v4.14 kernels and later:
    bridge fdb add dev swp0 00:01:02:03:04:05 master static
    # This command is no longer buggy and the entry is properly treated as
    # 'local' instead of being forwarded:
    bridge fdb add dev swp0 00:01:02:03:04:05
    # This command no longer installs a static FDB entry to hardware:
    bridge fdb add dev swp0 00:01:02:03:04:05 static

Script writers are therefore encouraged to use the ``master static`` set of
flags when working with bridge FDB entries on DSA switch interfaces.
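
One way to follow this recommendation consistently is to route all FDB changes
in a script through small wrappers that always pass ``master static``. The
following is a sketch under assumptions: the function names ``dsa_fdb_add``
and ``dsa_fdb_del`` and the ``BRIDGE`` override variable are hypothetical and
exist only for this example.

  .. code-block:: sh

    # Hypothetical wrappers: $1 is the DSA user port, $2 the MAC address.
    # Both always use the "master static" flags recommended above.
    # Set BRIDGE="echo bridge" for a dry run that only prints the command.
    dsa_fdb_add() {
        ${BRIDGE:-bridge} fdb add dev "$1" "$2" master static
    }

    dsa_fdb_del() {
        ${BRIDGE:-bridge} fdb del dev "$1" "$2" master static
    }

With ``BRIDGE="echo bridge"`` set, ``dsa_fdb_add swp0 00:01:02:03:04:05``
prints ``bridge fdb add dev swp0 00:01:02:03:04:05 master static`` instead of
modifying the FDB.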

Affinity of user ports to CPU ports
-----------------------------------

Typically, DSA switches are attached to the host via a single Ethernet
interface, but in cases where the switch chip is discrete, the hardware design
may permit the use of 2 or more ports connected to the host, for an increase in
termination throughput.

DSA can make use of multiple CPU ports in two ways. First, it is possible to
statically assign the termination traffic associated with a certain user port
to be processed by a certain CPU port. This way, user space can implement
custom policies of static load balancing between user ports, by spreading the
affinities according to the available CPU ports.

Secondly, it is possible to perform load balancing between CPU ports on a per
packet basis, rather than statically assigning user ports to CPU ports.
This can be achieved by placing the DSA masters under a LAG interface (bonding
or team). DSA monitors this operation and creates a mirror of this software LAG
on the CPU ports facing the physical DSA masters that constitute the LAG slave
devices.

To make use of multiple CPU ports, the firmware (device tree) description of
the switch must mark all the links between CPU ports and their DSA masters
using the ``ethernet`` reference/phandle. At startup, only a single CPU port
and DSA master will be used - the numerically first port from the firmware
description which has an ``ethernet`` property. It is up to the user to
configure the system for the switch to use other masters.

DSA uses the ``rtnl_link_ops`` mechanism (with a "dsa" ``kind``) to allow
changing the DSA master of a user port. The ``IFLA_DSA_MASTER`` u32 netlink
attribute contains the ifindex of the master device that handles each slave
device. The DSA master must be a valid candidate based on firmware node
information, or a LAG interface which contains only slaves which are valid
candidates.

Using iproute2, the following manipulations are possible:

  .. code-block:: sh

    # See the DSA master in current use
    ip -d link show dev swp0
        (...)
        dsa master eth0

    # Static CPU port distribution
    ip link set swp0 type dsa master eth1
    ip link set swp1 type dsa master eth0
    ip link set swp2 type dsa master eth1
    ip link set swp3 type dsa master eth0

    # CPU ports in LAG, using explicit assignment of the DSA master
    ip link add bond0 type bond mode balance-xor && ip link set bond0 up
    ip link set eth1 down && ip link set eth1 master bond0
    ip link set swp0 type dsa master bond0
    ip link set swp1 type dsa master bond0
    ip link set swp2 type dsa master bond0
    ip link set swp3 type dsa master bond0
    ip link set eth0 down && ip link set eth0 master bond0
    ip -d link show dev swp0
        (...)
        dsa master bond0

    # CPU ports in LAG, relying on implicit migration of the DSA master
    ip link add bond0 type bond mode balance-xor && ip link set bond0 up
    ip link set eth0 down && ip link set eth0 master bond0
    ip link set eth1 down && ip link set eth1 master bond0
    ip -d link show dev swp0
        (...)
        dsa master bond0

Notice that in the case of CPU ports under a LAG, the use of the
``IFLA_DSA_MASTER`` netlink attribute is not strictly needed, but rather, DSA
reacts to the ``IFLA_MASTER`` attribute change of its present master (``eth0``)
and migrates all user ports to the new upper of ``eth0``, ``bond0``. Similarly,
when ``bond0`` is destroyed using ``RTM_DELLINK``, DSA migrates the user ports
that were assigned to this interface to the first physical DSA master which is
eligible, based on the firmware description (it effectively reverts to the
startup configuration).

In a setup with more than 2 physical CPU ports, it is therefore possible to mix
static user to CPU port assignment with LAG between DSA masters. It is not
possible to statically assign a user port towards a DSA master that has any
upper interfaces (this includes LAG devices - the master must always be the LAG
in this case).

Live changing of the DSA master (and thus CPU port) affinity of a user port is
permitted, in order to allow dynamic redistribution in response to traffic.
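
As an illustration of a static load balancing policy, user ports could be
spread over the available DSA masters in round-robin order. This is a sketch
only: the function name ``print_master_affinity`` is hypothetical, it prints
the ``ip link`` commands rather than running them, and it assumes that every
listed master is a valid candidate with no upper interfaces.

  .. code-block:: sh

    # Hypothetical helper: $1 is a space-separated list of DSA masters,
    # $2 a space-separated list of user ports. Prints one command per
    # user port, cycling through the masters in order.
    print_master_affinity() {
        ports=$2
        set -- $1                       # $1..$# are now the masters
        [ $# -gt 0 ] || return 1
        for port in $ports; do
            echo "ip link set $port type dsa master $1"
            first=$1
            shift
            set -- "$@" "$first"        # rotate the list of masters
        done
    }

For instance, ``print_master_affinity "eth1 eth0" "swp0 swp1 swp2 swp3"``
prints the four commands of the static CPU port distribution shown earlier in
this section. The output can be reviewed and then piped to ``sh`` as root.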

Physical DSA masters are allowed to join and leave a LAG interface used as a
DSA master at any time; however, DSA will reject a LAG interface as a valid
candidate for being a DSA master unless it has at least one physical DSA master
as a slave device.
459