============
Architecture
============

This document describes the **Distributed Switch Architecture (DSA)** subsystem:
its design principles, limitations, and interactions with other subsystems. It
also explains how to develop drivers for this subsystem and includes a TODO for
developers interested in joining the effort.

Design principles
=================

The Distributed Switch Architecture is a subsystem which was primarily designed
to support Marvell Ethernet switches (MV88E6xxx, a.k.a. Linkstreet product line)
using Linux, but has since evolved to support other vendors as well.

The original philosophy behind this design was to allow unmodified Linux tools
such as bridge, iproute2 and ifconfig to work transparently, whether they
configure/query a switch port network device or a regular network device.

An Ethernet switch typically comprises multiple front-panel ports and one or
more CPU or management ports. The DSA subsystem currently relies on the
presence of a management port connected to an Ethernet controller capable of
receiving Ethernet frames from the switch. This is a very common setup for all
kinds of Ethernet switches found in Small Home and Office products: routers,
gateways, or even top-of-rack switches. This host Ethernet controller will
later be referred to as "master" and "cpu" in DSA terminology and code.

The D in DSA stands for Distributed, because the subsystem has been designed
with the ability to configure and manage cascaded switches on top of each other
using upstream and downstream Ethernet links between switches. These specific
ports are referred to as "dsa" ports in DSA terminology and code. A collection
of multiple switches connected to each other is called a "switch tree".

For each front-panel port, DSA creates a specialized network device which is
used as a controlling and data-flowing endpoint by the Linux networking
stack. These specialized network interfaces are referred to as "slave" network
interfaces in DSA terminology and code.

The ideal case for using DSA is when an Ethernet switch supports a "switch tag",
which is a hardware feature making the switch insert a specific tag into each
Ethernet frame it receives from, or sends to, specific ports to help the
management interface figure out:

- what port is this frame coming from
- what was the reason why this frame got forwarded
- how to send CPU originated traffic to specific ports

The subsystem does support switches not capable of inserting/stripping tags, but
the features might be slightly limited in that case (traffic separation relies
on Port-based VLAN IDs).

Note that DSA does not currently create network interfaces for the "cpu" and
"dsa" ports because:

- the "cpu" port is the Ethernet switch facing side of the management
  controller, and as such, would duplicate functionality, since you
  would get two interfaces for the same conduit: master netdev, and "cpu" netdev

- the "dsa" port(s) are just conduits between two or more switches, and as such
  cannot really be used as proper network interfaces either; only the
  downstream, or the top-most upstream interface makes sense with that model

Switch tagging protocols
------------------------

DSA currently supports 5 different tagging protocols, and a tag-less mode as
well. The different protocols are implemented in:

- ``net/dsa/tag_trailer.c``: Marvell's 4-byte trailer tag mode (legacy)
- ``net/dsa/tag_dsa.c``: Marvell's original DSA tag
- ``net/dsa/tag_edsa.c``: Marvell's enhanced DSA tag
- ``net/dsa/tag_brcm.c``: Broadcom's 4-byte tag
- ``net/dsa/tag_qca.c``: Qualcomm's 2-byte tag

The exact format of the tag protocol is vendor specific, but in general, they
all contain something which:

- identifies which port the Ethernet frame came from/should be sent to
- provides a reason why this frame was forwarded to the management interface
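
As a rough illustration of the two fields listed above, the following sketch
packs and unpacks a made-up 4-byte tag. The layout, the ``demo_`` names and the
reason codes are assumptions chosen for this example; they do not describe any
vendor's actual tag format.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /*
     * Hypothetical tag layout (NOT a real vendor format):
     *   bits 31..24  reason code (why the switch forwarded the frame)
     *   bits 23..16  port number (source port on RX, destination on TX)
     *   bits 15..0   reserved, must be zero
     */
    enum demo_reason {
        DEMO_REASON_FORWARD = 0,
        DEMO_REASON_TRAP    = 1,
        DEMO_REASON_MIRROR  = 2,
    };

    struct demo_tag {
        uint8_t reason;
        uint8_t port;
    };

    /* Pack the two fields into the 32-bit on-wire value. */
    static uint32_t demo_tag_encode(struct demo_tag t)
    {
        return ((uint32_t)t.reason << 24) | ((uint32_t)t.port << 16);
    }

    /* Unpack a 32-bit tag; returns -1 if the reserved bits are set. */
    static int demo_tag_decode(uint32_t raw, struct demo_tag *out)
    {
        assert(out != NULL);
        if (raw & 0xffffu)
            return -1;
        out->reason = (uint8_t)(raw >> 24);
        out->port = (uint8_t)((raw >> 16) & 0xff);
        return 0;
    }

A real tagging driver carries more information than this and has to follow
the byte order and bit positions documented by the switch vendor.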

Master network devices
----------------------

Master network devices are regular, unmodified Linux network device drivers for
the CPU/management Ethernet interface. Such a driver might occasionally need to
know whether DSA is enabled (e.g.: to enable/disable specific offload features),
but the DSA subsystem has been proven to work with industry standard drivers:
``e1000e``, ``mv643xx_eth`` etc. without having to introduce modifications to these
drivers. Such network devices are also often referred to as conduit network
devices since they act as a pipe between the host processor and the hardware
Ethernet switch.

Networking stack hooks
----------------------

When a master netdev is used with DSA, a small hook is placed in the
networking stack in order to have the DSA subsystem process the Ethernet
switch specific tagging protocol. DSA accomplishes this by registering a
specific (and fake) Ethernet type (later becoming ``skb->protocol``) with the
networking stack; this is also known as a ``ptype`` or ``packet_type``. A typical
Ethernet frame receive sequence looks like this:

Master network device (e.g.: e1000e):

1. Receive interrupt fires:

        - receive function is invoked
        - basic packet processing is done: getting length, status etc.
        - packet is prepared to be processed by the Ethernet layer by calling
          ``eth_type_trans``

2. net/ethernet/eth.c::

          eth_type_trans(skb, dev)
                  if (dev->dsa_ptr != NULL)
                          -> skb->protocol = ETH_P_XDSA

3. drivers/net/ethernet/\*::

          netif_receive_skb(skb)
                  -> iterate over registered packet_type
                          -> invoke handler for ETH_P_XDSA, calls dsa_switch_rcv()

4. net/dsa/dsa.c::

          -> dsa_switch_rcv()
                  -> invoke switch tag specific protocol handler in 'net/dsa/tag_*.c'

5. net/dsa/tag_*.c:

        - inspect and strip switch tag protocol to determine originating port
        - locate per-port network device
        - invoke ``eth_type_trans()`` with the DSA slave network device
        - invoke ``netif_receive_skb()``

Past this point, the DSA slave network devices get delivered regular Ethernet
frames that can be processed by the networking stack.
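
Step 5 of the sequence above can be modelled in plain userspace C. This is only
a sketch under stated assumptions: the tag is taken to be 4 bytes long, placed
right after the two MAC addresses, with the source port in its second byte. The
``demo_`` names are hypothetical and the code does not use ``struct sk_buff``
or any kernel API.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define DEMO_MAC_LEN   12  /* destination + source MAC addresses */
    #define DEMO_TAG_LEN   4   /* hypothetical tag size */
    #define DEMO_NUM_PORTS 3

    /* Stand-in for a per-port (slave) network device. */
    struct demo_port_dev {
        const char *name;
        unsigned long rx_frames;
    };

    static struct demo_port_dev demo_ports[DEMO_NUM_PORTS] = {
        { "sw0p0", 0 }, { "sw0p1", 0 }, { "sw0p2", 0 },
    };

    /*
     * Inspect the tag, locate the per-port device, then strip the tag so the
     * frame looks like regular Ethernet again. Returns NULL for frames that
     * are too short or that name a port which does not exist.
     */
    static struct demo_port_dev *demo_tag_rcv(uint8_t *frame, size_t *len)
    {
        uint8_t port;

        assert(frame != NULL && len != NULL);
        if (*len < DEMO_MAC_LEN + DEMO_TAG_LEN + 2)
            return NULL;

        port = frame[DEMO_MAC_LEN + 1];  /* assumed position of the port field */
        if (port >= DEMO_NUM_PORTS)
            return NULL;

        /* Close the gap left by the tag: move EtherType and payload up. */
        memmove(frame + DEMO_MAC_LEN, frame + DEMO_MAC_LEN + DEMO_TAG_LEN,
                *len - DEMO_MAC_LEN - DEMO_TAG_LEN);
        *len -= DEMO_TAG_LEN;

        demo_ports[port].rx_frames++;
        return &demo_ports[port];
    }

In the kernel, the equivalent handler would then hand the untagged frame to
the networking stack on behalf of the device it located.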

Slave network devices
---------------------

Slave network devices created by DSA are stacked on top of their master network
device. Each of these network interfaces is responsible for being a
controlling and data-flowing end-point for one front-panel port of the switch.
These interfaces are specialized in order to:

- insert/remove the switch tag protocol (if it exists) when sending traffic
  to/from specific switch ports
- query the switch for ethtool operations: statistics, link state,
  Wake-on-LAN, register dumps...
- external/internal PHY management: link, auto-negotiation etc.

These slave network devices have custom ``net_device_ops`` and ``ethtool_ops``
function pointers which allow DSA to introduce a level of layering between the
networking stack/ethtool, and the switch driver implementation.

Upon frame transmission from these slave network devices, DSA will look up which
switch tagging protocol is currently registered with these network devices, and
invoke a specific transmit routine which takes care of adding the relevant
switch tag in the Ethernet frames.

These frames are then queued for transmission using the master network device
``ndo_start_xmit()`` function. Since they contain the appropriate switch tag, the
Ethernet switch will be able to process these incoming frames from the
management interface and deliver them to the physical switch port.
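
The layering on the transmit side can be sketched as follows. This is a
userspace model, not kernel code: ``demo_master.xmit`` merely stands in for the
master's ``ndo_start_xmit()``, and the 4-byte tag placed after the MAC
addresses is the same made-up layout assumed purely for illustration.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define DEMO_MAC_LEN   12
    #define DEMO_TAG_LEN   4    /* hypothetical tag size */
    #define DEMO_MAX_FRAME 1600

    struct demo_frame {
        uint8_t data[DEMO_MAX_FRAME];
        size_t len;
    };

    struct demo_master {
        /* stands in for the master device's ndo_start_xmit() */
        int (*xmit)(struct demo_master *m, const struct demo_frame *f);
        struct demo_frame last_sent;  /* what went out on the wire */
    };

    struct demo_slave {
        uint8_t port;                 /* front-panel port this device models */
        struct demo_master *master;
        int (*tag_xmit)(struct demo_frame *f, uint8_t port);
    };

    /* Tagger: open a 4-byte gap after the MAC addresses and fill in the port. */
    static int demo_tag_xmit(struct demo_frame *f, uint8_t port)
    {
        if (f->len < DEMO_MAC_LEN || f->len + DEMO_TAG_LEN > DEMO_MAX_FRAME)
            return -1;
        memmove(f->data + DEMO_MAC_LEN + DEMO_TAG_LEN, f->data + DEMO_MAC_LEN,
                f->len - DEMO_MAC_LEN);
        memset(f->data + DEMO_MAC_LEN, 0, DEMO_TAG_LEN);
        f->data[DEMO_MAC_LEN + 1] = port;
        f->len += DEMO_TAG_LEN;
        return 0;
    }

    /* Master "driver": just records the frame it was asked to send. */
    static int demo_master_xmit(struct demo_master *m, const struct demo_frame *f)
    {
        m->last_sent = *f;
        return 0;
    }

    /* Slave transmit: tag first, then queue on the master device. */
    static int demo_slave_xmit(struct demo_slave *s, struct demo_frame *f)
    {
        assert(s != NULL && s->master != NULL && s->tag_xmit != NULL);
        if (s->tag_xmit(f, s->port) != 0)
            return -1;
        return s->master->xmit(s->master, f);
    }

The master never needs to know about the tag, which is the reason unmodified
Ethernet drivers can be used in that role.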

Graphical representation
------------------------

Summarized, this is basically what DSA looks like from a network device
perspective::


                |---------------------------
                | CPU network device (eth0)|
                ----------------------------
                | <tag added by switch     |
                |                          |
                |                          |
                |        tag added by CPU> |
        |--------------------------------------------|
        |            Switch driver                   |
        |--------------------------------------------|
                  ||        ||         ||
              |-------|  |-------|  |-------|
              | sw0p0 |  | sw0p1 |  | sw0p2 |
              |-------|  |-------|  |-------|

Slave MDIO bus
--------------

In order to be able to read from/write to a PHY built into the switch, DSA
creates a slave MDIO bus which allows a specific switch driver to divert and
intercept MDIO reads/writes towards specific PHY addresses. In most
MDIO-connected switches, these functions would utilize direct or indirect PHY
addressing mode to return standard MII registers from the switch builtin PHYs,
allowing the PHY library to return link status, link partner pages,
auto-negotiation results etc.

For Ethernet switches which have both external and internal MDIO buses, the
slave MII bus can be utilized to mux/demux MDIO reads and writes towards either
internal or external MDIO devices this switch might be connected to: internal
PHYs, external PHYs, or even external switches.
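
The mux/demux idea can be sketched as a read routine that serves built-in PHY
addresses from the switch itself and forwards everything else to an external
bus. All names here (``demo_switch``, ``internal_phy_mask``,
``demo_external_read``) are invented for this example, and the register arrays
only simulate hardware.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    #define DEMO_NUM_INTERNAL_PHYS 4
    #define DEMO_NUM_REGS          32
    #define DEMO_NUM_ADDRS         32

    struct demo_switch {
        uint32_t internal_phy_mask;  /* bit N set: PHY address N is built in */
        uint16_t internal_regs[DEMO_NUM_INTERNAL_PHYS][DEMO_NUM_REGS];
        int (*external_read)(int addr, int reg);  /* NULL: no external bus */
    };

    /* Pretend external bus with a single device at address 8. */
    static int demo_external_read(int addr, int reg)
    {
        (void)reg;
        return addr == 8 ? 0x1234 : 0xffff;
    }

    /* Slave bus read: route by PHY address, 0xffff when nothing answers. */
    static int demo_slave_mii_read(const struct demo_switch *sw, int addr, int reg)
    {
        assert(sw != NULL);
        if (addr < 0 || addr >= DEMO_NUM_ADDRS || reg < 0 || reg >= DEMO_NUM_REGS)
            return 0xffff;
        if (addr < DEMO_NUM_INTERNAL_PHYS && (sw->internal_phy_mask & (1u << addr)))
            return sw->internal_regs[addr][reg];
        if (sw->external_read != NULL)
            return sw->external_read(addr, reg);
        return 0xffff;
    }

Returning ``0xffff`` for an address nobody answers mimics what an idle MDIO
bus reads back, so a PHY probe on that address simply finds no device.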

Data structures
---------------

DSA data structures are defined in ``include/net/dsa.h`` as well as
``net/dsa/dsa_priv.h``:

- ``dsa_chip_data``: platform data configuration for a given switch device,
  this structure describes a switch device's parent device, its address, as
  well as various properties of its ports: names/labels, and finally a routing
  table indication (when cascading switches)

- ``dsa_platform_data``: platform device configuration data which can reference
  a collection of ``dsa_chip_data`` structures if multiple switches are cascaded;
  the master network device this switch tree is attached to needs to be
  referenced

- ``dsa_switch_tree``: structure assigned to the master network device under
  ``dsa_ptr``, this structure references a ``dsa_platform_data`` structure as well
  as the tagging protocol supported by the switch tree, and which receive/transmit
  function hooks should be invoked. Information about the directly attached
  switch is also provided: CPU port. Finally, a collection of ``dsa_switch``
  structures is referenced to address individual switches in the tree.

- ``dsa_switch``: structure describing a switch device in the tree, referencing
  a ``dsa_switch_tree`` as a backpointer, slave network devices, master network
  device, and a reference to the backing ``dsa_switch_ops``

- ``dsa_switch_ops``: structure referencing function pointers, see below for a
  full description.

Design limitations
==================

Limits on the number of devices and ports
-----------------------------------------

DSA currently limits the maximum number of switches within a tree to 4
(``DSA_MAX_SWITCHES``), and the number of ports per switch to 12 (``DSA_MAX_PORTS``).
These limits could be extended to support larger configurations should the need
arise.

Lack of CPU/DSA network devices
-------------------------------

DSA does not currently create slave network devices for the CPU or DSA ports, as
described before. This might be an issue in the following cases:

- inability to fetch switch CPU port statistics counters using ethtool, which
  can make it harder to debug MDIO switches connected using xMII interfaces

- inability to configure the CPU port link parameters based on the Ethernet
  controller capabilities attached to it: http://patchwork.ozlabs.org/patch/509806/

- inability to configure specific VLAN IDs / trunking VLANs between switches
  when using a cascaded setup

Common pitfalls using DSA setups
--------------------------------

Once a master network device is configured to use DSA (``dev->dsa_ptr`` becomes
non-NULL), and the switch behind it expects a tagging protocol, this network
interface can only be used as a conduit interface. Sending packets
directly through this interface (e.g.: opening a socket using this interface)
will not make us go through the switch tagging protocol transmit function, so
the Ethernet switch on the other end, expecting a tag, will typically drop this
frame.

Slave network devices check that the master network device is UP before allowing
you to administratively bring UP these slave network devices. A common
configuration mistake is forgetting to bring UP the master network device first.
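
The ordering constraint can be pictured with a small model of the open path.
The structure, the function name and the choice of ``-ENETDOWN`` as the error
code are assumptions made for this sketch; only the rule itself (master must
be UP first) comes from the text above.

.. code-block:: c

    #include <assert.h>
    #include <errno.h>
    #include <stdbool.h>
    #include <stddef.h>

    struct demo_netdev {
        const char *name;
        bool up;
        struct demo_netdev *master;  /* NULL for the master itself */
    };

    /* Bringing a slave UP is refused while its master is still down. */
    static int demo_slave_open(struct demo_netdev *slave)
    {
        assert(slave != NULL && slave->master != NULL);
        if (!slave->master->up)
            return -ENETDOWN;  /* assumed error code for this example */
        slave->up = true;
        return 0;
    }

In practice this means bringing up the master interface (``eth0`` in the
diagram above) before any of the ``sw0pN`` interfaces.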

Interactions with other subsystems
==================================

DSA currently leverages the following subsystems:

- MDIO/PHY library: ``drivers/net/phy/phy.c``, ``mdio_bus.c``
- Switchdev: ``net/switchdev/*``
- Device Tree for various of_* functions

MDIO/PHY library
----------------

Slave network devices exposed by DSA may or may not be interfacing with PHY
devices (``struct phy_device`` as defined in ``include/linux/phy.h``), but the DSA
subsystem deals with all possible combinations:

- internal PHY devices, built into the Ethernet switch hardware
- external PHY devices, connected via an internal or external MDIO bus
- internal PHY devices, connected via an internal MDIO bus
- special, non-autonegotiated or non MDIO-managed PHY devices: SFPs, MoCA; a.k.a.
  fixed PHYs

The PHY configuration is done by the ``dsa_slave_phy_setup()`` function and the
logic basically looks like this:

- if Device Tree is used, the PHY device is looked up using the standard
  "phy-handle" property; if found, this PHY device is created and registered
  using ``of_phy_connect()``

- if Device Tree is used, and the PHY device is "fixed", that is, conforms to
  the definition of a non-MDIO managed PHY as defined in
  ``Documentation/devicetree/bindings/net/fixed-link.txt``, the PHY is registered
  and connected transparently using the special fixed MDIO bus driver

- finally, if the PHY is built into the switch, as is very common with
  standalone switch packages, the PHY is probed using the slave MII bus created
  by DSA

SWITCHDEV
---------

DSA directly utilizes SWITCHDEV when interfacing with the bridge layer, and
more specifically with its VLAN filtering portion when configuring VLANs on top
of per-port slave network devices. Since DSA primarily deals with
MDIO-connected switches, although not exclusively, SWITCHDEV's
prepare/abort/commit phases are often simplified into a prepare phase which
checks whether the operation is supported by the DSA switch driver, and a commit
phase which applies the changes.

As of today, the only SWITCHDEV objects supported by DSA are the FDB and VLAN
objects.
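
The simplified prepare/commit split described above can be sketched for a VLAN
add operation. This is a standalone model under stated assumptions: the
``demo_vlan_table`` structure, its capacity and the error codes are invented
for the example and do not reproduce the SWITCHDEV API.

.. code-block:: c

    #include <assert.h>
    #include <errno.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define DEMO_MAX_VLANS 4  /* arbitrary capacity for this example */

    struct demo_vlan_table {
        bool hw_supports_vlan;  /* does the (pretend) driver offer VLAN ops? */
        uint16_t vids[DEMO_MAX_VLANS];
        int count;
    };

    /* Prepare: only check that the operation can succeed, change nothing. */
    static int demo_vlan_prepare(const struct demo_vlan_table *t, uint16_t vid)
    {
        assert(t != NULL);
        if (!t->hw_supports_vlan)
            return -EOPNOTSUPP;
        if (vid == 0 || vid > 4094)
            return -EINVAL;
        if (t->count >= DEMO_MAX_VLANS)
            return -ENOSPC;
        return 0;
    }

    /* Commit: apply the change; must not fail once prepare has succeeded. */
    static void demo_vlan_commit(struct demo_vlan_table *t, uint16_t vid)
    {
        t->vids[t->count++] = vid;
    }

    static int demo_vlan_add(struct demo_vlan_table *t, uint16_t vid)
    {
        int err = demo_vlan_prepare(t, vid);

        if (err)
            return err;
        demo_vlan_commit(t, vid);
        return 0;
    }

Keeping every check in the prepare step means a rejected request leaves the
table untouched, so no separate abort step is needed in this simplified form.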

Device Tree
-----------

DSA features a standardized binding which is documented in
``Documentation/devicetree/bindings/net/dsa/dsa.txt``. PHY/MDIO library helper
functions such as ``of_get_phy_mode()``, ``of_phy_connect()`` are also used to query
per-port PHY specific details: interface connection, MDIO bus location etc.

Driver development
==================

DSA switch drivers need to implement a ``dsa_switch_ops`` structure which will
contain the various members described below.

``register_switch_driver()`` registers this ``dsa_switch_ops`` in its internal list
of drivers to probe for. ``unregister_switch_driver()`` does the exact opposite.

Unless requested differently by setting the ``priv_size`` member accordingly, DSA
does not allocate any driver private context space.

Switch configuration
--------------------

- ``tag_protocol``: this is to indicate what kind of tagging protocol is supported,
  should be a valid value from the ``dsa_tag_protocol`` enum

- ``probe``: probe routine which will be invoked by the DSA platform device upon
  registration to test for the presence/absence of a switch device. For MDIO
  devices, it is recommended to issue a read towards internal registers using
  the switch pseudo-PHY and return whether this is a supported device. For other
  buses, return a non-NULL string

- ``setup``: setup function for the switch, this function is responsible for setting
  up the ``dsa_switch_ops`` private structure with all it needs: register maps,
  interrupts, mutexes, locks etc. This function is also expected to properly
  configure the switch to separate all network interfaces from each other, that
  is, they should be isolated by the switch hardware itself, typically by creating
  a Port-based VLAN ID for each port and allowing only the CPU port and the
  specific port to be in the forwarding vector. Ports that are unused by the
  platform should be disabled. Past this function, the switch is expected to be
  fully configured and ready to serve any kind of request. It is recommended
  to issue a software reset of the switch during this setup function in order to
  avoid relying on what a previous software agent such as a bootloader/firmware
  may have previously configured.
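
The port isolation expected from ``setup`` can be expressed as a per-port
forwarding vector. The helper below is a hypothetical sketch: the function
name, the bitmask representation and the choice to let the CPU port reach
every enabled port are assumptions, not something mandated by DSA.

.. code-block:: c

    #include <assert.h>
    #include <stdint.h>

    #define DEMO_NUM_PORTS 6

    /*
     * Forwarding vector for one port as a bitmask (bit N = port N):
     *   - unused ports get an empty vector (port disabled)
     *   - a user port may only talk to itself and the CPU port
     *   - the CPU port may reach every enabled port (assumption)
     */
    static uint32_t demo_port_fwd_mask(int port, int cpu_port,
                                       uint32_t enabled_ports)
    {
        uint32_t port_bit, cpu_bit;

        assert(port >= 0 && port < DEMO_NUM_PORTS);
        assert(cpu_port >= 0 && cpu_port < DEMO_NUM_PORTS);

        port_bit = 1u << port;
        cpu_bit = 1u << cpu_port;

        if (!(enabled_ports & port_bit))
            return 0;
        if (port == cpu_port)
            return enabled_ports;
        return port_bit | cpu_bit;
    }

With ports 0 to 2 in use and port 5 as the CPU port, port 1 gets the vector
``0x22`` (itself plus the CPU port), so it cannot forward to ports 0 or 2
until a bridge explicitly joins them.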

PHY devices and link management
-------------------------------

- ``get_phy_flags``: Some switches are interfaced to various kinds of Ethernet PHYs,
  if the PHY library PHY driver needs to know about information it cannot obtain
  on its own (e.g.: coming from switch memory mapped registers), this function
  should return a 32-bit bitmask of "flags" that is private between the switch
  driver and the Ethernet PHY driver in ``drivers/net/phy/*``.

- ``phy_read``: Function invoked by the DSA slave MDIO bus when attempting to read
  the switch port MDIO registers. If unavailable, return 0xffff for each read.
  For builtin switch Ethernet PHYs, this function should allow reading the link
  status, auto-negotiation results, link partner pages etc.

- ``phy_write``: Function invoked by the DSA slave MDIO bus when attempting to write
  to the switch port MDIO registers. If unavailable, return a negative error
  code.

- ``adjust_link``: Function invoked by the PHY library when a slave network device
  is attached to a PHY device. This function is responsible for appropriately
  configuring the switch port link parameters: speed, duplex, pause based on
  what the ``phy_device`` is providing.

- ``fixed_link_update``: Function invoked by the PHY library, and specifically by
  the fixed PHY driver asking the switch driver for link parameters that could
  not be auto-negotiated, or obtained by reading the PHY registers through MDIO.
  This is particularly useful for specific kinds of hardware such as QSGMII,
  MoCA or other kinds of non-MDIO managed PHYs where out of band link
  information is obtained

Ethtool operations
------------------

- ``get_strings``: ethtool function used to query the driver's strings, will
  typically return statistics strings, private flags strings etc.

- ``get_ethtool_stats``: ethtool function used to query per-port statistics and
  return their values. DSA overlays slave network devices general statistics:
  RX/TX counters from the network device, with switch driver specific statistics
  per port

- ``get_sset_count``: ethtool function used to query the number of statistics items

- ``get_wol``: ethtool function used to obtain Wake-on-LAN settings per-port, this
  function may, for certain implementations, also query the master network device
  Wake-on-LAN settings if this interface needs to participate in Wake-on-LAN

- ``set_wol``: ethtool function used to configure Wake-on-LAN settings per-port,
  direct counterpart to ``get_wol`` with similar restrictions

- ``set_eee``: ethtool function which is used to configure a switch port EEE (Green
  Ethernet) settings, can optionally invoke the PHY library to enable EEE at the
  PHY level if relevant. This function should enable EEE at the switch port MAC
  controller and data-processing logic

- ``get_eee``: ethtool function which is used to query a switch port EEE settings,
  this function should return the EEE state of the switch port MAC controller
  and data-processing logic as well as query the PHY for its currently configured
  EEE settings

- ``get_eeprom_len``: ethtool function returning for a given switch the EEPROM
  length/size in bytes

- ``get_eeprom``: ethtool function returning for a given switch the EEPROM contents

- ``set_eeprom``: ethtool function writing specified data to a given switch EEPROM

- ``get_regs_len``: ethtool function returning the register length for a given
  switch

- ``get_regs``: ethtool function returning the Ethernet switch internal register
  contents. This function might require user-land code in ethtool to
  pretty-print register values and registers
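
The three statistics callbacks work together: the count sizes the buffers, the
strings name each slot, and the stats fill the slots in the same order. The
sketch below models that contract in userspace. The counter names, the
``demo_`` functions and their simplified signatures are assumptions and do not
match the real ``dsa_switch_ops`` prototypes.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define DEMO_GSTRING_LEN 32  /* fixed width of one name slot */

    /* Hypothetical per-port hardware counters. */
    struct demo_port_counters {
        uint64_t rx_good;
        uint64_t tx_good;
        uint64_t rx_crc_err;
    };

    /* One fixed-width, zero-padded name per statistic, in reporting order. */
    static const char demo_stat_names[][DEMO_GSTRING_LEN] = {
        "rx_good", "tx_good", "rx_crc_err",
    };

    #define DEMO_NUM_STATS \
        (sizeof(demo_stat_names) / sizeof(demo_stat_names[0]))

    /* How many statistics items the caller must make room for. */
    static int demo_get_sset_count(void)
    {
        return (int)DEMO_NUM_STATS;
    }

    /* Copy all names into the caller's buffer, one slot per statistic. */
    static void demo_get_strings(uint8_t *data)
    {
        assert(data != NULL);
        memcpy(data, demo_stat_names, sizeof(demo_stat_names));
    }

    /* Fill in the values in exactly the order used by demo_get_strings(). */
    static void demo_get_ethtool_stats(const struct demo_port_counters *c,
                                       uint64_t *data)
    {
        assert(c != NULL && data != NULL);
        data[0] = c->rx_good;
        data[1] = c->tx_good;
        data[2] = c->rx_crc_err;
    }

Whatever the real counters are, names and values have to stay index-aligned,
since the caller pairs entry ``i`` of one buffer with entry ``i`` of the other.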
4513c91d114SIoana Ciornei
4523c91d114SIoana CiorneiPower management
4533c91d114SIoana Ciornei----------------
4543c91d114SIoana Ciornei
4553c91d114SIoana Ciornei- ``suspend``: function invoked by the DSA platform device when the system goes to
4563c91d114SIoana Ciornei  suspend, should quiesce all Ethernet switch activities, but keep ports
4573c91d114SIoana Ciornei  participating in Wake-on-LAN active as well as additional wake-up logic if
4583c91d114SIoana Ciornei  supported

- ``resume``: function invoked by the DSA platform device when the system resumes.
  It should resume all Ethernet switch activities and re-configure the switch to
  be in a fully active state

- ``port_enable``: function invoked by the DSA slave network device ``ndo_open``
  function when a port is administratively brought up. This function should
  fully enable the given switch port. DSA takes care of marking the port with
  ``BR_STATE_BLOCKING`` if the port is a bridge member, or ``BR_STATE_FORWARDING``
  if it is not, and propagating these changes down to the hardware

- ``port_disable``: function invoked by the DSA slave network device ``ndo_stop``
  function when a port is administratively brought down. This function should
  fully disable the given switch port. DSA takes care of marking the port with
  ``BR_STATE_DISABLED`` and propagating changes to the hardware if this port is
  disabled while being a bridge member
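The division of labour described above can be sketched in user space as follows. This is only an illustration under stated assumptions: every ``toy_`` name is hypothetical, the enum is a local stand-in for the bridge states named in the text, and marking the port disabled regardless of bridge membership is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_PORTS 4 /* hypothetical port count */

/* Local stand-ins for the bridge port states named in the text. */
enum toy_br_state {
    TOY_BR_STATE_DISABLED,
    TOY_BR_STATE_FORWARDING,
    TOY_BR_STATE_BLOCKING,
};

struct toy_port {
    bool up;            /* what a port_enable/port_disable callback toggles */
    bool bridge_member; /* port currently belongs to a bridge */
    bool wol;           /* port participates in Wake-on-LAN */
    bool quiesced;      /* traffic stopped for system suspend */
    enum toy_br_state state;
};

struct toy_switch {
    struct toy_port ports[TOY_MAX_PORTS];
};

/* Models the open path: driver enable first, then state marking. */
static void toy_core_port_open(struct toy_switch *sw, int port)
{
    struct toy_port *p;

    assert(port >= 0 && port < TOY_MAX_PORTS);
    p = &sw->ports[port];
    p->up = true;
    p->state = p->bridge_member ? TOY_BR_STATE_BLOCKING
                                : TOY_BR_STATE_FORWARDING;
}

/* Models the close path (simplified: always marks the port disabled). */
static void toy_core_port_close(struct toy_switch *sw, int port)
{
    struct toy_port *p;

    assert(port >= 0 && port < TOY_MAX_PORTS);
    p = &sw->ports[port];
    p->up = false;
    p->state = TOY_BR_STATE_DISABLED;
}

/*
 * suspend: quiesce every port except those needed for Wake-on-LAN.
 * Returns the number of ports left active.
 */
static int toy_suspend(struct toy_switch *sw)
{
    int port, active = 0;

    for (port = 0; port < TOY_MAX_PORTS; port++) {
        struct toy_port *p = &sw->ports[port];

        p->quiesced = !p->wol;
        if (!p->quiesced)
            active++;
    }
    return active;
}

/* resume: bring every port back to a fully active state. */
static void toy_resume(struct toy_switch *sw)
{
    int port;

    for (port = 0; port < TOY_MAX_PORTS; port++)
        sw->ports[port].quiesced = false;
}
```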

Bridge layer
------------

- ``port_bridge_join``: bridge layer function invoked when a given switch port is
  added to a bridge. This function should do what is necessary at the switch
  level to allow the joining port to be added to the relevant logical domain so
  that it can ingress/egress traffic with other members of the bridge.

- ``port_bridge_leave``: bridge layer function invoked when a given switch port is
  removed from a bridge. This function should do what is necessary at the
  switch level to prevent the leaving port from exchanging ingress/egress
  traffic with the remaining bridge members. When the port leaves the bridge,
  the addresses learned on it should be aged out in the switch hardware so that
  the switch can (re)learn MAC addresses behind this port.

- ``port_stp_state_set``: bridge layer function invoked when a given switch port's
  STP state is computed by the bridge layer and should be propagated to the
  switch hardware to forward/block/learn traffic. The switch driver is
  responsible for computing the STP state change from the current and requested
  states and for performing the relevant ageing based on the result
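As a rough user-space model of the three callbacks above, the sketch below keeps a per-port bridge identifier, derives which ports may exchange frames, and flushes learned addresses when a port leaves a bridge or stops learning. All ``toy_`` names are hypothetical, and the ageing rule (flush when moving from a learning-capable state to one that is not) is an assumption chosen for illustration; real hardware and drivers may apply a different policy.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_PORTS 8  /* hypothetical port count */
#define TOY_FDB_SIZE  16 /* hypothetical address table size */

enum toy_stp_state {
    TOY_STP_DISABLED,
    TOY_STP_BLOCKING,
    TOY_STP_LISTENING,
    TOY_STP_LEARNING,
    TOY_STP_FORWARDING,
};

struct toy_fdb_entry {
    uint8_t mac[6];
    int port;
    int used;
};

struct toy_switch {
    int bridge[TOY_MAX_PORTS]; /* 0 = standalone, otherwise a bridge id */
    enum toy_stp_state stp[TOY_MAX_PORTS];
    struct toy_fdb_entry fdb[TOY_FDB_SIZE];
};

/* Bitmask of the other ports that 'port' may exchange frames with. */
static uint32_t toy_fwd_mask(const struct toy_switch *sw, int port)
{
    uint32_t mask = 0;
    int i;

    assert(port >= 0 && port < TOY_MAX_PORTS);
    if (sw->bridge[port] == 0)
        return 0;
    for (i = 0; i < TOY_MAX_PORTS; i++)
        if (i != port && sw->bridge[i] == sw->bridge[port])
            mask |= 1u << i;
    return mask;
}

/* Drop every address learned on 'port'; returns the number flushed. */
static int toy_fast_age(struct toy_switch *sw, int port)
{
    int i, flushed = 0;

    for (i = 0; i < TOY_FDB_SIZE; i++) {
        if (sw->fdb[i].used && sw->fdb[i].port == port) {
            sw->fdb[i].used = 0;
            flushed++;
        }
    }
    return flushed;
}

static void toy_port_bridge_join(struct toy_switch *sw, int port, int bridge)
{
    assert(port >= 0 && port < TOY_MAX_PORTS && bridge != 0);
    sw->bridge[port] = bridge;
}

/* Leaving isolates the port again and ages out what was learned on it. */
static int toy_port_bridge_leave(struct toy_switch *sw, int port)
{
    assert(port >= 0 && port < TOY_MAX_PORTS);
    sw->bridge[port] = 0;
    return toy_fast_age(sw, port);
}

static int toy_learns(enum toy_stp_state s)
{
    return s == TOY_STP_LEARNING || s == TOY_STP_FORWARDING;
}

/* Compare current and requested state; age only when learning stops. */
static int toy_port_stp_state_set(struct toy_switch *sw, int port,
                                  enum toy_stp_state requested)
{
    int flushed = 0;

    assert(port >= 0 && port < TOY_MAX_PORTS);
    if (toy_learns(sw->stp[port]) && !toy_learns(requested))
        flushed = toy_fast_age(sw, port);
    sw->stp[port] = requested;
    return flushed;
}
```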

Bridge VLAN filtering
---------------------

- ``port_vlan_filtering``: bridge layer function invoked when the bridge gets
  configured for turning on or off VLAN filtering. If nothing specific needs to
  be done at the hardware level, this callback does not need to be implemented.
  When VLAN filtering is turned on, the hardware must be programmed to reject
  802.1Q frames which have VLAN IDs outside of the programmed allowed VLAN ID
  map/rules. If there is no PVID programmed into the switch port, untagged
  frames must be rejected as well. When turned off, the switch must accept any
  802.1Q frames irrespective of their VLAN ID, and untagged frames are allowed.

- ``port_vlan_prepare``: bridge layer function invoked when the bridge prepares the
  configuration of a VLAN on the given port. If the operation is not supported
  by the hardware, this function should return ``-EOPNOTSUPP`` to inform the bridge
  code to fall back to a software implementation. This function must not perform
  any hardware setup; see ``port_vlan_add`` for that and further details.

- ``port_vlan_add``: bridge layer function invoked when a VLAN is configured
  (tagged or untagged) for the given switch port

- ``port_vlan_del``: bridge layer function invoked when a VLAN is removed from the
  given switch port

- ``port_vlan_dump``: bridge layer function invoked with a switchdev callback
  function that the driver has to call for each VLAN the given port is a member
  of. A switchdev object is used to carry the VID and bridge flags.

- ``port_fdb_add``: bridge layer function invoked when the bridge wants to install a
  Forwarding Database entry. The switch hardware should be programmed with the
  specified address in the specified VLAN ID in the forwarding database
  associated with this VLAN ID. If the operation is not supported, this
  function should return ``-EOPNOTSUPP`` to inform the bridge code to fall back to
  a software implementation.

.. note:: VLAN ID 0 corresponds to the port private database, which, in the context
        of DSA, would be its port-based VLAN, used by the associated bridge device.
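The accept/reject rules that ``port_vlan_filtering`` asks the hardware to enforce can be written down as a small decision function. The sketch below is a user-space illustration only: the ``toy_`` names are hypothetical, and treating a PVID of 0 as "no PVID programmed" is a convention of this example, not something the text above specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_VID_MAX 4096 /* 12-bit VLAN ID space */

struct toy_port_vlan {
    bool filtering;                  /* toggled by port_vlan_filtering */
    uint16_t pvid;                   /* 0 means "no PVID programmed" here */
    uint8_t member[TOY_VID_MAX / 8]; /* allowed VLAN ID bitmap */
};

/* Add 'vid' to the set of VLAN IDs allowed on the port. */
static void toy_vlan_allow(struct toy_port_vlan *pv, uint16_t vid)
{
    assert(vid < TOY_VID_MAX);
    pv->member[vid / 8] |= (uint8_t)(1u << (vid % 8));
}

static bool toy_vlan_is_member(const struct toy_port_vlan *pv, uint16_t vid)
{
    assert(vid < TOY_VID_MAX);
    return (pv->member[vid / 8] & (1u << (vid % 8))) != 0;
}

/*
 * Ingress decision for one frame. 'tagged' is false for an untagged
 * frame, in which case 'vid' is ignored.
 */
static bool toy_ingress_accept(const struct toy_port_vlan *pv,
                               bool tagged, uint16_t vid)
{
    if (!pv->filtering)
        return true;              /* filtering off: accept everything */
    if (!tagged)
        return pv->pvid != 0;     /* untagged frames need a PVID */
    return toy_vlan_is_member(pv, vid);
}
```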

- ``port_fdb_del``: bridge layer function invoked when the bridge wants to remove a
  Forwarding Database entry. The switch hardware should be programmed to delete
  the specified MAC address from the specified VLAN ID if it was mapped into
  this port's forwarding database

- ``port_fdb_dump``: bridge layer function invoked with a switchdev callback
  function that the driver has to call for each MAC address known to be behind
  the given port. A switchdev object is used to carry the VID and FDB info.

- ``port_mdb_prepare``: bridge layer function invoked when the bridge prepares the
  installation of a multicast database entry. If the operation is not supported,
  this function should return ``-EOPNOTSUPP`` to inform the bridge code to fall back
  to a software implementation. This function must not perform any hardware
  setup; see ``port_mdb_add`` for that and further details.

- ``port_mdb_add``: bridge layer function invoked when the bridge wants to install
  a multicast database entry. The switch hardware should be programmed with the
  specified address in the specified VLAN ID in the forwarding database
  associated with this VLAN ID.

.. note:: VLAN ID 0 corresponds to the port private database, which, in the context
        of DSA, would be its port-based VLAN, used by the associated bridge device.
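The split between the ``*_prepare`` callbacks (validate only) and the matching ``*_add`` callbacks (program the hardware) is a two-phase pattern. A minimal user-space sketch is given below, using a VLAN table as the example. Every ``toy_`` name and both limits are hypothetical, and ``hw_writes`` merely counts simulated hardware accesses to show that the prepare step performs none.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_VID   255 /* hypothetical: highest VID the toy hardware supports */
#define TOY_TABLE_LEN 8   /* hypothetical VLAN table size */

struct toy_switch {
    uint16_t vids[TOY_TABLE_LEN];
    int num_vids;
    int hw_writes; /* counts simulated hardware accesses */
};

/* Prepare phase: validate only. The const pointer rules out any setup. */
static int toy_port_vlan_prepare(const struct toy_switch *sw, uint16_t vid)
{
    if (vid > TOY_MAX_VID)
        return -EOPNOTSUPP; /* caller falls back to software */
    if (sw->num_vids >= TOY_TABLE_LEN)
        return -ENOSPC;
    return 0;
}

/* Commit phase: program the entry; prepare has already validated it. */
static void toy_port_vlan_add(struct toy_switch *sw, uint16_t vid)
{
    assert(sw->num_vids < TOY_TABLE_LEN);
    sw->vids[sw->num_vids++] = vid;
    sw->hw_writes++;
}

/* What a caller of the two callbacks would do, in order. */
static int toy_bridge_vlan_add(struct toy_switch *sw, uint16_t vid)
{
    int err = toy_port_vlan_prepare(sw, vid);

    if (err)
        return err;
    toy_port_vlan_add(sw, vid);
    return 0;
}
```

Because all failure cases are detected before anything is written, a rejected request leaves the (simulated) hardware untouched.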

- ``port_mdb_del``: bridge layer function invoked when the bridge wants to remove a
  multicast database entry. The switch hardware should be programmed to delete
  the specified MAC address from the specified VLAN ID if it was mapped into
  this port's forwarding database.

- ``port_mdb_dump``: bridge layer function invoked with a switchdev callback
  function that the driver has to call for each MAC address known to be behind
  the given port. A switchdev object is used to carry the VID and MDB info.
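The add, delete and dump callbacks above all operate on a table keyed by MAC address and VLAN ID. The following user-space sketch models that table and the "call the callback once per entry behind the port" contract of the dump functions. It is an illustration only: the ``toy_`` names, the table size and the callback signature are hypothetical and do not match the switchdev objects used by the real callbacks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_FDB_SIZE 8 /* hypothetical table size */

struct toy_fdb_entry {
    uint8_t mac[6];
    uint16_t vid; /* 0 = port private database, as in the note above */
    int port;
    int used;
};

struct toy_switch {
    struct toy_fdb_entry fdb[TOY_FDB_SIZE];
};

/* Hypothetical dump callback type: one call per reported entry. */
typedef int (*toy_dump_cb_t)(const uint8_t *mac, uint16_t vid, void *data);

static struct toy_fdb_entry *toy_fdb_find(struct toy_switch *sw,
                                          const uint8_t *mac, uint16_t vid)
{
    int i;

    for (i = 0; i < TOY_FDB_SIZE; i++) {
        struct toy_fdb_entry *e = &sw->fdb[i];

        if (e->used && e->vid == vid && memcmp(e->mac, mac, 6) == 0)
            return e;
    }
    return NULL;
}

/* Install (or move) the entry for (mac, vid) so that it points at 'port'. */
static int toy_port_fdb_add(struct toy_switch *sw, int port,
                            const uint8_t *mac, uint16_t vid)
{
    struct toy_fdb_entry *e;
    int i;

    assert(mac != NULL);
    e = toy_fdb_find(sw, mac, vid);
    for (i = 0; !e && i < TOY_FDB_SIZE; i++)
        if (!sw->fdb[i].used)
            e = &sw->fdb[i];
    if (!e)
        return -ENOSPC;
    memcpy(e->mac, mac, 6);
    e->vid = vid;
    e->port = port;
    e->used = 1;
    return 0;
}

/* Delete (mac, vid) only if it is mapped to this port. */
static int toy_port_fdb_del(struct toy_switch *sw, int port,
                            const uint8_t *mac, uint16_t vid)
{
    struct toy_fdb_entry *e;

    assert(mac != NULL);
    e = toy_fdb_find(sw, mac, vid);
    if (!e || e->port != port)
        return -ENOENT;
    e->used = 0;
    return 0;
}

/* Invoke 'cb' for each address behind 'port'; stop on the first error. */
static int toy_port_fdb_dump(const struct toy_switch *sw, int port,
                             toy_dump_cb_t cb, void *data)
{
    int i, err;

    for (i = 0; i < TOY_FDB_SIZE; i++) {
        const struct toy_fdb_entry *e = &sw->fdb[i];

        if (!e->used || e->port != port)
            continue;
        err = cb(e->mac, e->vid, data);
        if (err)
            return err;
    }
    return 0;
}

/* Example callback: count the entries reported by a dump. */
static int toy_count_cb(const uint8_t *mac, uint16_t vid, void *data)
{
    (void)mac;
    (void)vid;
    *(int *)data += 1;
    return 0;
}
```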

TODO
====

Making SWITCHDEV and DSA converge towards a unified codebase
------------------------------------------------------------

SWITCHDEV properly takes care of abstracting the networking stack with offload
capable hardware, but does not enforce a strict switch device driver model. On
the other hand, DSA enforces a fairly strict device driver model, and deals with
most of the switch-specific details. At some point we should envision a merger
between these two subsystems and get the best of both worlds.

Other low-hanging fruit
-----------------------

- making the number of ports fully dynamic and not dependent on ``DSA_MAX_PORTS``
- allowing more than one CPU/management interface:
  http://comments.gmane.org/gmane.linux.network/365657
- porting more drivers from other vendors:
  http://comments.gmane.org/gmane.linux.network/365510