=========================
NXP SJA1105 switch driver
=========================

Overview
========

The NXP SJA1105 is a family of 6 devices:

- SJA1105E: First generation, no TTEthernet
- SJA1105T: First generation, TTEthernet
- SJA1105P: Second generation, no TTEthernet, no SGMII
- SJA1105Q: Second generation, TTEthernet, no SGMII
- SJA1105R: Second generation, no TTEthernet, SGMII
- SJA1105S: Second generation, TTEthernet, SGMII

These are SPI-managed automotive switches, with all ports being gigabit
capable, and supporting MII/RMII/RGMII and optionally SGMII on one port.

Being automotive parts, their configuration interface is geared towards
set-and-forget use, with minimal dynamic interaction at runtime. They
require a static configuration to be composed by software and packed
with CRC and table headers, and sent over SPI.

The static configuration is composed of several configuration tables. Each
table takes a number of entries. Some configuration tables can be (partially)
reconfigured at runtime, some not. Some tables are mandatory, some not:

============================= ================== =============================
Table                          Mandatory          Reconfigurable
============================= ================== =============================
Schedule                       no                 no
Schedule entry points          if Scheduling      no
VL Lookup                      no                 no
VL Policing                    if VL Lookup       no
VL Forwarding                  if VL Lookup       no
L2 Lookup                      no                 no
L2 Policing                    yes                no
VLAN Lookup                    yes                yes
L2 Forwarding                  yes                partially (fully on P/Q/R/S)
MAC Config                     yes                partially (fully on P/Q/R/S)
Schedule Params                if Scheduling      no
Schedule Entry Points Params   if Scheduling      no
VL Forwarding Params           if VL Forwarding   no
L2 Lookup Params               no                 partially (fully on P/Q/R/S)
L2 Forwarding Params           yes                no
Clock Sync Params              no                 no
AVB Params                     no                 no
General Params                 yes                partially
Retagging                      no                 yes
xMII Params                    yes                no
SGMII                          no                 yes
============================= ================== =============================


Also, the configuration is write-only (software cannot read it back from the
switch, with very few exceptions).

The driver creates a static configuration at probe time, and keeps it in
memory at all times, as a shadow of the hardware state. When required to
change a hardware setting, the driver also updates the static configuration.
If the changed setting can be transmitted to the switch through the dynamic
reconfiguration interface, it is; otherwise the switch is reset and
reprogrammed with the updated static configuration.

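The update policy described above can be sketched as a table lookup. This is
only an illustration in shell, not driver code: the ``reconfigurable`` and
``update_method`` helpers and the generation argument are hypothetical, and the
entries are transcribed from the "Reconfigurable" column of the table above.
Tables listed as "partially" reconfigurable are reported as such, because this
document does not say which of their fields can be changed at runtime::

  #!/bin/bash

  # reconfigurable TABLE
  # Prints the "Reconfigurable" column for TABLE. "partial_gen1" stands for
  # "partially (fully on P/Q/R/S)". Unknown names are treated as "no".
  reconfigurable() {
          case "$1" in
          "VLAN Lookup"|"Retagging"|"SGMII")
                  echo "yes" ;;
          "L2 Forwarding"|"MAC Config"|"L2 Lookup Params")
                  echo "partial_gen1" ;;
          "General Params")
                  echo "partial" ;;
          *)
                  echo "no" ;;
          esac
  }

  # update_method TABLE GENERATION
  # GENERATION is 1 (E/T) or 2 (P/Q/R/S). Prints "dynamic", "partial", or
  # "reset" (reset the switch and upload the whole static configuration).
  update_method() {
          local table="$1" gen="$2"

          case "$(reconfigurable "${table}")" in
          yes)
                  echo "dynamic" ;;
          partial_gen1)
                  if [ "${gen}" -ge 2 ]; then
                          echo "dynamic"
                  else
                          echo "partial"
                  fi
                  ;;
          partial)
                  echo "partial" ;;
          *)
                  echo "reset" ;;
          esac
  }

  update_method "VLAN Lookup" 1
  update_method "L2 Policing" 1
  update_method "MAC Config" 2
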
Traffic support
===============

The switches do not support switch tagging in hardware. But they do support
customizing the TPID by which VLAN traffic is identified as such. The switch
driver is leveraging ``CONFIG_NET_DSA_TAG_8021Q`` by requesting that special
VLANs (with a custom TPID of ``ETH_P_EDSA`` instead of ``ETH_P_8021Q``) are
installed on its ports when not in ``vlan_filtering`` mode. This does not
interfere with the reception and transmission of real 802.1Q-tagged traffic,
because the switch no longer parses those packets as VLAN after the TPID
change.
The TPID is restored when ``vlan_filtering`` is requested by the user through
the bridge layer, and general IP termination is no longer possible through
the switch netdevices in this mode.

The switches have two programmable filters for link-local destination MACs.
These are used to trap BPDUs and PTP traffic to the master netdevice, and are
further used to support STP and 1588 ordinary clock/boundary clock
functionality.

The following traffic modes are supported over the switch netdevices:

+--------------------+------------+------------------+------------------+
|                    | Standalone | Bridged with     | Bridged with     |
|                    | ports      | vlan_filtering 0 | vlan_filtering 1 |
+====================+============+==================+==================+
| Regular traffic    |     Yes    |       Yes        |  No (use master) |
+--------------------+------------+------------------+------------------+
| Management traffic |     Yes    |       Yes        |       Yes        |
| (BPDU, PTP)        |            |                  |                  |
+--------------------+------------+------------------+------------------+

Switching features
==================

The driver supports the configuration of L2 forwarding rules in hardware for
port bridging. The forwarding, broadcast and flooding domain between ports can
be restricted through two methods: either at the L2 forwarding level (isolate
one bridge's ports from another's) or at the VLAN port membership level
(isolate ports within the same bridge). The final forwarding decision taken by
the hardware is a logical AND of these two sets of rules.

The hardware tags all traffic internally with a port-based VLAN (pvid), or it
decodes the VLAN information from the 802.1Q tag. Advanced VLAN classification
is not possible. Once a VLAN tag has been attributed, frames are checked
against the port's membership rules and dropped at ingress if they don't match
any VLAN. This behavior is available when switch ports are enslaved to a bridge
with ``vlan_filtering 1``.

Normally the hardware is not configurable with respect to VLAN awareness, but
by changing what TPID the switch searches 802.1Q tags for, the semantics of a
bridge with ``vlan_filtering 0`` can be kept (accept all traffic, tagged or
untagged), and therefore this mode is also supported.

Segregating the switch ports in multiple bridges is supported (e.g. 2 + 2), but
all bridges should have the same level of VLAN awareness (either all have
``vlan_filtering`` 0, or all have 1). Also, an inevitable consequence of VLAN
awareness being global at the switch level is that once a bridge with
``vlan_filtering`` enslaves at least one switch port, the other un-bridged
ports are no longer available for standalone traffic termination.

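Whether the bridges on a system agree on VLAN awareness can be checked from
user space. The sketch below assumes that the bridge driver exposes
``vlan_filtering`` under ``/sys/class/net/<bridge>/bridge/`` and, for
simplicity, looks at every bridge on the system rather than only at those
enslaving ports of one switch. ``same_vlan_awareness`` is a hypothetical helper
name::

  #!/bin/bash

  # same_vlan_awareness VALUE...
  # Succeeds if all vlan_filtering values passed in are identical
  # (trivially true for zero or one value).
  same_vlan_awareness() {
          local first="${1:-}" v

          for v in "$@"; do
                  [ "${v}" = "${first}" ] || return 1
          done
          return 0
  }

  values=""
  for f in /sys/class/net/*/bridge/vlan_filtering; do
          # Skips the unexpanded glob when no bridge exists.
          if [ -r "${f}" ]; then
                  values="${values} $(cat "${f}")"
          fi
  done

  if same_vlan_awareness ${values}; then
          echo "All bridges share the same vlan_filtering setting"
  else
          echo "Bridges differ in vlan_filtering"
  fi
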
Topology and loop detection through STP is supported.

L2 FDB manipulation (add/delete/dump) is currently possible for the first
generation devices. The aging time of FDB entries and fully static management
(no address learning and no flooding of unknown traffic) are not yet
configurable in the driver.

A special comment about bridging with other netdevices (illustrated with an
example):

A board has eth0, eth1, swp0@eth1, swp1@eth1, swp2@eth1, swp3@eth1.
The switch ports (swp0-3) are under br0.
It is desired that eth0 is turned into another switched port that communicates
with swp0-3.

If br0 has vlan_filtering 0, then eth0 can simply be added to br0 with the
intended results.
If br0 has vlan_filtering 1, then a new br1 interface needs to be created that
enslaves eth0 and eth1 (the DSA master of the switch ports). This is because in
this mode, the switch ports beneath br0 are not capable of regular traffic, and
are only used as a conduit for switchdev operations.

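The two cases can be written out with iproute2 roughly as follows. This is a
sketch that reuses the interface names from the example; it assumes that the
bridges do not exist yet and that the commands are run as root. The two
fragments are alternatives and are not meant to be run one after the other.

With ``vlan_filtering 0``, eth0 joins the same bridge as the switch ports::

  ip link add name br0 type bridge vlan_filtering 0
  for port in swp0 swp1 swp2 swp3 eth0; do
          ip link set dev ${port} master br0
  done

With ``vlan_filtering 1``, the switch ports stay under br0 and a second bridge
joins eth0 with the DSA master eth1::

  ip link add name br0 type bridge vlan_filtering 1
  for port in swp0 swp1 swp2 swp3; do
          ip link set dev ${port} master br0
  done

  ip link add name br1 type bridge
  ip link set dev eth0 master br1
  ip link set dev eth1 master br1
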
Offloads
========

Time-aware scheduling
---------------------

The switch supports a variation of the enhancements for scheduled traffic
specified in IEEE 802.1Q-2018 (formerly 802.1Qbv). This means it can be used to
ensure deterministic latency for priority traffic that is sent in-band with its
gate-open event in the network schedule.

This capability can be managed through the tc-taprio offload ('flags 2'). The
difference compared to the software implementation of taprio is that the latter
would only be able to shape traffic originated from the CPU, but not
autonomously forwarded flows.

The device has 8 traffic classes, and maps incoming frames to one of them based
on the VLAN PCP bits (if no VLAN is present, the port-based default is used).
As described in the previous sections, depending on the value of
``vlan_filtering``, the EtherType recognized by the switch as being VLAN can
either be the typical 0x8100 or a custom value used internally by the driver
for tagging. Therefore, the switch ignores the VLAN PCP if used in standalone
or bridge mode with ``vlan_filtering=0``, as it will not recognize the 0x8100
EtherType. In these modes, injecting into a particular TX queue can only be
done by the DSA net devices, which populate the PCP field of the tagging header
on egress. Using ``vlan_filtering=1``, the behavior is the other way around:
offloaded flows can be steered to TX queues based on the VLAN PCP, but the DSA
net devices are no longer able to do that. To inject frames into a hardware TX
queue with VLAN awareness active, it is necessary to create a VLAN
sub-interface on the DSA master port, and send normal (0x8100) VLAN-tagged
frames towards the switch, with the VLAN PCP bits set appropriately.

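A VLAN sub-interface of this kind could be created with iproute2 as sketched
below. The master name ``eth1`` and VLAN ID 100 are assumptions made for the
example. The ``egress-qos-map`` option translates the socket buffer priority
(which applications can set with the ``SO_PRIORITY`` socket option) into the
VLAN PCP value, here with an identity mapping::

  ip link add link eth1 name eth1.100 type vlan id 100 \
          egress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
  ip link set dev eth1.100 up
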
Management traffic (having DMAC 01-80-C2-xx-xx-xx or 01-1B-19-xx-xx-xx) is the
notable exception: the switch always treats it with a fixed priority and
disregards any VLAN PCP bits even if present. The traffic class for management
traffic has a value of 7 (highest priority) at the moment, which is not
configurable in the driver.

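For illustration, the classification by destination MAC can be mimicked in
shell. This is a sketch and not how the hardware filters are programmed:
``is_mgmt_dmac`` is a hypothetical helper, and the second prefix is taken to be
01-1B-19, the layer 2 multicast prefix that IEEE 1588 defines for PTP over
Ethernet::

  #!/bin/bash

  # is_mgmt_dmac MAC
  # Succeeds if MAC (colon- or dash-separated, any case) starts with
  # 01-80-C2 or 01-1B-19.
  is_mgmt_dmac() {
          local mac

          # Upper-case the hex digits and normalize dashes to colons.
          mac=$(printf '%s' "$1" | tr 'a-f-' 'A-F:')
          case "${mac}" in
          01:80:C2:*|01:1B:19:*)
                  return 0 ;;
          *)
                  return 1 ;;
          esac
  }

  is_mgmt_dmac 01:80:c2:00:00:00 && echo "BPDU address: management"
  is_mgmt_dmac 01-1B-19-00-00-00 && echo "PTP address: management"
  is_mgmt_dmac 00:11:22:33:44:55 || echo "unicast address: regular"
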
Below is an example of configuring a 500 us cyclic schedule on egress port
``swp5``. The traffic class gate for management traffic (7) is open for 100 us,
and the gates for all other traffic classes are open for 400 us::

  #!/bin/bash

  set -e -u -o pipefail

  NSEC_PER_SEC="1000000000"

  # Print the hexadecimal gate mask for a space-separated list of
  # traffic classes (bit N set means the gate of traffic class N is open).
  gatemask() {
          local tc_list="$1"
          local mask=0

          for tc in ${tc_list}; do
                  mask=$((${mask} | (1 << ${tc})))
          done

          printf "%02x" ${mask}
  }

  if ! systemctl is-active --quiet ptp4l; then
          echo "Please start the ptp4l service"
          exit 1
  fi

  now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }')
  # Phase-align the base time to the start of the next second.
  sec=$(echo "${now}" | gawk -F. '{ print $1; }')
  base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))"

  # Priorities 0-7 map one-to-one to traffic classes 0-7, each with its
  # own TX queue. Schedule entry durations are in nanoseconds.
  tc qdisc add dev swp5 parent root handle 100 taprio \
          num_tc 8 \
          map 0 1 2 3 4 5 6 7 \
          queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
          base-time ${base_time} \
          sched-entry S $(gatemask 7) 100000 \
          sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \
          flags 2

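As a worked example of the gate masks used above, the ``gatemask`` helper can
be run on its own. Bit N of the mask corresponds to traffic class N, so the two
entries of the schedule are complementary, and their durations add up to the
500 us cycle::

  #!/bin/bash

  # Same helper as in the script above.
  gatemask() {
          local tc_list="$1"
          local mask=0

          for tc in ${tc_list}; do
                  mask=$((${mask} | (1 << ${tc})))
          done

          printf "%02x" ${mask}
  }

  echo "management gate: 0x$(gatemask 7)"                # 0x80
  echo "all other gates: 0x$(gatemask "0 1 2 3 4 5 6")"  # 0x7f
  echo "cycle time:      $((100000 + 400000)) ns"        # 500000 ns
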
It is possible to apply the tc-taprio offload on multiple egress ports. There
are hardware restrictions related to the fact that no gate event may trigger
simultaneously on two ports. The driver checks the consistency of the schedules
against this restriction and errors out when appropriate. Schedule analysis is
needed to avoid this, which is outside the scope of this document.

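To make the restriction concrete, the sketch below checks two schedules for
coinciding gate events. It is a simplified model and not the check performed by
the driver: it assumes that a gate event happens at the start of every schedule
entry, that each port's schedule begins at a known start offset, and that all
intervals are positive. The helper names and the "START INTERVAL..." argument
format are hypothetical::

  #!/bin/bash

  gcd() {
          local a="$1" b="$2" t

          while [ "${b}" -ne 0 ]; do
                  t=$((a % b))
                  a="${b}"
                  b="${t}"
          done
          echo "${a}"
  }

  # event_times START HORIZON INTERVAL...
  # Prints the time (ns) of every gate event before HORIZON, for a cyclic
  # schedule whose first entry begins at START.
  event_times() {
          local t="$1" horizon="$2" d

          shift 2
          while :; do
                  for d in "$@"; do
                          [ "${t}" -lt "${horizon}" ] || return 0
                          echo "${t}"
                          t=$((t + d))
                  done
          done
  }

  # schedules_conflict "START_A INTERVAL..." "START_B INTERVAL..."
  # Succeeds if the two schedules have a gate event at the same instant.
  schedules_conflict() {
          local start_a="${1%% *}" ints_a="${1#* }"
          local start_b="${2%% *}" ints_b="${2#* }"
          local cycle_a=0 cycle_b=0 d lcm horizon common

          for d in ${ints_a}; do cycle_a=$((cycle_a + d)); done
          for d in ${ints_b}; do cycle_b=$((cycle_b + d)); done
          lcm=$((cycle_a / $(gcd "${cycle_a}" "${cycle_b}") * cycle_b))
          # One full hyperperiod after the later of the two start times.
          horizon=$(( (start_a > start_b ? start_a : start_b) + lcm ))

          # Times that appear in both lists are simultaneous events.
          common=$({
                  event_times "${start_a}" "${horizon}" ${ints_a} | sort -u
                  event_times "${start_b}" "${horizon}" ${ints_b} | sort -u
          } | sort | uniq -d)
          [ -n "${common}" ]
  }

  # The same 100 us + 400 us schedule started together on two ports.
  schedules_conflict "0 100000 400000" "0 100000 400000" && echo "conflict"
  # The second port shifted by 50 us.
  schedules_conflict "0 100000 400000" "50000 100000 400000" || echo "ok"
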
At the moment, the time-aware scheduler can only be triggered based on a
standalone clock and not based on PTP time. This means the base-time argument
from tc-taprio is ignored and the schedule starts right away. It also means it
is more difficult to phase-align the scheduler with the other devices in the
network.

Device Tree bindings and board design
=====================================

This section references ``Documentation/devicetree/bindings/net/dsa/sja1105.txt``
and aims to showcase some potential switch caveats.

RMII PHY role and out-of-band signaling
---------------------------------------

In the RMII spec, the 50 MHz clock signals are either driven by the MAC or by
an external oscillator (but not by the PHY).
But the spec is rather loose and devices go outside it in several ways.
Some PHYs go against the spec and may provide an output pin where they source
the 50 MHz clock themselves, in an attempt to be helpful.
On the other hand, the SJA1105 is only binary configurable: when in the RMII
MAC role it will also attempt to drive the clock signal. To prevent this from
happening it must be put in RMII PHY role.
But doing so has some unintended consequences.
In the RMII spec, the PHY can transmit extra out-of-band signals via RXD[1:0].
These are practically some extra code words (/J/ and /K/) sent prior to the
preamble of each frame. The MAC does not have this out-of-band signaling
mechanism defined by the RMII spec.
So when the SJA1105 port is put in PHY role to avoid having 2 drivers on the
clock signal, inevitably an RMII PHY-to-PHY connection is created. The SJA1105
emulates a PHY interface fully and generates the /J/ and /K/ symbols prior to
frame preambles, which the real PHY is not expected to understand. So the PHY
simply encodes the extra symbols received from the SJA1105-as-PHY onto the
100Base-Tx wire.
On the other side of the wire, some link partners might discard these extra
symbols, while others might choke on them and discard the entire Ethernet
frames that follow along. This looks like packet loss with some link partners
but not with others.
The take-away is that in RMII mode, the SJA1105 must be allowed to drive the
reference clock if connected to a PHY.

RGMII fixed-link and internal delays
------------------------------------

As mentioned in the bindings document, the second generation of devices has
tunable delay lines as part of the MAC, which can be used to establish the
correct RGMII timing budget.
When powered up, these can shift the Rx and Tx clocks with a phase difference
between 73.8 and 101.7 degrees.
The catch is that the delay lines need to lock onto a clock signal with a
stable frequency. This means that there must be at least 2 microseconds of
silence between the clock at the old frequency and the clock at the new
frequency. Otherwise the lock is lost and the delay lines must be reset
(powered down and back up).
In RGMII the clock frequency changes with link speed (125 MHz at 1000 Mbps, 25
MHz at 100 Mbps and 2.5 MHz at 10 Mbps), and link speed might change during the
AN process.
If the switch port is connected through an RGMII fixed-link to a link partner
whose link state life cycle is outside the control of Linux (such as a
different SoC), the delay lines would remain unlocked (and inactive) until
there is manual intervention (ifdown/ifup on the switch port).
The take-away is that in RGMII mode, the switch's internal delays are only
reliable if the link partner never changes link speeds, or if it does, it does
so in a way that is coordinated with the switch port (practically, both ends of
the fixed-link are under control of the same Linux system).
As to why a fixed-link interface would ever change link speeds: there are
Ethernet controllers out there which come out of reset in 100 Mbps mode, and
their driver inevitably needs to change the speed and clock frequency if it's
required to work at gigabit.

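The relation between link speed and RGMII clock frequency quoted above is
summarized by the small helper below. ``rgmii_clock_khz`` is a hypothetical
name used only for this illustration. Any transition between two of these
frequencies is a point where the delay lines can lose their lock::

  #!/bin/bash

  # rgmii_clock_khz SPEED_MBPS
  # Prints the RGMII clock frequency in kHz for the given link speed.
  rgmii_clock_khz() {
          case "$1" in
          1000)
                  echo 125000 ;;
          100)
                  echo 25000 ;;
          10)
                  echo 2500 ;;
          *)
                  echo "unsupported speed: $1" >&2
                  return 1 ;;
          esac
  }

  for speed in 10 100 1000; do
          echo "${speed} Mbps: $(rgmii_clock_khz ${speed}) kHz"
  done
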
MDIO bus and PHY management
---------------------------

The SJA1105 does not have an MDIO bus and does not perform in-band AN either.
Therefore there is no link state notification coming from the switch device.
A board would need to hook up the PHYs connected to the switch to any other
MDIO bus available to Linux within the system (e.g. to the DSA master's MDIO
bus). Link state management then works by the driver manually keeping in sync
(over SPI commands) the MAC link speed with the settings negotiated by the PHY.