#
4d752508 |
| 27-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: offload the Credit-Based Shaper qdisc
SJA1105, being AVB/TSN switches, provide hardware assist for the Credit-Based Shaper as described in the IEEE 8021Q-2018 document.
First gen
net: dsa: sja1105: offload the Credit-Based Shaper qdisc
SJA1105, being AVB/TSN switches, provide hardware assist for the Credit-Based Shaper as described in the IEEE 8021Q-2018 document.
First generation has 10 shapers, freely assignable to any of the 4 external ports and 8 traffic classes, and second generation has 16 shapers.
The Credit-Based Shaper tables are accessed through the dynamic reconfiguration interface, so we have to restore them manually after a switch reset. The tables are backed up by the static config only on P/Q/R/S, and we don't want to add custom code only for that family, since the procedure that is in place now works for both.
Tested with the following commands:
data_rate_kbps=67000 port_transmit_rate_kbps=1000000 idleslope=$data_rate_kbps sendslope=$(($idleslope - $port_transmit_rate_kbps)) locredit=$((-0x80000000)) hicredit=$((0x7fffffff)) tc qdisc add dev swp2 root handle 1: mqprio hw 0 num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 tc qdisc replace dev swp2 parent 1:1 cbs \ idleslope $idleslope \ sendslope $sendslope \ hicredit $hicredit \ locredit $locredit \ offload 1
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.43, v5.4.42, v5.4.41 |
|
#
aaa270c6 |
| 12-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: implement a common frame memory partitioning function
There are 2 different features that require some reserved frame memory space: VLAN retagging and virtual links. Create a cent
net: dsa: sja1105: implement a common frame memory partitioning function
There are 2 different features that require some reserved frame memory space: VLAN retagging and virtual links. Create a central function that modifies the static config and ensures frame memory is never overcommitted.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
88cac0fa |
| 12-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: add packing ops for the Retagging Table
The Retagging Table is an optional feature that allows the switch to match frames against a {ingress port, egress port, vid} rule and chang
net: dsa: sja1105: add packing ops for the Retagging Table
The Retagging Table is an optional feature that allows the switch to match frames against a {ingress port, egress port, vid} rule and change their VLAN ID. The retagged frames are by default clones of the original ones (since the hardware-foreseen use case was to mirror traffic for debugging purposes and to tag it with a special VLAN for this purpose), but we can force the original frames to be dropped by removing the pre-retagging VLAN from the port membership list of the egress port.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
2cafa72e |
| 12-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: add a new best_effort_vlan_filtering devlink parameter
This devlink parameter enables the handling of DSA tags when enslaved to a bridge with vlan_filtering=1. There are very good
net: dsa: sja1105: add a new best_effort_vlan_filtering devlink parameter
This devlink parameter enables the handling of DSA tags when enslaved to a bridge with vlan_filtering=1. There are very good reasons to want this, but there are also very good reasons for not enabling it by default. So a devlink param named best_effort_vlan_filtering, currently driver-specific and exported only by sja1105, is used to configure this.
In practice, this is perhaps the way that most users are going to use the switch in. It assumes that no more than 7 VLANs are needed per port.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
38b5beea |
| 12-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: prepare tagger for handling DSA tags and VLAN simultaneously
In VLAN-unaware mode, sja1105 uses VLAN tags with a custom TPID of 0xdadb. While in the yet-to-be introduced best_effo
net: dsa: sja1105: prepare tagger for handling DSA tags and VLAN simultaneously
In VLAN-unaware mode, sja1105 uses VLAN tags with a custom TPID of 0xdadb. While in the yet-to-be introduced best_effort_vlan_filtering mode, it needs to work with normal VLAN TPID values.
A complication arises when we must transmit a VLAN-tagged packet to the switch when it's in VLAN-aware mode. We need to construct a packet with 2 VLAN tags, and the switch will use the outer header for routing and pop it on egress. But sadly, here the 2 hardware generations don't behave the same:
- E/T switches won't pop an ETH_P_8021AD tag on egress, it seems (packets will remain double-tagged). - P/Q/R/S switches will drop a packet with 2 ETH_P_8021Q tags (it looks like it tries to prevent VLAN hopping).
But looks like the reverse is also true:
- E/T switches have no problem popping the outer tag from packets with 2 ETH_P_8021Q tags. - P/Q/R/S will have no problem popping a single tag even if that is ETH_P_8021AD.
So it is clear that if we want the hardware to work with dsa_8021q tagging in VLAN-aware mode, we need to send different TPIDs depending on revision. Keep that information in priv->info->qinq_tpid.
The per-port tagger structure will hold an xmit_tpid value that depends not only upon the qinq_tpid, but also upon the VLAN awareness state itself (in case we must transmit using 0xdadb).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
ec5ae610 |
| 12-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: save/restore VLANs using a delta commit method
Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan
net: dsa: sja1105: save/restore VLANs using a delta commit method
Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way.
In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs.
Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs.
Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch.
This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
60b33aeb |
| 12-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: deny alterations of dsa_8021q VLANs from the bridge
At the moment, this can never happen. The 2 modes that we operate in do not permit that:
- SJA1105_VLAN_UNAWARE: we are guard
net: dsa: sja1105: deny alterations of dsa_8021q VLANs from the bridge
At the moment, this can never happen. The 2 modes that we operate in do not permit that:
- SJA1105_VLAN_UNAWARE: we are guarded from bridge VLANs added by the user by the DSA core. We will later lift this restriction by setting ds->vlan_bridge_vtu = true, and that is where we'll need it.
- SJA1105_VLAN_FILTERING_FULL: in this mode, dsa_8021q configuration is disabled. So the user is free to add these VLANs in the 1024-3071 range.
The reason for the patch is that we'll introduce a third VLAN awareness state, where both dsa_8021q as well as the bridge are going to call our .port_vlan_add and .port_vlan_del methods.
For that, we need a good way to discriminate between the 2. The easiest (and less intrusive way for upper layers) is to recognize the fact that dsa_8021q configurations are always driven by our driver - we _know_ when a .port_vlan_add method will be called from dsa_8021q because _we_ initiated it.
So introduce an expect_dsa_8021q boolean which is only used, at the moment, for blacklisting VLANs in range 1024-3071 in the modes when dsa_8021q is active.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
7f14937f |
| 12-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: keep the VLAN awareness state in a driver variable
Soon we'll add a third operating mode to the driver. Introduce a vlan_state to make things more easy to manage, and use it where
net: dsa: sja1105: keep the VLAN awareness state in a driver variable
Soon we'll add a third operating mode to the driver. Introduce a vlan_state to make things more easy to manage, and use it where applicable.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
ac02a451 |
| 10-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: implement cross-chip bridging operations
sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of
net: dsa: sja1105: implement cross-chip bridging operations
sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index.
This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out.
The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
Revision tags: v5.4.40, v5.4.39 |
|
#
834f8933 |
| 05-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: implement tc-gate using time-triggered virtual links
Restrict the TTEthernet hardware support on this switch to operate as closely as possible to IEEE 802.1Qci as possible. This m
net: dsa: sja1105: implement tc-gate using time-triggered virtual links
Restrict the TTEthernet hardware support on this switch to operate as closely as possible to IEEE 802.1Qci as possible. This means that it can perform PTP-time-based ingress admission control on streams identified by {DMAC, VID, PCP}, which is useful when trying to ensure the determinism of traffic scheduled via IEEE 802.1Qbv.
The oddity comes from the fact that in hardware (and in TTEthernet at large), virtual links always need a full-blown action, including not only the type of policing, but also the list of destination ports. So in practice, a single tc-gate action will result in all packets getting dropped. Additional actions (either "trap" or "redirect") need to be specified in the same filter rule such that the conforming packets are actually forwarded somewhere.
Apart from the VL Lookup, Policing and Forwarding tables which need to be programmed for each flow (virtual link), the Schedule engine also needs to be told to open/close the admission gates for each individual virtual link. A fairly accurate (and detailed) description of how that works is already present in sja1105_tas.c, since it is already used to trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). Key point here, we remember that the schedule engine supports 8 "subschedules" (execution threads that iterate through the global schedule in parallel, and that no 2 hardware threads must execute a schedule entry at the same time). For tc-taprio, each egress port used one of these 8 subschedules, leaving a total of 4 subschedules unused. In principle we could have allocated 1 subschedule for the tc-gate offload of each ingress port, but actually the schedules of all virtual links installed on each ingress port would have needed to be merged together, before they could have been programmed to hardware. So simplify our life and just merge the entire tc-gate configuration, for all virtual links on all ingress ports, into a single subschedule. Be sure to check that against the usual hardware scheduling conflicts, and program it to hardware alongside any tc-taprio subschedule that may be present.
The following scenarios were tested:
1. Quantitative testing:
tc qdisc add dev swp2 clsact tc filter add dev swp2 ingress flower skip_sw \ dst_mac 42:be:24:9b:76:20 \ action gate index 1 base-time 0 \ sched-entry OPEN 1200 -1 -1 \ sched-entry CLOSE 1200 -1 -1 \ action trap
ping 192.168.1.2 -f PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data. ............................. --- 192.168.1.2 ping statistics --- 948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms
2. Qualitative testing (with a phase-aligned schedule - the clocks are synchronized by ptp4l, not shown here):
Receiver (sja1105):
tc qdisc add dev swp2 clsact now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \ sec=$(echo $now | awk -F. '{print $1}') && \ base_time="$(((sec + 2) * 1000000000))" && \ echo "base time ${base_time}" tc filter add dev swp2 ingress flower skip_sw \ dst_mac 42:be:24:9b:76:20 \ action gate base-time ${base_time} \ sched-entry OPEN 60000 -1 -1 \ sched-entry CLOSE 40000 -1 -1 \ action trap
Sender (enetc): now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \ sec=$(echo $now | awk -F. '{print $1}') && \ base_time="$(((sec + 2) * 1000000000))" && \ echo "base time ${base_time}" tc qdisc add dev eno0 parent root taprio \ num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time ${base_time} \ sched-entry S 01 50000 \ sched-entry S 00 50000 \ flags 2
ping -A 192.168.1.1 PING 192.168.1.1 (192.168.1.1): 56 data bytes ... ^C --- 192.168.1.1 ping statistics --- 1425 packets transmitted, 1424 packets received, 0% packet loss round-trip min/avg/max = 0.322/0.361/0.990 ms
And just for comparison, with the tc-taprio schedule deleted:
ping -A 192.168.1.1 PING 192.168.1.1 (192.168.1.1): 56 data bytes ... ^C --- 192.168.1.1 ping statistics --- 33 packets transmitted, 19 packets received, 42% packet loss round-trip min/avg/max = 0.336/0.464/0.597 ms
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
dfacc5a2 |
| 05-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: support flow-based redirection via virtual links
Implement tc-flower offloads for redirect, trap and drop using non-critical virtual links.
Commands which were tested to work are
net: dsa: sja1105: support flow-based redirection via virtual links
Implement tc-flower offloads for redirect, trap and drop using non-critical virtual links.
Commands which were tested to work are:
# Send frames received on swp2 with a DA of 42:be:24:9b:76:20 to the # CPU and to swp3. This type of key (DA only) when the port's VLAN # awareness state is off. tc qdisc add dev swp2 clsact tc filter add dev swp2 ingress flower skip_sw dst_mac 42:be:24:9b:76:20 \ action mirred egress redirect dev swp3 \ action trap
# Drop frames received on swp2 with a DA of 42:be:24:9b:76:20, a VID # of 100 and a PCP of 0. tc filter add dev swp2 ingress protocol 802.1Q flower skip_sw \ dst_mac 42:be:24:9b:76:20 vlan_id 100 vlan_prio 0 action drop
Under the hood, all rules match on DMAC, VID and PCP, but when VLAN filtering is disabled, those are set internally by the driver to the port-based defaults. Because we would be put in an awkward situation if the user were to change the VLAN filtering state while there are active rules (packets would no longer match on the specified keys), we simply deny changing vlan_filtering unless the list of flows offloaded via virtual links is empty. Then the user can re-add new rules.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
b70bb8d4 |
| 05-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: make room for virtual link parsing in flower offload
Virtual links are a sja1105 hardware concept of executing various flow actions based on a key extracted from the frame's DMAC,
net: dsa: sja1105: make room for virtual link parsing in flower offload
Virtual links are a sja1105 hardware concept of executing various flow actions based on a key extracted from the frame's DMAC, VID and PCP.
Currently the tc-flower offload code supports only parsing the DMAC if that is the broadcast MAC address, and the VLAN PCP. Extract the key parsing logic from the L2 policers functionality and move it into its own function, after adding extra logic for matching on any DMAC and VID.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
94f94d4a |
| 05-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: add static tables for virtual links
This patch adds the register definitions for the: - VL Lookup Table - VL Policing Table - VL Forwarding Table - VL Forwarding Parameters Table
net: dsa: sja1105: add static tables for virtual links
This patch adds the register definitions for the: - VL Lookup Table - VL Policing Table - VL Forwarding Table - VL Forwarding Parameters Table
These are needed in order to perform TTEthernet operations: QoS classification, flow-based policing and/or frame redirecting with the switch.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34 |
|
#
135e3018 |
| 17-Apr-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: enable internal pull-down for RX_DV/CRS_DV/RX_CTL and RX_ER
Some boards do not have the RX_ER MII signal connected. Normally in such situation, those pins would be grounded, but t
net: dsa: sja1105: enable internal pull-down for RX_DV/CRS_DV/RX_CTL and RX_ER
Some boards do not have the RX_ER MII signal connected. Normally in such situation, those pins would be grounded, but then again, some boards left it electrically floating.
When sending traffic to those switch ports, one can see that the N_SOFERR statistics counter is incrementing once per each packet. The user manual states for this counter that it may count the number of frames "that have the MII error input being asserted prior to or up to the SOF delimiter byte". So the switch MAC is sampling an electrically floating signal, and preventing proper traffic reception because of that.
As a workaround, enable the internal weak pull-downs on the input pads for the MII control signals. This way, a floating signal would be internally tied to ground.
The logic levels of signals which _are_ externally driven should not be bothered by this 40-50 KOhm internal resistor. So it is not an issue to enable the internal pull-down unconditionally, irrespective of PHY interface type (MII, RMII, RGMII, SGMII) and of board layout.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6 |
|
#
a6af7763 |
| 29-Mar-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: add broadcast and per-traffic class policers
This patch adds complete support for manipulating the L2 Policing Tables from this switch. There are 45 table entries, one entry per e
net: dsa: sja1105: add broadcast and per-traffic class policers
This patch adds complete support for manipulating the L2 Policing Tables from this switch. There are 45 table entries, one entry per each port and traffic class, and one dedicated entry for broadcast traffic for each ingress port.
Policing entries are shareable, and we use this functionality to support shared block filters.
We are modeling broadcast policers as simple tc-flower matches on dst_mac. As for the traffic class policers, the switch only deduces the traffic class from the VLAN PCP field, so it makes sense to model this as a tc-flower match on vlan_prio.
How to limit broadcast traffic coming from all front-panel ports to a cumulated total of 10 Mbit/s:
tc qdisc add dev sw0p0 ingress_block 1 clsact tc qdisc add dev sw0p1 ingress_block 1 clsact tc qdisc add dev sw0p2 ingress_block 1 clsact tc qdisc add dev sw0p3 ingress_block 1 clsact tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \ action police rate 10mbit burst 64k
How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to 100 Mbit/s on port 0 only:
tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \ vlan_prio 0 action police rate 100mbit burst 64k
The broadcast, VLAN PCP and port policers are compatible with one another (can be installed at the same time on a port).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
336aa67b |
| 27-Mar-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: show more ethtool statistics counters for P/Q/R/S
It looks like the P/Q/R/S series supports some more counters, generically named "Ethernet statistics counter", which we were not
net: dsa: sja1105: show more ethtool statistics counters for P/Q/R/S
It looks like the P/Q/R/S series supports some more counters, generically named "Ethernet statistics counter", which we were not printing. Add them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
c279c726 |
| 27-Mar-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: implement the port MTU callbacks
On this switch, the frame length enforcements are performed by the ingress policers. There are 2 types of those: regular L2 (also called best-effo
net: dsa: sja1105: implement the port MTU callbacks
On this switch, the frame length enforcements are performed by the ingress policers. There are 2 types of those: regular L2 (also called best-effort) and Virtual Link policers (an ARINC664/AFDX concept for defining L2 streams with certain QoS abilities). To avoid future confusion, I prefer to call the reset reason "Best-effort policers", even though the VL policers are not yet supported.
We also need to change the setup of the initial static config, such that DSA calls to .change_mtu (which are expensive) become no-ops and don't reset the switch 5 times.
A driver-level decision is to unconditionally allow single VLAN-tagged traffic on all ports. The CPU port must accept an additional VLAN header for the DSA tag, which is again a driver-level decision.
The policers actually count bytes not only from the SDU, but also from the Ethernet header and FCS, so those need to be accounted for as well.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.28 |
|
#
747e5eb3 |
| 23-Mar-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT
The SJA1105 switch family has a PTP_CLK pin which emits a signal with fixed 50% duty cycle, but variable frequency and programmable
net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT
The SJA1105 switch family has a PTP_CLK pin which emits a signal with fixed 50% duty cycle, but variable frequency and programmable start time.
On the second generation (P/Q/R/S) switches, this pin supports even more functionality. The use case described by the hardware documents talks about synchronization via oneshot pulses: given 2 sja1105 switches, arbitrarily designated as a master and a slave, the master emits a single pulse on PTP_CLK, while the slave is configured to timestamp this pulse received on its PTP_CLK pin (which must obviously be configured as input). The difference between the timestamps then exactly becomes the slave offset to the master.
The only trouble with the above is that the hardware is very much tied into this use case only, and not very generic beyond that: - When emitting a oneshot pulse, instead of being told when to emit it, the switch just does it "now" and tells you later what time it was, via the PTPSYNCTS register. [ Incidentally, this is the same register that the slave uses to collect the ext_ts timestamp from, too. ] - On the sync slave, there is no interrupt mechanism on reception of a new extts, and no FIFO to buffer them, because in the foreseen use case, software is in control of both the master and the slave pins, so it "knows" when there's something to collect.
These 2 problems mean that: - We don't support (at least yet) the quirky oneshot mode exposed by the hardware, just normal periodic output. - We abuse the hardware a little bit when we expose generic extts. Because there's no interrupt mechanism, we need to poll at double the frequency we expect to receive a pulse. Currently that means a non-configurable "twice a second".
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
0a7e984c |
| 23-Mar-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: make the AVB table dynamically reconfigurable
The AVB table contains the CAS_MASTER field (to be added in the next patch) which decides the direction of the PTP_CLK pin.
Reconfig
net: dsa: sja1105: make the AVB table dynamically reconfigurable
The AVB table contains the CAS_MASTER field (to be added in the next patch) which decides the direction of the PTP_CLK pin.
Reconfiguring this field dynamically is highly preferable to having to reset the switch and upload a new static configuration, so we add support for exactly that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.27 |
|
#
ffe10e67 |
| 20-Mar-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: sja1105: Add support for the SGMII port
SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. Th
net: dsa: sja1105: Add support for the SGMII port
SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported.
The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface.
Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset.
>From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY)
The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12 |
|
#
abfb228a |
| 12-Nov-2019 |
Vladimir Oltean <olteanv@gmail.com> |
net: dsa: sja1105: Simplify reset handling
We don't really need 10k species of reset. Remove everything except cold reset which is what is actually used. Too bad the hardware designers couldn't agre
net: dsa: sja1105: Simplify reset handling
We don't really need 10k species of reset. Remove everything except cold reset which is what is actually used. Too bad the hardware designers couldn't agree to use the same bit field for rev 1 and rev 2, so the (*reset_cmd) function pointer is there to stay.
However let's simplify the prototype and give it a struct dsa_switch (we want to avoid forward-declarations of structures, in this case struct sja1105_private, wherever we can).
Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.3.11 |
|
#
86db36a3 |
| 11-Nov-2019 |
Vladimir Oltean <olteanv@gmail.com> |
net: dsa: sja1105: Implement state machine for TAS with PTP clock source
Tested using the following bash script and the tc from iproute2-next:
#!/bin/bash
set -e -u -o pipefail
NSEC_PER_SEC="1
net: dsa: sja1105: Implement state machine for TAS with PTP clock source
Tested using the following bash script and the tc from iproute2-next:
#!/bin/bash
set -e -u -o pipefail
NSEC_PER_SEC="1000000000"
gatemask() { local tc_list="$1" local mask=0
for tc in ${tc_list}; do mask=$((${mask} | (1 << ${tc}))) done
printf "%02x" ${mask} }
if ! systemctl is-active --quiet ptp4l; then echo "Please start the ptp4l service" exit fi
now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }') # Phase-align the base time to the start of the next second. sec=$(echo "${now}" | gawk -F. '{ print $1; }') base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))"
tc qdisc add dev swp5 parent root handle 100 taprio \ num_tc 8 \ map 0 1 2 3 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time ${base_time} \ sched-entry S $(gatemask 7) 100000 \ sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \ clockid CLOCK_TAI flags 2
The "state machine" is a workqueue invoked after each manipulation command on the PTP clock (reset, adjust time, set time, adjust frequency) which checks over the state of the time-aware scheduler. So it is not monitored periodically, only in reaction to a PTP command typically triggered from a userspace daemon (linuxptp). Otherwise there is no reason for things to go wrong.
Now that the timecounter/cyclecounter has been replaced with hardware operations on the PTP clock, the TAS Kconfig now depends upon PTP and the standalone clocksource operating mode has been removed.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
41603d78 |
| 11-Nov-2019 |
Vladimir Oltean <olteanv@gmail.com> |
net: dsa: sja1105: Make the PTP command read-write
The PTPSTRTSCH and PTPSTOPSCH bits are actually readable and indicate whether the time-aware scheduler is running or not. We will be using that for
net: dsa: sja1105: Make the PTP command read-write
The PTPSTRTSCH and PTPSTOPSCH bits are actually readable and indicate whether the time-aware scheduler is running or not. We will be using that for monitoring the scheduler in the next patch, so refactor the PTP command API in order to allow that.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
2eea1fa8 |
| 12-Nov-2019 |
Vladimir Oltean <olteanv@gmail.com> |
net: dsa: sja1105: Print the reset reason
Sometimes it can be quite opaque even for me why the driver decided to reset the switch. So instead of adding dump_stack() calls each time for debugging, ju
net: dsa: sja1105: Print the reset reason
Sometimes it can be quite opaque even for me why the driver decided to reset the switch. So instead of adding dump_stack() calls each time for debugging, just add a reset reason to sja1105_static_config_reload calls which gets printed to the console.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.3.10 |
|
#
34d76e9f |
| 09-Nov-2019 |
Vladimir Oltean <olteanv@gmail.com> |
net: dsa: sja1105: Implement the .gettimex64 system call for PTP
Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace applications (i.e. phc2sys) to compensate for the delays incu
net: dsa: sja1105: Implement the .gettimex64 system call for PTP
Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace applications (i.e. phc2sys) to compensate for the delays incurred while reading the PHC's time.
The task itself of taking the software timestamp is delegated to the SPI subsystem, through the newly introduced API in struct spi_transfer. The goal is to cross-timestamp I/O operations on the switch's PTP clock with values in the local system clock (CLOCK_REALTIME). For that we need to understand a bit of the hardware internals.
The 'read PTP time' message is a 12 byte structure, first 4 bytes of which represent the SPI header, and the last 8 bytes represent the 64-bit PTP time. The switch itself starts processing the command immediately after receiving the last bit of the address, i.e. at the middle of byte 3 (last byte of header). The PTP time is shadowed to a buffer register in the switch, and retrieved atomically during the subsequent SPI frames.
A similar thing goes on for the 'write PTP time' message, although in that case the switch waits until the 64-bit PTP time becomes fully available before taking any action. So the byte that needs to be software-timestamped is byte 11 (last) of the transfer.
The patch creates a common (and local) sja1105_xfer implementation for the SPI I/O, and offers 3 front-ends:
- sja1105_xfer_u32 and sja1105_xfer_u64: these are capable of optionally requesting a PTP timestamp
- sja1105_xfer_buf: this is for large transfers (e.g. the static config buffer) and other misc data, and there is no point in giving timestamping capabilities to this.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|