#
360823a0 |
| 17-Feb-2025 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.78' into for/openbmc/dev-6.6
This is the 6.6.78 stable release
|
Revision tags: v6.6.78, v6.6.77, v6.6.76, v6.6.75 |
|
#
858e1fea |
| 30-Jan-2025 |
Sankararaman Jayaraman <sankararaman.jayaraman@broadcom.com> |
vmxnet3: Fix tx queue race condition with XDP
[ Upstream commit 3f1baa91a1fdf3de9dbad4bd615b35fab347874b ]
If XDP traffic runs on a CPU which is greater than or equal to the number of the Tx queues of the NIC, then vmxnet3_xdp_get_tq() always picks up queue 0 for transmission as it uses reciprocal scale instead of simple modulo operation.
vmxnet3_xdp_xmit() and vmxnet3_xdp_xmit_frame() use the queue returned above without any locking, which can lead to race conditions when multiple XDP xmits run in parallel on different CPUs.
This patch uses a simple modulo scheme when the current CPU equals or exceeds the number of Tx queues on the NIC. It also adds locking in the vmxnet3_xdp_xmit() and vmxnet3_xdp_xmit_frame() functions.
Fixes: 54f00cce1178 ("vmxnet3: Add XDP support.")
Signed-off-by: Sankararaman Jayaraman <sankararaman.jayaraman@broadcom.com>
Signed-off-by: Ronak Doshi <ronak.doshi@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250131042340.156547-1-sankararaman.jayaraman@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
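To see why the old selection collapses onto queue 0, here is a minimal user-space sketch in C. It reproduces the arithmetic of the kernel's reciprocal_scale() helper. pick_tq_old() and pick_tq_new() are hypothetical names that only model the behaviour described in the commit message; they are not the driver's code, and the locking part of the fix is left out.

```c
#include <stdint.h>

/*
 * Same arithmetic as the kernel's reciprocal_scale(): it maps a value that
 * is spread over the full 32-bit range onto [0, ep_ro). A small input such
 * as a CPU number therefore always scales down to 0.
 */
static uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/* Hypothetical model of the selection before the fix. */
static uint32_t pick_tq_old(uint32_t cpu, uint32_t num_tx_queues)
{
	if (cpu < num_tx_queues)
		return cpu;
	return reciprocal_scale(cpu, num_tx_queues);
}

/* Hypothetical model of the selection after the fix: plain modulo. */
static uint32_t pick_tq_new(uint32_t cpu, uint32_t num_tx_queues)
{
	if (cpu < num_tx_queues)
		return cpu;
	return cpu % num_tx_queues;
}
```

With 4 Tx queues, CPUs 5 and 7 both map to queue 0 under the old model, while the modulo model spreads them over queues 1 and 3. Two CPUs can still share a queue in either model, which is presumably why the patch also adds locking.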
|
Revision tags: v6.6.74, v6.6.73, v6.6.72, v6.6.71, v6.12.9, v6.6.70, v6.12.8, v6.6.69, v6.12.7, v6.6.68, v6.12.6, v6.6.67, v6.12.5, v6.6.66, v6.6.65, v6.12.4, v6.6.64, v6.12.3, v6.12.2, v6.6.63, v6.12.1, v6.12, v6.6.62, v6.6.61, v6.6.60 |
|
#
e50e86db |
| 03-Nov-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.59' into for/openbmc/dev-6.6
This is the 6.6.59 stable release
|
Revision tags: v6.6.59, v6.6.58, v6.6.57 |
|
#
59ba6cda |
| 14-Oct-2024 |
Daniel Borkmann <daniel@iogearbox.net> |
vmxnet3: Fix packet corruption in vmxnet3_xdp_xmit_frame
[ Upstream commit 4678adf94da4a9e9683817b246b58ce15fb81782 ]
Andrew and Nikolay reported connectivity issues with Cilium's service load-balancing in case of vmxnet3.
If a BPF program for native XDP adds an encapsulation header such as IPIP and transmits the packet out the same interface, then in case of vmxnet3 a corrupted packet is being sent and subsequently dropped on the path.
vmxnet3_xdp_xmit_frame() which is called e.g. via vmxnet3_run_xdp() through vmxnet3_xdp_xmit_back() calculates an incorrect DMA address:
    page = virt_to_page(xdpf->data);
    tbi->dma_addr = page_pool_get_dma_addr(page) + VMXNET3_XDP_HEADROOM;
    dma_sync_single_for_device(&adapter->pdev->dev, tbi->dma_addr, buf_size, DMA_TO_DEVICE);
The above assumes a fixed offset (VMXNET3_XDP_HEADROOM), but the XDP BPF program could have moved xdp->data. While the passed buf_size is correct (xdpf->len), the dma_addr needs to have a dynamic offset which can be calculated as xdpf->data - (void *)xdpf, that is, xdp->data - xdp->data_hard_start.
Fixes: 54f00cce1178 ("vmxnet3: Add XDP support.")
Reported-by: Andrew Sauber <andrew.sauber@isovalent.com>
Reported-by: Nikolay Nikolaev <nikolay.nikolaev@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Nikolay Nikolaev <nikolay.nikolaev@isovalent.com>
Acked-by: Anton Protopopov <aspsk@isovalent.com>
Cc: William Tu <witu@nvidia.com>
Cc: Ronak Doshi <ronak.doshi@broadcom.com>
Link: https://patch.msgid.link/a0888656d7f09028f9984498cc698bb5364d89fc.1728931137.git.daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
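The difference between the fixed and the dynamic offset can be sketched in user-space C. Everything here is a simplified model: struct frame_model and struct buffer_model are hypothetical stand-ins for struct xdp_frame and its page, MODEL_XDP_HEADROOM is an assumed value of 256 rather than the driver's VMXNET3_XDP_HEADROOM, and the DMA base address is an arbitrary number.

```c
#include <stdint.h>

/* Assumed headroom for illustration only, not the driver's constant. */
#define MODEL_XDP_HEADROOM 256u

/* Hypothetical stand-in for struct xdp_frame: the frame metadata lives at
 * the very start of the buffer and data points further into it. */
struct frame_model {
	void *data;
	uint32_t len;
};

/* Hypothetical page-sized buffer with the frame metadata at offset 0. */
struct buffer_model {
	struct frame_model frame;
	unsigned char bytes[4096 - sizeof(struct frame_model)];
};

/* Old behaviour: always assume the packet starts at the fixed headroom. */
static uint64_t dma_addr_fixed(uint64_t page_dma)
{
	return page_dma + MODEL_XDP_HEADROOM;
}

/* Fixed behaviour: offset is data minus the start of the frame, so it
 * follows any adjustment the BPF program made to the data pointer. */
static uint64_t dma_addr_dynamic(uint64_t page_dma, const struct frame_model *f)
{
	return page_dma +
	       (uint64_t)((const unsigned char *)f->data - (const unsigned char *)f);
}
```

If a program prepends a 20-byte outer IPv4 header, data moves 20 bytes towards the start of the buffer. The dynamic calculation follows it, while the fixed one still points at the old packet start, so the device would transmit from the wrong address.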
|
Revision tags: v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24 |
|
#
5ee9cd06 |
| 27-Mar-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.23' into dev-6.6
Linux 6.6.23
|
Revision tags: v6.6.23 |
|
#
aba8659c |
| 09-Mar-2024 |
William Tu <witu@nvidia.com> |
vmxnet3: Fix missing reserved tailroom
[ Upstream commit e127ce7699c1e05279ee5ee61f00893e7bfa9671 ]
Use rbi->len instead of rcd->len for non-dataring packet.
Found issue:
XDP_WARN: xdp_update_frame_from_buff(line:278): Driver BUG: missing reserved tailroom
WARNING: CPU: 0 PID: 0 at net/core/xdp.c:586 xdp_warn+0xf/0x20
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W O 6.5.1 #1
RIP: 0010:xdp_warn+0xf/0x20
...
? xdp_warn+0xf/0x20
xdp_do_redirect+0x15f/0x1c0
vmxnet3_run_xdp+0x17a/0x400 [vmxnet3]
vmxnet3_process_xdp+0xe4/0x760 [vmxnet3]
? vmxnet3_tq_tx_complete.isra.0+0x21e/0x2c0 [vmxnet3]
vmxnet3_rq_rx_complete+0x7ad/0x1120 [vmxnet3]
vmxnet3_poll_rx_only+0x2d/0xa0 [vmxnet3]
__napi_poll+0x20/0x180
net_rx_action+0x177/0x390
Reported-by: Martin Zaharinov <micron10@gmail.com>
Tested-by: Martin Zaharinov <micron10@gmail.com>
Link: https://lore.kernel.org/netdev/74BF3CC8-2A3A-44FF-98C2-1E20F110A92E@gmail.com/
Fixes: 54f00cce1178 ("vmxnet3: Add XDP support.")
Signed-off-by: William Tu <witu@nvidia.com>
Link: https://lore.kernel.org/r/20240309183147.28222-1-witu@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3 |
|
#
c900529f |
| 12-Sep-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Forwarding to v6.6-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.5.2, v6.1.51, v6.5.1, v6.1.50 |
|
#
bd6c11bc |
| 29-Aug-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "Core:
- Increase size limits for to-be-sent skb frag allocations. This allows tun, tap devices and packet sockets to better cope with large write operations
- Store netdevs in an xarray, to simplify iterating over netdevs
- Refactor nexthop selection for multipath routes
- Improve sched class lifetime handling
- Add backup nexthop ID support for bridge
- Implement drop reasons support in openvswitch
- Several data races annotations and fixes
- Constify the sk parameter of routing functions
- Prepend kernel version to netconsole message
Protocols:
- Implement support for TCP probing the peer being under memory pressure
- Remove hard coded limitation on IPv6 specific info placement inside the socket struct
- Get rid of sysctl_tcp_adv_win_scale and use an auto-estimated per socket scaling factor
- Scaling-up the IPv6 expired route GC via a separated list of expiring routes
- In-kernel support for the TLS alert protocol
- Better support for UDP reuseport with connected sockets
- Add NEXT-C-SID support for SRv6 End.X behavior, reducing the SR header size
- Get rid of additional ancillary per MPTCP connection struct socket
- Implement support for BPF-based MPTCP packet schedulers
- Format MPTCP subtests selftests results in TAP
- Several new SMC 2.1 features including unique experimental options, max connections per lgr negotiation, max links per lgr negotiation
BPF:
- Multi-buffer support in AF_XDP
- Add multi uprobe BPF links for attaching multiple uprobes and usdt probes, which is significantly faster and saves extra fds
- Implement an fd-based tc BPF attach API (TCX) and BPF link support on top of it
- Add SO_REUSEPORT support for TC bpf_sk_assign
- Support new instructions from cpu v4 to simplify the generated code and feature completeness, for x86, arm64, riscv64
- Support defragmenting IPv(4|6) packets in BPF
- Teach verifier actual bounds of bpf_get_smp_processor_id() and fix perf+libbpf issue related to custom section handling
- Introduce bpf map element count and enable it for all program types
- Add a BPF hook in sys_socket() to change the protocol ID from IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy
- Introduce bpf_mem_cache_free_rcu() and fix OOM under stress
- Add uprobe support for the bpf_get_func_ip helper
- Check skb ownership against full socket
- Support for up to 12 arguments in BPF trampoline
- Extend link_info for kprobe_multi and perf_event links
Netfilter:
- Speed-up process exit by aborting ruleset validation if a fatal signal is pending
- Allow NLA_POLICY_MASK to be used with BE16/BE32 types
Driver API:
- Page pool optimizations, to improve data locality and cache usage
- Introduce ndo_hwtstamp_get() and ndo_hwtstamp_set() to avoid the need for raw ioctl() handling in drivers
- Simplify genetlink dump operations (doit/dumpit) providing them the common information already populated in struct genl_info
- Extend and use the yaml devlink specs to [re]generate the split ops
- Introduce devlink selective dumps, to allow SF filtering based on handle and other attributes
- Add yaml netlink spec for netlink-raw families, allow route, link and address related queries via the ynl tool
- Remove phylink legacy mode support
- Support offload LED blinking to phy
- Add devlink port function attributes for IPsec
New hardware / drivers:
- Ethernet:
  - Broadcom ASP 2.0 (72165) ethernet controller
  - MediaTek MT7988 SoC
  - Texas Instruments AM654 SoC
  - Texas Instruments IEP driver
  - Atheros qca8081 phy
  - Marvell 88Q2110 phy
  - NXP TJA1120 phy
- WiFi:
  - MediaTek mt7981 support
- Can:
  - Kvaser SmartFusion2 PCI Express devices
  - Allwinner T113 controllers
  - Texas Instruments tcan4552/4553 chips
- Bluetooth:
  - Intel Gale Peak
  - Qualcomm WCN3988 and WCN7850
  - NXP AW693 and IW624
  - Mediatek MT2925
Drivers:
- Ethernet NICs:
  - nVidia/Mellanox:
    - mlx5:
      - support UDP encapsulation in packet offload mode
      - IPsec packet offload support in eswitch mode
      - improve aRFS observability by adding new set of counters
      - extends MACsec offload support to cover RoCE traffic
      - dynamic completion EQs
    - mlx4:
      - convert to use auxiliary bus instead of custom interface logic
  - Intel
    - ice:
      - implement switchdev bridge offload, even for LAG interfaces
      - implement SRIOV support for LAG interfaces
    - igc:
      - add support for multiple in-flight TX timestamps
  - Broadcom:
    - bnxt:
      - use the unified RX page pool buffers for XDP and non-XDP
      - use the NAPI skb allocation cache
  - OcteonTX2:
    - support Round Robin scheduling HTB offload
    - TC flower offload support for SPI field
  - Freescale:
    - add XDP_TX feature support
  - AMD:
    - ionic: add support for PCI FLR event
    - sfc:
      - basic conntrack offload
      - introduce eth, ipv4 and ipv6 pedit offloads
  - ST Microelectronics:
    - stmmac: maximize PTP timestamping resolution
- Virtual NICs:
  - Microsoft vNIC:
    - batch ringing RX queue doorbell on receiving packets
    - add page pool for RX buffers
  - Virtio vNIC:
    - add per queue interrupt coalescing support
  - Google vNIC:
    - add queue-page-list mode support
- Ethernet high-speed switches:
  - nVidia/Mellanox (mlxsw):
    - add port range matching tc-flower offload
    - permit enslavement to netdevices with uppers
- Ethernet embedded switches:
  - Marvell (mv88e6xxx):
    - convert to phylink_pcs
  - Renesas:
    - r8A779fx: add speed change support
    - rzn1: enables vlan support
- Ethernet PHYs:
  - convert mv88e6xxx to phylink_pcs
- WiFi:
  - Qualcomm Wi-Fi 7 (ath12k):
    - extremely High Throughput (EHT) PHY support
  - RealTek (rtl8xxxu):
    - enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU), RTL8192EU and RTL8723BU
  - RealTek (rtw89):
    - Introduce Time Averaged SAR (TAS) support
- Connector:
  - support for event filtering"
* tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1806 commits)
  net: ethernet: mtk_wed: minor change in wed_{tx,rx}info_show
  net: ethernet: mtk_wed: add some more info in wed_txinfo_show handler
  net: stmmac: clarify difference between "interface" and "phy_interface"
  r8152: add vendor/device ID pair for D-Link DUB-E250
  devlink: move devlink_notify_register/unregister() to dev.c
  devlink: move small_ops definition into netlink.c
  devlink: move tracepoint definitions into core.c
  devlink: push linecard related code into separate file
  devlink: push rate related code into separate file
  devlink: push trap related code into separate file
  devlink: use tracepoint_enabled() helper
  devlink: push region related code into separate file
  devlink: push param related code into separate file
  devlink: push resource related code into separate file
  devlink: push dpipe related code into separate file
  devlink: move and rename devlink_dpipe_send_and_alloc_skb() helper
  devlink: push shared buffer related code into separate file
  devlink: push port related code into separate file
  devlink: push object register/unregister notifications into separate helpers
  inet: fix IP_TRANSPARENT error handling
  ...
|
Revision tags: v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45 |
|
#
54f00cce |
| 09-Aug-2023 |
William Tu <u9012063@gmail.com> |
vmxnet3: Add XDP support.
The patch adds native-mode XDP support: XDP DROP, PASS, TX, and REDIRECT.
Background: The vmxnet3 rx consists of three rings: ring0, ring1, and dataring. For r0 and r1, buffers at r0 are allocated using alloc_skb APIs and dma mapped to the ring's descriptor. If LRO is enabled and the packet size is larger than 3K (VMXNET3_MAX_SKB_BUF_SIZE), then r1 is used to map the rest of the buffer beyond VMXNET3_MAX_SKB_BUF_SIZE. Each buffer in r1 is allocated using alloc_page. So for LRO packets, the payload will be in one buffer from r0 and multiple buffers from r1; for non-LRO packets, only one descriptor in r0 is used for packet sizes less than 3K.
When receiving a packet, the first descriptor will have the sop (start of packet) bit set, and the last descriptor will have the eop (end of packet) bit set. Non-LRO packets will have only one descriptor with both sop and eop set.
Other than r0 and r1, vmxnet3 dataring is specifically designed for handling packets with small size, usually 128 bytes, defined in VMXNET3_DEF_RXDATA_DESC_SIZE, by simply copying the packet from the backend driver in ESXi to the ring's memory region at front-end vmxnet3 driver, in order to avoid memory mapping/unmapping overhead. In summary, packet size:
  A. < 128B: use dataring
  B. 128B - 3K: use ring0 (VMXNET3_RX_BUF_SKB)
  C. > 3K: use ring0 and ring1 (VMXNET3_RX_BUF_SKB + VMXNET3_RX_BUF_PAGE)
As a result, the patch adds XDP support for packets using dataring and r0 (case A and B), not the large packet size when LRO is enabled.
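The size classes above can be written down as a small C sketch. The enum, the function names and the two limit macros are hypothetical and chosen for illustration; in particular "3K" is assumed to mean 3072 bytes, and the sketch looks only at packet size, ignoring whether LRO is actually enabled.

```c
#include <stddef.h>

/* Assumed limits for illustration; the driver uses its own constants. */
#define MODEL_DATARING_LIMIT 128u   /* case A: sizes below this */
#define MODEL_SKB_BUF_LIMIT  3072u  /* "3K", assumed to be 3 * 1024 */

enum rx_path {
	RX_DATARING,         /* case A: copied through the dataring */
	RX_RING0,            /* case B: single buffer from ring0 */
	RX_RING0_AND_RING1   /* case C: ring0 plus page buffers from ring1 */
};

/* Classify a received packet by size, following cases A, B and C. */
static enum rx_path classify_rx(size_t len)
{
	if (len < MODEL_DATARING_LIMIT)
		return RX_DATARING;
	if (len <= MODEL_SKB_BUF_LIMIT)
		return RX_RING0;
	return RX_RING0_AND_RING1;
}

/* XDP is described as covering cases A and B only. */
static int xdp_supported(enum rx_path path)
{
	return path != RX_RING0_AND_RING1;
}
```

For example, a 64-byte packet falls into case A, a 1500-byte packet into case B, and a 9000-byte LRO aggregate into case C, which is the one path the patch does not cover.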
XDP Implementation: When a user loads an XDP prog, the vmxnet3 driver checks configurations, such as mtu and lro, and re-allocates the rx buffers to reserve the extra headroom, XDP_PACKET_HEADROOM, for the XDP frame. The XDP prog will then be associated with every rx queue of the device. Note that when using dataring for small packet size, vmxnet3 (front-end driver) doesn't control the buffer allocation; as a result we allocate a new page and copy the packet from the dataring to the XDP frame.
The receive side of XDP is implemented for case A and B, by invoking the bpf program at vmxnet3_rq_rx_complete and handling its returned action. The vmxnet3_process_xdp() and vmxnet3_process_xdp_small() functions handle the ring0 and dataring cases separately, and decide the next journey of the packet afterward.
For TX, vmxnet3 has a split header design. Outgoing packets are parsed first and protocol headers (L2/L3/L4) are copied to the backend. The rest of the payload is dma mapped. Since XDP_TX does not parse the packet protocol, the entire XDP frame is dma mapped for transmission and transmitted in a batch. Later on, the frame is freed and recycled back to the memory pool.
Performance: Tested using two VMs inside one ESXi vSphere 7.0 machine, using single core on each vmxnet3 device, sender using DPDK testpmd tx-mode attached to vmxnet3 device, sending 64B or 512B UDP packet.
VM1 txgen:
  $ dpdk-testpmd -l 0-3 -n 1 -- -i --nb-cores=3 \
    --forward-mode=txonly --eth-peer=0,<mac addr of vm2>
  option: add "--txonly-multi-flow"
  option: use --txpkts=512 or 64 byte
VM2 running XDP:
  $ ./samples/bpf/xdp_rxq_info -d ens160 -a <options> --skb-mode
  $ ./samples/bpf/xdp_rxq_info -d ens160 -a <options>
  options: XDP_DROP, XDP_PASS, XDP_TX
To test REDIRECT to cpu 0, use
  $ ./samples/bpf/xdp_redirect_cpu -d ens160 -c 0 -e drop
Single core performance comparison with skb-mode.
64B: skb-mode -> native-mode
  XDP_DROP: 1.6Mpps -> 2.4Mpps
  XDP_PASS: 338Kpps -> 367Kpps
  XDP_TX: 1.1Mpps -> 2.3Mpps
  REDIRECT-drop: 1.3Mpps -> 2.3Mpps
512B: skb-mode -> native-mode
  XDP_DROP: 863Kpps -> 1.3Mpps
  XDP_PASS: 275Kpps -> 376Kpps
  XDP_TX: 554Kpps -> 1.2Mpps
  REDIRECT-drop: 659Kpps -> 1.2Mpps
Demo: https://youtu.be/4lm1CSCi78Q
Future work:
- XDP frag support
- use napi_consume_skb() instead of dev_kfree_skb_any at unmap
- stats using u64_stats_t
- using bitfield macro BIT()
- optimization for DMA synchronization using actual frame length, instead of always max_len
Signed-off-by: William Tu <u9012063@gmail.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|