History log of /openbmc/linux/net/ipv6/fib6_rules.c (Results 1 – 25 of 769)
Revision Date Author Comments
Revision tags: v6.6.67, v6.6.66, v6.6.65, v6.6.64, v6.6.63, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58, v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32
# e0d77d0f 19-May-2024 Andrew Jeffery <andrew@codeconstruct.com.au>

Merge tag 'v6.6.31' into dev-6.6

This is the 6.6.31 stable release


Revision tags: v6.6.31
# 8745a8d7 07-May-2024 Eric Dumazet <edumazet@google.com>

ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action()

[ Upstream commit d101291b2681e5ab938554e3e323f7a7ee33e3aa ]

syzbot is able to trigger the following crash [1],
caused by unsafe ip6_dst_idev() use.

Indeed ip6_dst_idev() can return NULL, and must always be checked.

[1]

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 31648 Comm: syz-executor.0 Not tainted 6.9.0-rc4-next-20240417-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
RIP: 0010:__fib6_rule_action net/ipv6/fib6_rules.c:237 [inline]
RIP: 0010:fib6_rule_action+0x241/0x7b0 net/ipv6/fib6_rules.c:267
Code: 02 00 00 49 8d 9f d8 00 00 00 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 f9 32 bf f7 48 8b 1b 48 89 d8 48 c1 e8 03 <42> 80 3c 20 00 74 08 48 89 df e8 e0 32 bf f7 4c 8b 03 48 89 ef 4c
RSP: 0018:ffffc9000fc1f2f0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1a772f98c8186700
RDX: 0000000000000003 RSI: ffffffff8bcac4e0 RDI: ffffffff8c1f9760
RBP: ffff8880673fb980 R08: ffffffff8fac15ef R09: 1ffffffff1f582bd
R10: dffffc0000000000 R11: fffffbfff1f582be R12: dffffc0000000000
R13: 0000000000000080 R14: ffff888076509000 R15: ffff88807a029a00
FS: 00007f55e82ca6c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b31d23000 CR3: 0000000022b66000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
fib_rules_lookup+0x62c/0xdb0 net/core/fib_rules.c:317
fib6_rule_lookup+0x1fd/0x790 net/ipv6/fib6_rules.c:108
ip6_route_output_flags_noref net/ipv6/route.c:2637 [inline]
ip6_route_output_flags+0x38e/0x610 net/ipv6/route.c:2649
ip6_route_output include/net/ip6_route.h:93 [inline]
ip6_dst_lookup_tail+0x189/0x11a0 net/ipv6/ip6_output.c:1120
ip6_dst_lookup_flow+0xb9/0x180 net/ipv6/ip6_output.c:1250
sctp_v6_get_dst+0x792/0x1e20 net/sctp/ipv6.c:326
sctp_transport_route+0x12c/0x2e0 net/sctp/transport.c:455
sctp_assoc_add_peer+0x614/0x15c0 net/sctp/associola.c:662
sctp_connect_new_asoc+0x31d/0x6c0 net/sctp/socket.c:1099
__sctp_connect+0x66d/0xe30 net/sctp/socket.c:1197
sctp_connect net/sctp/socket.c:4819 [inline]
sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834
__sys_connect_file net/socket.c:2048 [inline]
__sys_connect+0x2df/0x310 net/socket.c:2065
__do_sys_connect net/socket.c:2075 [inline]
__se_sys_connect net/socket.c:2072 [inline]
__x64_sys_connect+0x7a/0x90 net/socket.c:2072
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 5e5f3f0f8013 ("[IPV6] ADDRCONF: Convert ipv6_get_saddr() to ipv6_dev_get_saddr().")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240507163145.835254-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
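
The rule stated above (an accessor that can return NULL must be checked before use) can be illustrated with a small userspace sketch. This is not the upstream patch: `idev_sketch`, `dst_sketch`, `dst_idev()` and `dst_ifindex()` are hypothetical stand-ins for the kernel structures and for ip6_dst_idev(), and returning -1 is an assumed way of reporting the missing device.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's inet6_dev and dst_entry. */
struct idev_sketch { int ifindex; };
struct dst_sketch  { struct idev_sketch *idev; };

/* Like ip6_dst_idev(), this accessor may legitimately return NULL. */
static struct idev_sketch *dst_idev(const struct dst_sketch *dst)
{
	return dst->idev;
}

/* Returns the ifindex, or -1 when no idev is attached to the route. */
static int dst_ifindex(const struct dst_sketch *dst)
{
	const struct idev_sketch *idev = dst_idev(dst);

	if (!idev)	/* check before any dereference */
		return -1;
	return idev->ifindex;
}
```

Without the `if (!idev)` test, the second function would read through a NULL pointer, which is the class of fault shown in the KASAN report.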


Revision tags: v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24
# 5ee9cd06 27-Mar-2024 Andrew Jeffery <andrew@codeconstruct.com.au>

Merge tag 'v6.6.23' into dev-6.6

Linux 6.6.23


Revision tags: v6.6.23
# 2db74b0d 07-Mar-2024 Shiming Cheng <shiming.cheng@mediatek.com>

ipv6: fib6_rules: flush route cache when rule is changed

[ Upstream commit c4386ab4f6c600f75fdfd21143f89bac3e625d0d ]

When rule policy is changed, ipv6 socket cache is not refreshed.
The sock's skb still uses an outdated route cache and is sent to
the wrong interface.

To avoid this error we should update the fib node's version when a
rule is changed. The skb's route will then be rechecked, as the
route cache version now differs from the fib node version.
The route cache is refreshed to match the latest rule.

Fixes: 101367c2f8c4 ("[IPV6]: Policy Routing Rules")
Signed-off-by: Shiming Cheng <shiming.cheng@mediatek.com>
Signed-off-by: Lena Wang <lena.wang@mediatek.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
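
The version-based invalidation described in this message can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel code: `fib_sketch`, `cache_sketch`, `cache_store()`, `cache_valid()` and `rule_changed()` are hypothetical names, and a single counter stands in for the fib node version.

```c
#include <stdbool.h>

/* Hypothetical model: a table-wide version and a cached route stamped with it. */
struct fib_sketch   { unsigned int version; };
struct cache_sketch { unsigned int cookie; int oif; };

/* Remember a lookup result together with the version it was computed under. */
static void cache_store(struct cache_sketch *c, const struct fib_sketch *fib, int oif)
{
	c->cookie = fib->version;
	c->oif = oif;
}

/* A cached route is usable only while its stamp matches the current version. */
static bool cache_valid(const struct cache_sketch *c, const struct fib_sketch *fib)
{
	return c->cookie == fib->version;
}

/* Changing a rule bumps the version, which invalidates every older stamp. */
static void rule_changed(struct fib_sketch *fib)
{
	fib->version++;
}
```

A sender that finds `cache_valid()` false would redo the lookup and call `cache_store()` again, so the next packet follows the updated rule.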


Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13
# 4f2c0a4a 13-Dec-2022 Nick Terrell <terrelln@fb.com>

Merge branch 'main' into zstd-linus


Revision tags: v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4
# 14e77332 21-Oct-2022 Nick Terrell <terrelln@fb.com>

Merge branch 'main' into zstd-next


Revision tags: v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59
# 8bb5e7f4 02-Aug-2022 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 5.20 (or 6.0) merge window.


Revision tags: v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45
# 03ab8e62 31-May-2022 Konstantin Komarov <almaz.alexandrovich@paragon-software.com>

Merge tag 'v5.18'

Linux 5.18


Revision tags: v5.15.44
# 690e1790 27-May-2022 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v5.18' into next

Sync up with mainline to get updates to OMAP4 keypad driver and other
upstream goodies.


Revision tags: v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34
# 651a8879 13-Apr-2022 Takashi Iwai <tiwai@suse.de>

Merge branch 'topic/cs35l41' into for-next

Pull CS35L41 codec updates

Signed-off-by: Takashi Iwai <tiwai@suse.de>


# c16c8bfa 12-Apr-2022 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Merge drm/drm-next into drm-intel-gt-next

Pull in TTM changes needed for DG2 CCS enabling from Ram.

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 83970cd6 11-Apr-2022 Jani Nikula <jani.nikula@intel.com>

Merge drm/drm-next into drm-intel-next

Sync up with v5.18-rc1, in particular to get 5e3094cfd9fb
("drm/i915/xehpsdv: Add has_flat_ccs to device info").

Signed-off-by: Jani Nikula <jani.nikula@intel.com>


Revision tags: v5.15.33
# 9cbbd694 05-Apr-2022 Maxime Ripard <maxime@cerno.tech>

Merge drm/drm-next into drm-misc-next

Let's start the 5.19 development cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>


# 0aea30a0 19-Apr-2022 Takashi Iwai <tiwai@suse.de>

Merge tag 'asoc-fix-v5.18-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.18

A collection of fixes that came in since the merge window, plus one new
device ID for an x86 laptop. Nothing that really stands out with
particularly big impact outside of the affected device.


# cf5c5763 05-Apr-2022 Maxime Ripard <maxime@cerno.tech>

Merge drm/drm-fixes into drm-misc-fixes

Let's start the 5.18 fixes cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>


# de4fb176 01-Apr-2022 Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Merge branches 'fixes' and 'misc' into for-linus


Revision tags: v5.15.32
# 169e7776 24-Mar-2022 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
"The sprinkling of SPI drivers is because we added a new one and Mark
sent us a SPI driver interface conversion pull request.

Core
----

- Introduce XDP multi-buffer support, allowing the use of XDP with
jumbo frame MTUs and combination with Rx coalescing offloads (LRO).

- Speed up netns dismantling (5x) and lower the memory cost a little.
Remove unnecessary per-netns sockets. Scope some lists to a netns.
Cut down RCU syncing. Use batch methods. Allow netdev registration
to complete out of order.

- Support distinguishing timestamp types (ingress vs egress) and
maintaining them across packet scrubbing points (e.g. redirect).

- Continue the work of annotating packet drop reasons throughout the
stack.

- Switch netdev error counters from an atomic to dynamically
allocated per-CPU counters.

- Rework a few preempt_disable(), local_irq_save() and busy waiting
sections problematic on PREEMPT_RT.

- Extend the ref_tracker to allow catching use-after-free bugs.

BPF
---

- Introduce "packing allocator" for BPF JIT images. JITed code is
marked read only, and used to be allocated at page granularity.
Custom allocator allows for more efficient memory use, lower iTLB
pressure and prevents identity mapping huge pages from getting
split.

- Make use of BTF type annotations (e.g. __user, __percpu) to enforce
the correct probe read access method, add appropriate helpers.

- Convert the BPF preload to use light skeleton and drop the
user-mode-driver dependency.

- Allow XDP BPF_PROG_RUN test infra to send real packets, enabling
its use as a packet generator.

- Allow local storage memory to be allocated with GFP_KERNEL if
called from a hook allowed to sleep.

- Introduce fprobe (multi kprobe) to speed up mass attachment (arch
bits to come later).

- Add unstable conntrack lookup helpers for BPF by using the BPF
kfunc infra.

- Allow cgroup BPF progs to return custom errors to user space.

- Add support for AF_UNIX iterator batching.

- Allow iterator programs to use sleepable helpers.

- Support JIT of add, and, or, xor and xchg atomic ops on arm64.

- Add BTFGen support to bpftool which allows to use CO-RE in kernels
without BTF info.

- Large number of libbpf API improvements, cleanups and deprecations.

Protocols
---------

- Micro-optimize UDPv6 Tx, gaining up to 5% in test on dummy netdev.

- Adjust TSO packet sizes based on min_rtt, allowing very low latency
links (data centers) to always send full-sized TSO super-frames.

- Make IPv6 flow label changes (AKA hash rethink) more configurable,
via sysctl and setsockopt. Distinguish between server and client
behavior.

- VxLAN support to "collect metadata" devices to terminate only
configured VNIs. This is similar to VLAN filtering in the bridge.

- Support inserting IPv6 IOAM information to a fraction of frames.

- Add protocol attribute to IP addresses to allow identifying where
given address comes from (kernel-generated, DHCP etc.)

- Support setting socket and IPv6 options via cmsg on ping6 sockets.

- Reject mis-use of ECN bits in IP headers as part of DSCP/TOS.
Define dscp_t and stop taking ECN bits into account in fib-rules.

- Add support for locked bridge ports (for 802.1X).

- tun: support NAPI for packets received from batched XDP buffs,
doubling the performance in some scenarios.

- IPv6 extension header handling in Open vSwitch.

- Support IPv6 control message load balancing in bonding, prevent
neighbor solicitation and advertisement from using the wrong port.
Support NS/NA monitor selection similar to existing ARP monitor.

- SMC
- improve performance with TCP_CORK and sendfile()
- support auto-corking
- support TCP_NODELAY

- MCTP (Management Component Transport Protocol)
- add user space tag control interface
- I2C binding driver (as specified by DMTF DSP0237)

- Multi-BSSID beacon handling in AP mode for WiFi.

- Bluetooth:
- handle MSFT Monitor Device Event
- add MGMT Adv Monitor Device Found/Lost events

- Multi-Path TCP:
- add support for the SO_SNDTIMEO socket option
- lots of selftest cleanups and improvements

- Increase the max PDU size in CAN ISOTP to 64 kB.

Driver API
----------

- Add HW counters for SW netdevs, a mechanism for devices which
offload packet forwarding to report packet statistics back to
software interfaces such as tunnels.

- Select the default NIC queue count as a fraction of number of
physical CPU cores, instead of hard-coding to 8.

- Expose devlink instance locks to drivers. Allow device layer of
drivers to use that lock directly instead of creating their own
which always runs into ordering issues in devlink callbacks.

- Add header/data split indication to guide user space enabling of
TCP zero-copy Rx.

- Allow configuring completion queue event size.

- Refactor page_pool to enable fragmenting after allocation.

- Add allocation and page reuse statistics to page_pool.

- Improve Multiple Spanning Trees support in the bridge to allow
reuse of topologies across VLANs, saving HW resources in switches.

- DSA (Distributed Switch Architecture):
- replay and offload of host VLAN entries
- offload of static and local FDB entries on LAG interfaces
- FDB isolation and unicast filtering

New hardware / drivers
----------------------

- Ethernet:
- LAN937x T1 PHYs
- Davicom DM9051 SPI NIC driver
- Realtek RTL8367S, RTL8367RB-VB switch and MDIO
- Microchip ksz8563 switches
- Netronome NFP3800 SmartNICs
- Fungible SmartNICs
- MediaTek MT8195 switches

- WiFi:
- mt76: MediaTek mt7916
- mt76: MediaTek mt7921u USB adapters
- brcmfmac: Broadcom BCM43454/6

- Mobile:
- iosm: Intel M.2 7360 WWAN card

Drivers
-------

- Convert many drivers to the new phylink API built for split PCS
designs but also simplifying other cases.

- Intel Ethernet NICs:
- add TTY for GNSS module for E810T device
- improve AF_XDP performance
- GTP-C and GTP-U filter offload
- QinQ VLAN support

- Mellanox Ethernet NICs (mlx5):
- support xdp->data_meta
- multi-buffer XDP
- offload tc push_eth and pop_eth actions

- Netronome Ethernet NICs (nfp):
- flow-independent tc action hardware offload (police / meter)
- AF_XDP

- Other Ethernet NICs:
- at803x: fiber and SFP support
- xgmac: mdio: preamble suppression and custom MDC frequencies
- r8169: enable ASPM L1.2 if system vendor flags it as safe
- macb/gem: ZynqMP SGMII
- hns3: add TX push mode
- dpaa2-eth: software TSO
- lan743x: multi-queue, mdio, SGMII, PTP
- axienet: NAPI and GRO support

- Mellanox Ethernet switches (mlxsw):
- source and dest IP address rewrites
- RJ45 ports

- Marvell Ethernet switches (prestera):
- basic routing offload
- multi-chain TC ACL offload

- NXP embedded Ethernet switches (ocelot & felix):
- PTP over UDP with the ocelot-8021q DSA tagging protocol
- basic QoS classification on Felix DSA switch using dcbnl
- port mirroring for ocelot switches

- Microchip high-speed industrial Ethernet (sparx5):
- offloading of bridge port flooding flags
- PTP Hardware Clock

- Other embedded switches:
- lan966x: PTP Hardware Clock
- qca8k: mdio read/write operations via crafted Ethernet packets

- Qualcomm 802.11ax WiFi (ath11k):
- add LDPC FEC type and 802.11ax High Efficiency data in radiotap
- enable RX PPDU stats in monitor co-exist mode

- Intel WiFi (iwlwifi):
- UHB TAS enablement via BIOS
- band disablement via BIOS
- channel switch offload
- 32 Rx AMPDU sessions in newer devices

- MediaTek WiFi (mt76):
- background radar detection
- thermal management improvements on mt7915
- SAR support for more mt76 platforms
- MBSSID and 6 GHz band on mt7915

- RealTek WiFi:
- rtw89: AP mode
- rtw89: 160 MHz channels and 6 GHz band
- rtw89: hardware scan

- Bluetooth:
- mt7921s: wake on Bluetooth, SCO over I2S, wide-band-speed (WBS)

- Microchip CAN (mcp251xfd):
- multiple RX-FIFOs and runtime configurable RX/TX rings
- internal PLL, runtime PM handling simplification
- improve chip detection and error handling after wakeup"

* tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2521 commits)
llc: fix netdevice reference leaks in llc_ui_bind()
drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool
ice: don't allow to run ice_send_event_to_aux() in atomic ctx
ice: fix 'scheduling while atomic' on aux critical err interrupt
net/sched: fix incorrect vlan_push_eth dest field
net: bridge: mst: Restrict info size queries to bridge ports
net: marvell: prestera: add missing destroy_workqueue() in prestera_module_init()
drivers: net: xgene: Fix regression in CRC stripping
net: geneve: add missing netlink policy and size for IFLA_GENEVE_INNER_PROTO_INHERIT
net: dsa: fix missing host-filtered multicast addresses
net/mlx5e: Fix build warning, detected write beyond size of field
iwlwifi: mvm: Don't fail if PPAG isn't supported
selftests/bpf: Fix kprobe_multi test.
Revert "rethook: x86: Add rethook x86 implementation"
Revert "arm64: rethook: Add arm64 rethook implementation"
Revert "powerpc: Add rethook support"
Revert "ARM: rethook: Add rethook arm implementation"
netdevice: add missing dm_private kdoc
net: bridge: mst: prevent NULL deref in br_mst_info_size()
selftests: forwarding: Use same VRF for port and VLAN upper
...


Revision tags: v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26
# 1136fa0c 01-Mar-2022 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v5.17-rc4' into for-linus

Merge with mainline to get the Intel ASoC generic helpers header and
other changes.


Revision tags: v5.15.25
# 986c6f7c 18-Feb-2022 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v5.17-rc4' into next

Sync up with mainline to get the latest changes in HID subsystem.


Revision tags: v5.15.24, v5.15.23
# 4caaf758 08-Feb-2022 Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-speedup-netns-dismantles'

Eric Dumazet says:

====================
net: speedup netns dismantles

From: Eric Dumazet <edumazet@google.com>

In this series, I made network namespace deletions more scalable,
by 4x on the little benchmark described in this cover letter.

- Remove bottleneck on ipv6 addrconf, by replacing a global
hash table to a per netns one.

- Rework many (struct pernet_operations)->exit() handlers to
exit_batch() ones. This removes many rtnl acquisitions,
and gives to cleanup_net() kind of a priority over rtnl
ownership.

Tested on a host with 24 cpus (48 HT)

Test script:

for nr in {1..10}
do
(for i in {1..10000}; do unshare -n /bin/bash -c "ifconfig lo up"; done) &
done
wait

for i in {1..10}
do
sleep 1
echo 3 >/proc/sys/vm/drop_caches
grep net_namespace /proc/slabinfo
done

Before: We can see host struggles to clean the netns, even after there are no new creations.
Memory cost is high, because each netns consumes a good amount of memory.

time ./unshare10.sh
net_namespace 82634 82634 3968 1 1 : tunables 24 12 8 : slabdata 82634 82634 0
net_namespace 82634 82634 3968 1 1 : tunables 24 12 8 : slabdata 82634 82634 0
net_namespace 82634 82634 3968 1 1 : tunables 24 12 8 : slabdata 82634 82634 0
net_namespace 82634 82634 3968 1 1 : tunables 24 12 8 : slabdata 82634 82634 0
net_namespace 82634 82634 3968 1 1 : tunables 24 12 8 : slabdata 82634 82634 0
net_namespace 82634 82634 3968 1 1 : tunables 24 12 8 : slabdata 82634 82634 0
net_namespace 82634 82634 3968 1 1 : tunables 24 12 8 : slabdata 82634 82634 0
net_namespace 82634 82634 3968 1 1 : tunables 24 12 8 : slabdata 82634 82634 0
net_namespace 82634 82634 3968 1 1 : tunables 24 12 8 : slabdata 82634 82634 0
net_namespace 37214 37792 3968 1 1 : tunables 24 12 8 : slabdata 37214 37792 192

real 6m57.766s
user 3m37.277s
sys 40m4.826s

After: We can see the script completes much faster,
the kernel thread doing the cleanup_net() keeps up just fine.
Memory cost is not too big.

time ./unshare10.sh
net_namespace 9945 9945 4096 1 1 : tunables 24 12 8 : slabdata 9945 9945 0
net_namespace 4087 4665 4096 1 1 : tunables 24 12 8 : slabdata 4087 4665 192
net_namespace 4082 4607 4096 1 1 : tunables 24 12 8 : slabdata 4082 4607 192
net_namespace 234 761 4096 1 1 : tunables 24 12 8 : slabdata 234 761 192
net_namespace 224 751 4096 1 1 : tunables 24 12 8 : slabdata 224 751 192
net_namespace 218 745 4096 1 1 : tunables 24 12 8 : slabdata 218 745 192
net_namespace 193 667 4096 1 1 : tunables 24 12 8 : slabdata 193 667 172
net_namespace 167 609 4096 1 1 : tunables 24 12 8 : slabdata 167 609 152
net_namespace 167 609 4096 1 1 : tunables 24 12 8 : slabdata 167 609 152
net_namespace 157 609 4096 1 1 : tunables 24 12 8 : slabdata 157 609 152

real 1m43.876s
user 3m39.728s
sys 7m36.342s
====================

Link: https://lore.kernel.org/r/20220208045038.2635826-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


Revision tags: v5.15.22
# ea3e9166 07-Feb-2022 Eric Dumazet <edumazet@google.com>

ipv6: change fib6_rules_net_exit() to batch mode

cleanup_net() is competing with other rtnl users.

fib6_rules_net_exit() seems a good candidate for exit_batch(),
as this gives chance for cleanup_net() to progress much faster,
holding rtnl a bit longer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
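
The difference between a per-namespace exit handler and a batched one can be sketched by counting lock acquisitions. This is a userspace model only: `lock_sketch()` stands in for rtnl_lock(), and `net_sketch`, `exit_each()` and `exit_batch_sketch()` are hypothetical names, not the kernel's pernet_operations API.

```c
#include <stddef.h>

static unsigned int lock_acquisitions;	/* counts simulated rtnl_lock() calls */

static void lock_sketch(void)   { lock_acquisitions++; }
static void unlock_sketch(void) { }

struct net_sketch { int rules; };

/* Per-namespace style: one lock round trip for every namespace. */
static void exit_each(struct net_sketch *nets, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		lock_sketch();
		nets[i].rules = 0;
		unlock_sketch();
	}
}

/* Batch style: take the lock once and clean up the whole list. */
static void exit_batch_sketch(struct net_sketch *nets, size_t n)
{
	lock_sketch();
	for (size_t i = 0; i < n; i++)
		nets[i].rules = 0;
	unlock_sketch();
}
```

The batch variant holds the lock for longer per call but acquires it once instead of once per namespace, which matches the trade-off the message describes.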


# c3e676b9 07-Feb-2022 Jakub Kicinski <kuba@kernel.org>

Merge branch 'inet-separate-dscp-from-ecn-bits-using-new-dscp_t-type'

Guillaume Nault says:

====================
inet: Separate DSCP from ECN bits using new dscp_t type

The networking stack currently doesn't clearly distinguish between DSCP
and ECN bits. The entire DSCP+ECN bits are stored in u8 variables (or
structure fields), and each part of the stack handles them in their own
way, using different macros. This has created several bugs in the past
and some uncommon code paths are still unfixed.

Such bugs generally manifest by selecting invalid routes because of ECN
bits interfering with FIB routes and rules lookups (more details in the
LPC 2021 talk[1] and in the RFC of this series[2]).

This patch series aims at preventing the introduction of such bugs (and
detecting existing ones), by introducing a dscp_t type, representing
"sanitised" DSCP values (that is, with no ECN information), as opposed
to plain u8 values that contain both DSCP and ECN information. dscp_t
makes it clear for the reader what we're working on, and Sparse can
flag invalid interactions between dscp_t and plain u8.

This series converts only a few variables and structures:

* Patch 1 converts the tclass field of struct fib6_rule. It
effectively forbids the use of ECN bits in the tos/dsfield option
of ip -6 rule. Rules now match packets solely based on their DSCP
bits, so ECN doesn't influence the result any more. This contrasts
with the previous behaviour where all 8 bits of the Traffic Class
field were used. It is believed that this change is acceptable as
matching ECN bits wasn't usable for IPv4, so only IPv6-only
deployments could be depending on it. Also the previous behaviour
made DSCP-based ip6-rules fail for packets with both a DSCP and an
ECN mark, which is another reason why any such deploy is unlikely.

* Patch 2 converts the tos field of struct fib4_rule. This one too
effectively forbids defining ECN bits, this time in ip -4 rule.
Before that, setting ECN bit 1 was accepted, while ECN bit 0 was
rejected. But even when accepted, the rule would never match, as
the packets would have their ECN bits cleared before doing the
rule lookup.

* Patch 3 converts the fc_tos field of struct fib_config. This is
equivalent to patch 2, but for IPv4 routes. Routes using a
tos/dsfield option with any ECN bit set are now rejected. Before
this patch, they were accepted but, as with ip4 rules, these routes
couldn't match any packet, since their ECN bits are cleared before
the lookup.

* Patch 4 converts the fa_tos field of struct fib_alias. This one is
pure internal u8 to dscp_t conversion. While patches 1-3 had user
facing consequences, this patch shouldn't have any side effect and
is there to give an overview of what future conversion patches will
look like. Conversions are quite mechanical, but imply some code
churn, which is the price for the extra clarity and possibility of
type checking.

To summarise, all the behaviour changes required for the dscp_t type
approach to work should be contained in patches 1-3. These changes are
edge cases of ip-route and ip-rule that don't currently work properly.
So they should be safe. Also, a kernel selftest is added for each of
them.

Finally, this work also paves the way for allowing the usage of the 3
high order DSCP bits in IPv4 (a few call paths already handle them, but
in general the stack clears them before IPv4 rule and route lookups).

References:
[1] LPC 2021 talk:
- https://linuxplumbersconf.org/event/11/contributions/943/
- Direct link to slide deck:
https://linuxplumbersconf.org/event/11/contributions/943/attachments/901/1780/inet_tos_lpc2021.pdf
[2] RFC version of this series:
- https://lore.kernel.org/netdev/cover.1638814614.git.gnault@redhat.com/
====================

Link: https://lore.kernel.org/r/cover.1643981839.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


Revision tags: v5.15.21, v5.15.20
# a410a0cf 04-Feb-2022 Guillaume Nault <gnault@redhat.com>

ipv6: Define dscp_t and stop taking ECN bits into account in fib6-rules

Define a dscp_t type and its appropriate helpers that ensure ECN bits
are not taken into account when handling DSCP.

Use this new type to replace the tclass field of struct fib6_rule, so
that fib6-rules don't get influenced by ECN bits anymore.

Before this patch, fib6-rules didn't make any distinction between the
DSCP and ECN bits. Therefore, rules specifying a DSCP (tos or dsfield
options in iproute2) stopped working as soon as a packet had at least one
of its ECN bits set (as a work around one could create four rules for
each DSCP value to match, one for each possible ECN value).

After this patch fib6-rules only compare the DSCP bits. ECN doesn't
influence the result anymore. Also, fib6-rules now must have the ECN
bits cleared or they will be rejected.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
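
The behaviour described in this message can be sketched in userspace as follows. The sketch assumes the standard layout of the Traffic Class byte (six DSCP bits above two ECN bits). It wraps the value in a struct to get compiler type checking, which is a substitute for the Sparse-checked type the message refers to, and every name ending in `_sketch`, along with `dsfield_to_dscp()`, `rule_dsfield_ok()` and `rule_matches()`, is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define DSCP_MASK_SKETCH 0xfc	/* upper six bits of the Traffic Class byte */

/* Wrapping the value in a struct keeps it from mixing silently with plain u8. */
typedef struct { uint8_t v; } dscp_sketch_t;

/* Drop the two ECN bits when converting a raw dsfield. */
static dscp_sketch_t dsfield_to_dscp(uint8_t dsfield)
{
	dscp_sketch_t d = { (uint8_t)(dsfield & DSCP_MASK_SKETCH) };
	return d;
}

/* A rule is acceptable only if its configured dsfield has no ECN bits set. */
static bool rule_dsfield_ok(uint8_t dsfield)
{
	return (dsfield & ~DSCP_MASK_SKETCH) == 0;
}

/* Match on DSCP alone, so the packet's ECN marking no longer matters. */
static bool rule_matches(dscp_sketch_t rule, uint8_t pkt_tclass)
{
	return rule.v == dsfield_to_dscp(pkt_tclass).v;
}
```

With this comparison a single rule covers all four ECN values of a packet, so the four-rules-per-DSCP workaround mentioned above is no longer needed.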


# 542898c5 07-Feb-2022 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Merge remote-tracking branch 'drm/drm-next' into drm-misc-next

First backmerge into drm-misc-next. Required for more helpers backmerged,
and to pull in 5.17 (rc2).

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


# 7e6a6b40 04-Feb-2022 Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'kvmarm-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 5.17, take #2

- A couple of fixes when handling an exception while a SError has been
delivered

- Workaround for Cortex-A510's single-step erratum
