Revision tags: v6.6.67, v6.6.66, v6.6.65, v6.6.64, v6.6.63, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58 |
|
#
7b7fd0ac |
| 17-Oct-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.57' into for/openbmc/dev-6.6
This is the 6.6.57 stable release
|
Revision tags: v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51 |
|
#
c43622d6 |
| 10-Sep-2024 |
Tao Chen <chen.dylane@gmail.com> |
bpf: Check percpu map value size first
[ Upstream commit 1d244784be6b01162b732a5a7d637dfc024c3203 ]
Percpu map is often used, but the map value size limit often ignored, like issue: https://github.
bpf: Check percpu map value size first
[ Upstream commit 1d244784be6b01162b732a5a7d637dfc024c3203 ]
Percpu map is often used, but the map value size limit often ignored, like issue: https://github.com/iovisor/bcc/issues/2519. Actually, percpu map value size is bound by PCPU_MIN_UNIT_SIZE, so we can check the value size whether it exceeds PCPU_MIN_UNIT_SIZE first, like percpu map of local_storage. Maybe the error message seems clearer compared with "cannot allocate memory".
Signed-off-by: Jinke Han <jinkehan@didiglobal.com> Signed-off-by: Tao Chen <chen.dylane@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240910144111.1464912-2-chen.dylane@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23 |
|
#
87832e93 |
| 10-Feb-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.16' into dev-6.6
This is the 6.6.16 stable release
|
#
1188f7f1 |
| 10-Feb-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.14' into dev-6.6
This is the 6.6.14 stable release
|
#
3e7759b9 |
| 10-Feb-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.9' into dev-6.6
This is the 6.6.9 stable release
|
Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5 |
|
#
a9bf3a49 |
| 04-Dec-2023 |
Hou Tao <houtao1@huawei.com> |
bpf: Set need_defer as false when clearing fd array during map free
[ Upstream commit 79d93b3c6ffd79abcd8e43345980aa1e904879c4 ]
Both map deletion operation, map release and map free operation use
bpf: Set need_defer as false when clearing fd array during map free
[ Upstream commit 79d93b3c6ffd79abcd8e43345980aa1e904879c4 ]
Both map deletion operation, map release and map free operation use fd_array_map_delete_elem() to remove the element from fd array and need_defer is always true in fd_array_map_delete_elem(). For the map deletion operation and map release operation, need_defer=true is necessary, because the bpf program, which accesses the element in fd array, may still alive. However for map free operation, it is certain that the bpf program which owns the fd array has already been exited, so setting need_defer as false is appropriate for map free operation.
So fix it by adding need_defer parameter to bpf_fd_array_map_clear() and adding a new helper __fd_array_map_delete_elem() to handle the map deletion, map release and map free operations correspondingly.
Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20231204140425.1480317-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
1c40ec6b |
| 04-Dec-2023 |
Hou Tao <houtao1@huawei.com> |
bpf: Add map and need_defer parameters to .map_fd_put_ptr()
[ Upstream commit 20c20bd11a0702ce4dc9300c3da58acf551d9725 ]
map is the pointer of outer map, and need_defer needs some explanation. need
bpf: Add map and need_defer parameters to .map_fd_put_ptr()
[ Upstream commit 20c20bd11a0702ce4dc9300c3da58acf551d9725 ]
map is the pointer of outer map, and need_defer needs some explanation. need_defer tells the implementation to defer the reference release of the passed element and ensure that the element is still alive before the bpf program, which may manipulate it, exits.
The following three cases will invoke map_fd_put_ptr() and different need_defer values will be passed to these callers:
1) release the reference of the old element in the map during map update or map deletion. The release must be deferred, otherwise the bpf program may incur use-after-free problem, so need_defer needs to be true. 2) release the reference of the to-be-added element in the error path of map update. The to-be-added element is not visible to any bpf program, so it is OK to pass false for need_defer parameter. 3) release the references of all elements in the map during map release. Any bpf program which has access to the map must have been exited and released, so need_defer=false will be OK.
These two parameters will be used by the following patches to fix the potential use-after-free problem for map-in-map.
Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20231204140425.1480317-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: 876673364161 ("bpf: Defer the free of inner map when necessary") Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
f64b2dc8 |
| 06-Dec-2023 |
Jiri Olsa <jolsa@kernel.org> |
bpf: Fix prog_array_map_poke_run map poke update
commit 4b7de801606e504e69689df71475d27e35336fb3 upstream.
Lee pointed out issue found by syscaller [0] hitting BUG in prog array map poke update in
bpf: Fix prog_array_map_poke_run map poke update
commit 4b7de801606e504e69689df71475d27e35336fb3 upstream.
Lee pointed out issue found by syscaller [0] hitting BUG in prog array map poke update in prog_array_map_poke_run function due to error value returned from bpf_arch_text_poke function.
There's race window where bpf_arch_text_poke can fail due to missing bpf program kallsym symbols, which is accounted for with check for -EINVAL in that BUG_ON call.
The problem is that in such case we won't update the tail call jump and cause imbalance for the next tail call update check which will fail with -EBUSY in bpf_arch_text_poke.
I'm hitting following race during the program load:
CPU 0 CPU 1
bpf_prog_load bpf_check do_misc_fixups prog_array_map_poke_track
map_update_elem bpf_fd_array_map_update_elem prog_array_map_poke_run
bpf_arch_text_poke returns -EINVAL
bpf_prog_kallsyms_add
After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next poke update fails on expected jump instruction check in bpf_arch_text_poke with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run.
Similar race exists on the program unload.
Fixing this by moving the update to bpf_arch_poke_desc_update function which makes sure we call __bpf_arch_text_poke that skips the bpf address check.
Each architecture has slightly different approach wrt looking up bpf address in bpf_arch_text_poke, so instead of splitting the function or adding new 'checkip' argument in previous version, it seems best to move the whole map_poke_run update as arch specific code.
[0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Cc: Lee Jones <lee@kernel.org> Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20231206083041.1306660-2-jolsa@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1 |
|
#
1ac731c5 |
| 30-Aug-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.6 merge window.
|
Revision tags: v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39 |
|
#
50501936 |
| 17-Jul-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.4' into next
Sync up with mainline to bring in updates to shared infrastructure.
|
Revision tags: v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35 |
|
#
db6da59c |
| 15-Jun-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next-fixes
Backmerging to sync drm-misc-next-fixes with drm-misc-next.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.1.34 |
|
#
03c60192 |
| 12-Jun-2023 |
Dmitry Baryshkov <dmitry.baryshkov@linaro.org> |
Merge branch 'drm-next' of git://anongit.freedesktop.org/drm/drm into msm-next-lumag-base
Merge the drm-next tree to pick up the DRM DSC helpers (merged via drm-intel-next tree). MSM DSC v1.2 patche
Merge branch 'drm-next' of git://anongit.freedesktop.org/drm/drm into msm-next-lumag-base
Merge the drm-next tree to pick up the DRM DSC helpers (merged via drm-intel-next tree). MSM DSC v1.2 patches depend on these helpers.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
show more ...
|
Revision tags: v6.1.33 |
|
#
5c680050 |
| 06-Jun-2023 |
Miquel Raynal <miquel.raynal@bootlin.com> |
Merge tag 'v6.4-rc4' into wpan-next/staging
Linux 6.4-rc4
|
#
9ff17e6b |
| 05-Jun-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
Merge drm/drm-next into drm-intel-gt-next
For conflict avoidance we need the following commit:
c9a9f18d3ad8 drm/i915/huc: use const struct bus_type pointers
Signed-off-by: Tvrtko Ursulin <tvrtko
Merge drm/drm-next into drm-intel-gt-next
For conflict avoidance we need the following commit:
c9a9f18d3ad8 drm/i915/huc: use const struct bus_type pointers
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
show more ...
|
Revision tags: v6.1.32, v6.1.31, v6.1.30 |
|
#
9c3a985f |
| 17-May-2023 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Backmerge to get some hwmon dependencies.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
Revision tags: v6.1.29 |
|
#
50282fd5 |
| 12-May-2023 |
Maxime Ripard <maxime@cerno.tech> |
Merge drm/drm-fixes into drm-misc-fixes
Let's bring 6.4-rc1 in drm-misc-fixes to start the new fix cycle.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
Revision tags: v6.1.28 |
|
#
ff32fcca |
| 09-May-2023 |
Maxime Ripard <maxime@cerno.tech> |
Merge drm/drm-next into drm-misc-next
Start the 6.5 release cycle.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
Revision tags: v6.1.27 |
|
#
6e98b09d |
| 26-Apr-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "Core:
- Introduce a config option to tweak MAX_SKB_FRAGS. In
Merge tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "Core:
- Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the default value allows for better BIG TCP performances
- Reduce compound page head access for zero-copy data transfers
- RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when possible
- Threaded NAPI improvements, adding defer skb free support and unneeded softirq avoidance
- Address dst_entry reference count scalability issues, via false sharing avoidance and optimize refcount tracking
- Add lockless accesses annotation to sk_err[_soft]
- Optimize again the skb struct layout
- Extends the skb drop reasons to make it usable by multiple subsystems
- Better const qualifier awareness for socket casts
BPF:
- Add skb and XDP typed dynptrs which allow BPF programs for more ergonomic and less brittle iteration through data and variable-sized accesses
- Add a new BPF netfilter program type and minimal support to hook BPF programs to netfilter hooks such as prerouting or forward
- Add more precise memory usage reporting for all BPF map types
- Adds support for using {FOU,GUE} encap with an ipip device operating in collect_md mode and add a set of BPF kfuncs for controlling encap params
- Allow BPF programs to detect at load time whether a particular kfunc exists or not, and also add support for this in light skeleton
- Bigger batch of BPF verifier improvements to prepare for upcoming BPF open-coded iterators allowing for less restrictive looping capabilities
- Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF programs to NULL-check before passing such pointers into kfunc
- Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in local storage maps
- Enable RCU semantics for task BPF kptrs and allow referenced kptr tasks to be stored in BPF maps
- Add support for refcounted local kptrs to the verifier for allowing shared ownership, useful for adding a node to both the BPF list and rbtree
- Add BPF verifier support for ST instructions in convert_ctx_access() which will help new -mcpu=v4 clang flag to start emitting them
- Add ARM32 USDT support to libbpf
- Improve bpftool's visual program dump which produces the control flow graph in a DOT format by adding C source inline annotations
Protocols:
- IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value indicates the provenance of the IP address
- IPv6: optimize route lookup, dropping unneeded R/W lock acquisition
- Add the handshake upcall mechanism, allowing the user-space to implement generic TLS handshake on kernel's behalf
- Bridge: support per-{Port, VLAN} neighbor suppression, increasing resilience to nodes failures
- SCTP: add support for Fair Capacity and Weighted Fair Queueing schedulers
- MPTCP: delay first subflow allocation up to its first usage. This will allow for later better LSM interaction
- xfrm: Remove inner/outer modes from input/output path. These are not needed anymore
- WiFi: - reduced neighbor report (RNR) handling for AP mode - HW timestamping support - support for randomized auth/deauth TA for PASN privacy - per-link debugfs for multi-link - TC offload support for mac80211 drivers - mac80211 mesh fast-xmit and fast-rx support - enable Wi-Fi 7 (EHT) mesh support
Netfilter:
- Add nf_tables 'brouting' support, to force a packet to be routed instead of being bridged
- Update bridge netfilter and ovs conntrack helpers to handle IPv6 Jumbo packets properly, i.e. fetch the packet length from hop-by-hop extension header. This is needed for BIT TCP support
- The iptables 32bit compat interface isn't compiled in by default anymore
- Move ip(6)tables builtin icmp matches to the udptcp one. This has the advantage that icmp/icmpv6 match doesn't load the iptables/ip6tables modules anymore when iptables-nft is used
- Extended netlink error report for netdevice in flowtables and netdev/chains. Allow for incrementally add/delete devices to netdev basechain. Allow to create netdev chain without device
Driver API:
- Remove redundant Device Control Error Reporting Enable, as PCI core has already error reporting enabled at enumeration time
- Move Multicast DB netlink handlers to core, allowing devices other then bridge to use them
- Allow the page_pool to directly recycle the pages from safely localized NAPI
- Implement lockless TX queue stop/wake combo macros, allowing for further code de-duplication and sanitization
- Add YNL support for user headers and struct attrs
- Add partial YNL specification for devlink
- Add partial YNL specification for ethtool
- Add tc-mqprio and tc-taprio support for preemptible traffic classes
- Add tx push buf len param to ethtool, specifies the maximum number of bytes of a transmitted packet a driver can push directly to the underlying device
- Add basic LED support for switch/phy
- Add NAPI documentation, stop relaying on external links
- Convert dsa_master_ioctl() to netdev notifier. This is a preparatory work to make the hardware timestamping layer selectable by user space
- Add transceiver support and improve the error messages for CAN-FD controllers
New hardware / drivers:
- Ethernet: - AMD/Pensando core device support - MediaTek MT7981 SoC - MediaTek MT7988 SoC - Broadcom BCM53134 embedded switch - Texas Instruments CPSW9G ethernet switch - Qualcomm EMAC3 DWMAC ethernet - StarFive JH7110 SoC - NXP CBTX ethernet PHY
- WiFi: - Apple M1 Pro/Max devices - RealTek rtl8710bu/rtl8188gu - RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset
- Bluetooth: - Realtek RTL8821CS, RTL8851B, RTL8852BS - Mediatek MT7663, MT7922 - NXP w8997 - Actions Semi ATS2851 - QTI WCN6855 - Marvell 88W8997
- Can: - STMicroelectronics bxcan stm32f429
Drivers:
- Ethernet NICs: - Intel (1G, icg): - add tracking and reporting of QBV config errors - add support for configuring max SDU for each Tx queue - Intel (100G, ice): - refactor mailbox overflow detection to support Scalable IOV - GNSS interface optimization - Intel (i40e): - support XDP multi-buffer - nVidia/Mellanox: - add the support for linux bridge multicast offload - enable TC offload for egress and engress MACVLAN over bond - add support for VxLAN GBP encap/decap flows offload - extend packet offload to fully support libreswan - support tunnel mode in mlx5 IPsec packet offload - extend XDP multi-buffer support - support MACsec VLAN offload - add support for dynamic msix vectors allocation - drop RX page_cache and fully use page_pool - implement thermal zone to report NIC temperature - Netronome/Corigine: - add support for multi-zone conntrack offload - Solarflare/Xilinx: - support offloading TC VLAN push/pop actions to the MAE - support TC decap rules - support unicast PTP
- Other NICs: - Broadcom (bnxt): enforce software based freq adjustments only on shared PHC NIC - RealTek (r8169): refactor to addess ASPM issues during NAPI poll - Micrel (lan8841): add support for PTP_PF_PEROUT - Cadence (macb): enable PTP unicast - Engleder (tsnep): add XDP socket zero-copy support - virtio-net: implement exact header length guest feature - veth: add page_pool support for page recycling - vxlan: add MDB data path support - gve: add XDP support for GQI-QPL format - geneve: accept every ethertype - macvlan: allow some packets to bypass broadcast queue - mana: add support for jumbo frame
- Ethernet high-speed switches: - Microchip (sparx5): Add support for TC flower templates
- Ethernet embedded switches: - Broadcom (b54): - configure 6318 and 63268 RGMII ports - Marvell (mv88e6xxx): - faster C45 bus scan - Microchip: - lan966x: - add support for IS1 VCAP - better TX/RX from/to CPU performances - ksz9477: add ETS Qdisc support - ksz8: enhance static MAC table operations and error handling - sama7g5: add PTP capability - NXP (ocelot): - add support for external ports - add support for preemptible traffic classes - Texas Instruments: - add CPSWxG SGMII support for J7200 and J721E
- Intel WiFi (iwlwifi): - preparation for Wi-Fi 7 EHT and multi-link support - EHT (Wi-Fi 7) sniffer support - hardware timestamping support for some devices/firwmares - TX beacon protection on newer hardware
- Qualcomm 802.11ax WiFi (ath11k): - MU-MIMO parameters support - ack signal support for management packets
- RealTek WiFi (rtw88): - SDIO bus support - better support for some SDIO devices (e.g. MAC address from efuse)
- RealTek WiFi (rtw89): - HW scan support for 8852b - better support for 6 GHz scanning - support for various newer firmware APIs - framework firmware backwards compatibility
- MediaTek WiFi (mt76): - P2P support - mesh A-MSDU support - EHT (Wi-Fi 7) support - coredump support"
* tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2078 commits) net: phy: hide the PHYLIB_LEDS knob net: phy: marvell-88x2222: remove unnecessary (void*) conversions tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp. net: amd: Fix link leak when verifying config failed net: phy: marvell: Fix inconsistent indenting in led_blink_set lan966x: Don't use xdp_frame when action is XDP_TX tsnep: Add XDP socket zero-copy TX support tsnep: Add XDP socket zero-copy RX support tsnep: Move skb receive action to separate function tsnep: Add functions for queue enable/disable tsnep: Rework TX/RX queue initialization tsnep: Replace modulo operation with mask net: phy: dp83867: Add led_brightness_set support net: phy: Fix reading LED reg property drivers: nfc: nfcsim: remove return value check of `dev_dir` net: phy: dp83867: Remove unnecessary (void*) conversions net: ethtool: coalesce: try to make user settings stick twice net: mana: Check if netdev/napi_alloc_frag returns single page net: mana: Rename mana_refill_rxoob and remove some empty lines net: veth: add page_pool stats ...
show more ...
|
Revision tags: v6.1.26, v6.3, v6.1.25 |
|
#
c2865b11 |
| 13-Apr-2023 |
Jakub Kicinski <kuba@kernel.org> |
Daniel Borkmann says:
==================== pull-request: bpf-next 2023-04-13
We've added 260 non-merge commits during the last 36 day(s) which contain a total of 356 files changed, 21786 insertions
Daniel Borkmann says:
==================== pull-request: bpf-next 2023-04-13
We've added 260 non-merge commits during the last 36 day(s) which contain a total of 356 files changed, 21786 insertions(+), 11275 deletions(-).
The main changes are:
1) Rework BPF verifier log behavior and implement it as a rotating log by default with the option to retain old-style fixed log behavior, from Andrii Nakryiko.
2) Adds support for using {FOU,GUE} encap with an ipip device operating in collect_md mode and add a set of BPF kfuncs for controlling encap params, from Christian Ehrig.
3) Allow BPF programs to detect at load time whether a particular kfunc exists or not, and also add support for this in light skeleton, from Alexei Starovoitov.
4) Optimize hashmap lookups when key size is multiple of 4, from Anton Protopopov.
5) Enable RCU semantics for task BPF kptrs and allow referenced kptr tasks to be stored in BPF maps, from David Vernet.
6) Add support for stashing local BPF kptr into a map value via bpf_kptr_xchg(). This is useful e.g. for rbtree node creation for new cgroups, from Dave Marchevsky.
7) Fix BTF handling of is_int_ptr to skip modifiers to work around tracing issues where a program cannot be attached, from Feng Zhou.
8) Migrate a big portion of test_verifier unit tests over to test_progs -a verifier_* via inline asm to ease {read,debug}ability, from Eduard Zingerman.
9) Several updates to the instruction-set.rst documentation which is subject to future IETF standardization (https://lwn.net/Articles/926882/), from Dave Thaler.
10) Fix BPF verifier in the __reg_bound_offset's 64->32 tnum sub-register known bits information propagation, from Daniel Borkmann.
11) Add skb bitfield compaction work related to BPF with the overall goal to make more of the sk_buff bits optional, from Jakub Kicinski.
12) BPF selftest cleanups for build id extraction which stand on its own from the upcoming integration work of build id into struct file object, from Jiri Olsa.
13) Add fixes and optimizations for xsk descriptor validation and several selftest improvements for xsk sockets, from Kal Conley.
14) Add BPF links for struct_ops and enable switching implementations of BPF TCP cong-ctls under a given name by replacing backing struct_ops map, from Kui-Feng Lee.
15) Remove a misleading BPF verifier env->bypass_spec_v1 check on variable offset stack read as earlier Spectre checks cover this, from Luis Gerhorst.
16) Fix issues in copy_from_user_nofault() for BPF and other tracers to resemble copy_from_user_nmi() from safety PoV, from Florian Lehner and Alexei Starovoitov.
17) Add --json-summary option to test_progs in order for CI tooling to ease parsing of test results, from Manu Bretelle.
18) Batch of improvements and refactoring to prep for upcoming bpf_local_storage conversion to bpf_mem_cache_{alloc,free} allocator, from Martin KaFai Lau.
19) Improve bpftool's visual program dump which produces the control flow graph in a DOT format by adding C source inline annotations, from Quentin Monnet.
20) Fix attaching fentry/fexit/fmod_ret/lsm to modules by extracting the module name from BTF of the target and searching kallsyms of the correct module, from Viktor Malik.
21) Improve BPF verifier handling of '<const> <cond> <non_const>' to better detect whether in particular jmp32 branches are taken, from Yonghong Song.
22) Allow BPF TCP cong-ctls to write app_limited of struct tcp_sock. A built-in cc or one from a kernel module is already able to write to app_limited, from Yixin Shen.
Conflicts:
Documentation/bpf/bpf_devel_QA.rst b7abcd9c656b ("bpf, doc: Link to submitting-patches.rst for general patch submission info") 0f10f647f455 ("bpf, docs: Use internal linking for link to netdev subsystem doc") https://lore.kernel.org/all/20230307095812.236eb1be@canb.auug.org.au/
include/net/ip_tunnels.h bc9d003dc48c3 ("ip_tunnel: Preserve pointer const in ip_tunnel_info_opts") ac931d4cdec3d ("ipip,ip_tunnel,sit: Add FOU support for externally controlled ipip devices") https://lore.kernel.org/all/20230413161235.4093777-1-broonie@kernel.org/
net/bpf/test_run.c e5995bc7e2ba ("bpf, test_run: fix crashes due to XDP frame overwriting/corruption") 294635a8165a ("bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES") https://lore.kernel.org/all/20230320102619.05b80a98@canb.auug.org.au/ ====================
Link: https://lore.kernel.org/r/20230413191525.7295-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
Revision tags: v6.1.24, v6.1.23, v6.1.22 |
|
#
02adf9e9 |
| 22-Mar-2023 |
Alexei Starovoitov <ast@kernel.org> |
Merge branch 'error checking where helpers call bpf_map_ops'
JP Kobryn says:
====================
Within bpf programs, the bpf helper functions can make inline calls to kernel functions. In this s
Merge branch 'error checking where helpers call bpf_map_ops'
JP Kobryn says:
====================
Within bpf programs, the bpf helper functions can make inline calls to kernel functions. In this scenario there can be a disconnect between the register the kernel function writes a return value to and the register the bpf program uses to evaluate that return value.
As an example, this bpf code:
long err = bpf_map_update_elem(...); if (err && err != -EEXIST) // got some error other than -EEXIST
...can result in the bpf assembly:
; err = bpf_map_update_elem(&mymap, &key, &val, BPF_NOEXIST); 37: movabs $0xffff976a10730400,%rdi 41: mov $0x1,%ecx 46: call 0xffffffffe103291c ; htab_map_update_elem ; if (err && err != -EEXIST) { 4b: cmp $0xffffffffffffffef,%rax ; cmp -EEXIST,%rax 4f: je 0x000000000000008e 51: test %rax,%rax 54: je 0x000000000000008e
The compare operation here evaluates %rax, while in the preceding call to htab_map_update_elem the corresponding assembly returns -EEXIST via %eax (the lower 32 bits of %rax):
movl $0xffffffef, %r9d ... movl %r9d, %eax
...since it's returning int (32-bit). So the resulting comparison becomes:
cmp $0xffffffffffffffef, $0x00000000ffffffef
...making it not possible to check for negative errors or specific errors, since the sign value is left at the 32nd bit. It means in the original example, the conditional branch will be entered even when the error is -EEXIST, which was not intended.
The selftests added cover these cases for the different bpf_map_ops functions. When the second patch is applied, changing the return type of those functions to long, the comparison works as intended and the tests pass. ====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
show more ...
|
#
d7ba4cc9 |
| 22-Mar-2023 |
JP Kobryn <inwardvessel@gmail.com> |
bpf: return long from bpf_map_ops funcs
This patch changes the return types of bpf_map_ops functions to long, where previously int was returned. Using long allows for bpf programs to maintain the si
bpf: return long from bpf_map_ops funcs
This patch changes the return types of bpf_map_ops functions to long, where previously int was returned. Using long allows for bpf programs to maintain the sign bit in the absence of sign extension during situations where inlined bpf helper funcs make calls to the bpf_map_ops funcs and a negative error is returned.
The definitions of the helper funcs are generated from comments in the bpf uapi header at `include/uapi/linux/bpf.h`. The return type of these helpers was previously changed from int to long in commit bdb7b79b4ce8. For any case where one of the map helpers call the bpf_map_ops funcs that are still returning 32-bit int, a compiler might not include sign extension instructions to properly convert the 32-bit negative value a 64-bit negative value.
For example: bpf assembly excerpt of an inlined helper calling a kernel function and checking for a specific error:
; err = bpf_map_update_elem(&mymap, &key, &val, BPF_NOEXIST); ... 46: call 0xffffffffe103291c ; htab_map_update_elem ; if (err && err != -EEXIST) { 4b: cmp $0xffffffffffffffef,%rax ; cmp -EEXIST,%rax
kernel function assembly excerpt of return value from `htab_map_update_elem` returning 32-bit int:
movl $0xffffffef, %r9d ... movl %r9d, %eax
...results in the comparison: cmp $0xffffffffffffffef, $0x00000000ffffffef
Fixes: bdb7b79b4ce8 ("bpf: Switch most helper return values from 32-bit int to 64-bit long") Tested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: JP Kobryn <inwardvessel@gmail.com> Link: https://lore.kernel.org/r/20230322194754.185781-3-inwardvessel@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
show more ...
|
Revision tags: v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16 |
|
#
ed69e066 |
| 08-Mar-2023 |
Jakub Kicinski <kuba@kernel.org> |
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says:
==================== pull-request: bpf-next 2023-03-08
We've added 23 non-merge commits during the last 2 d
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says:
==================== pull-request: bpf-next 2023-03-08
We've added 23 non-merge commits during the last 2 day(s) which contain a total of 28 files changed, 414 insertions(+), 104 deletions(-).
The main changes are:
1) Add more precise memory usage reporting for all BPF map types, from Yafang Shao.
2) Add ARM32 USDT support to libbpf, from Puranjay Mohan.
3) Fix BTF_ID_LIST size causing problems in !CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor.
4) IMA selftests fix, from Roberto Sassu.
5) libbpf fix in APK support code, from Daniel Müller.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits) selftests/bpf: Fix IMA test libbpf: USDT arm arg parsing support libbpf: Refactor parse_usdt_arg() to re-use code libbpf: Fix theoretical u32 underflow in find_cd() function bpf: enforce all maps having memory usage callback bpf: offload map memory usage bpf, net: xskmap memory usage bpf, net: sock_map memory usage bpf, net: bpf_local_storage memory usage bpf: local_storage memory usage bpf: bpf_struct_ops memory usage bpf: queue_stack_maps memory usage bpf: devmap memory usage bpf: cpumap memory usage bpf: bloom_filter memory usage bpf: ringbuf memory usage bpf: reuseport_array memory usage bpf: stackmap memory usage bpf: arraymap memory usage bpf: hashtab memory usage ... ====================
Link: https://lore.kernel.org/r/20230308193533.1671597-1-andrii@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
a73dc912 |
| 07-Mar-2023 |
Alexei Starovoitov <ast@kernel.org> |
Merge branch 'bpf: bpf memory usage'
Yafang Shao says:
====================
Currently we can't get bpf memory usage reliably either from memcg or from bpftool.
In memcg, there's not a 'bpf' item
Merge branch 'bpf: bpf memory usage'
Yafang Shao says:
====================
Currently we can't get bpf memory usage reliably either from memcg or from bpftool.
In memcg, there's not a 'bpf' item in memory.stat, but only 'kernel', 'sock', 'vmalloc' and 'percpu' which may related to bpf memory. With these items we still can't get the bpf memory usage, because bpf memory usage may far less than the kmem in a memcg, for example, the dentry may consume lots of kmem.
bpftool now shows the bpf memory footprint, which is difference with bpf memory usage. The difference can be quite great in some cases, for example,
- non-preallocated bpf map The non-preallocated bpf map memory usage is dynamically changed. The allocated elements count can be from 0 to the max entries. But the memory footprint in bpftool only shows a fixed number.
- bpf metadata consumes more memory than bpf element In some corner cases, the bpf metadata can consumes a lot more memory than bpf element consumes. For example, it can happen when the element size is quite small.
- some maps don't have key, value or max_entries For example the key_size and value_size of ringbuf is 0, so its memlock is always 0.
We need a way to show the bpf memory usage especially there will be more and more bpf programs running on the production environment and thus the bpf memory usage is not trivial.
This patchset introduces a new map ops ->map_mem_usage to calculate the memory usage. Note that we don't intend to make the memory usage 100% accurate, while our goal is to make sure there is only a small difference between what bpftool reports and the real memory. That small difference can be ignored compared to the total usage. That is enough to monitor the bpf memory usage. For example, the user can rely on this value to monitor the trend of bpf memory usage, compare the difference in bpf memory usage between different bpf program versions, figure out which maps consume large memory, and etc.
This patchset implements the bpf memory usage for all maps, and yet there's still work to do. We don't want to introduce runtime overhead in the element update and delete path, but we have to do it for some non-preallocated maps, - devmap, xskmap When we update or delete an element, it will allocate or free memory. In order to track this dynamic memory, we have to track the count in element update and delete path.
- cpumap The element size of each cpumap element is not determinated. If we want to track the usage, we have to count the size of all elements in the element update and delete path. So I just put it aside currently.
- local_storage, bpf_local_storage When we attach or detach a cgroup, it will allocate or free memory. If we want to track the dynamic memory, we also need to do something in the update and delete path. So I just put it aside currently.
- offload map The element update and delete of offload map is via the netdev dev_ops, in which it may dynamically allocate or free memory, but this dynamic memory isn't counted in offload map memory usage currently.
The result of each map can be found in the individual patch.
We may also need to track per-container bpf memory usage, that will be addressed by a different patchset.
Changes: v3->v4: code improvement on ringbuf (Andrii) use READ_ONCE() to read lpm_trie (Tao) explain why we can't get bpf memory usage from memcg. v2->v3: check callback at map creation time and avoid warning (Alexei) fix build error under CONFIG_BPF=n (lkp@intel.com) v1->v2: calculate the memory usage within bpf (Alexei) - [v1] bpf, mm: bpf memory usage https://lwn.net/Articles/921991/ - [RFC PATCH v2] mm, bpf: Add BPF into /proc/meminfo https://lwn.net/Articles/919848/ - [RFC PATCH v1] mm, bpf: Add BPF into /proc/meminfo https://lwn.net/Articles/917647/ - [RFC PATCH] bpf, mm: Add a new item bpf into memory.stat https://lore.kernel.org/bpf/20220921170002.29557-1-laoar.shao@gmail].com/ ====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
show more ...
|
#
1746d055 |
| 05-Mar-2023 |
Yafang Shao <laoar.shao@gmail.com> |
bpf: arraymap memory usage
Introduce array_map_mem_usage() to calculate arraymap memory usage. In this helper, some small memory allocations are ignored, like the allocation of struct bpf_array_aux
bpf: arraymap memory usage
Introduce array_map_mem_usage() to calculate arraymap memory usage. In this helper, some small memory allocations are ignored, like the allocation of struct bpf_array_aux in prog_array. The inner_map_meta in array_of_map is also ignored.
The result as follows,
- before 11: array name count_map flags 0x0 key 4B value 4B max_entries 65536 memlock 524288B 12: percpu_array name count_map flags 0x0 key 4B value 4B max_entries 65536 memlock 8912896B 13: perf_event_array name count_map flags 0x0 key 4B value 4B max_entries 65536 memlock 524288B 14: prog_array name count_map flags 0x0 key 4B value 4B max_entries 65536 memlock 524288B 15: cgroup_array name count_map flags 0x0 key 4B value 4B max_entries 65536 memlock 524288B
- after 11: array name count_map flags 0x0 key 4B value 4B max_entries 65536 memlock 524608B 12: percpu_array name count_map flags 0x0 key 4B value 4B max_entries 65536 memlock 17301824B 13: perf_event_array name count_map flags 0x0 key 4B value 4B max_entries 65536 memlock 524608B 14: prog_array name count_map flags 0x0 key 4B value 4B max_entries 65536 memlock 524608B 15: cgroup_array name count_map flags 0x0 key 4B value 4B max_entries 65536 memlock 524608B
Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20230305124615.12358-5-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
show more ...
|
Revision tags: v6.1.15, v6.1.14, v6.1.13 |
|
#
7ae9fb1b |
| 21-Feb-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.3 merge window.
|