Revision tags: v6.6.67, v6.6.66, v6.6.65, v6.6.64, v6.6.63, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58, v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13 |
|
#
4f2c0a4a |
| 13-Dec-2022 |
Nick Terrell <terrelln@fb.com> |
Merge branch 'main' into zstd-linus
|
#
e291c116 |
| 12-Dec-2022 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.2 merge window.
|
Revision tags: v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80 |
|
#
29583dfc |
| 21-Nov-2022 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next-fixes
Backmerging to update drm-misc-next-fixes for the final phase of the release cycle.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.0.9, v5.15.79 |
|
#
002c6ca7 |
| 14-Nov-2022 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Catch up on 6.1-rc cycle in order to solve the intel_backlight conflict on linux-next.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
Revision tags: v6.0.8, v5.15.78 |
|
#
d93618da |
| 04-Nov-2022 |
Joonas Lahtinen <joonas.lahtinen@linux.intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Needed to bring in v6.1-rc1 which contains commit f683b9d61319 ("i915: use the VMA iterator") which is needed for series https://patchwork.freedesktop.org/series/110083/ .
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
Revision tags: v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4 |
|
#
14e77332 |
| 21-Oct-2022 |
Nick Terrell <terrelln@fb.com> |
Merge branch 'main' into zstd-next
|
Revision tags: v6.0.3 |
|
#
1aca5ce0 |
| 20-Oct-2022 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Backmerging to get v6.1-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
008f05a7 |
| 19-Oct-2022 |
Mark Brown <broonie@kernel.org> |
ASoC: jz4752b: Capture fixes
Merge series from Siarhei Volkau <lis8215@gmail.com>:
The patchset fixes:
- Line In path stays powered off during capturing or bypass to mixer.
- incorrectly represented dB values in alsamixer, et al.
- incorrectly represented Capture input selector in alsamixer in Playback tab.
- wrong control selected as Capture Master
|
#
a140a6a2 |
| 18-Oct-2022 |
Maxime Ripard <maxime@cerno.tech> |
Merge drm/drm-next into drm-misc-next
Let's kick-off this release cycle.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
#
c29a017f |
| 17-Oct-2022 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.1-rc1' into next
Merge with mainline to bring in the latest changes to twl4030 driver.
|
#
8048b835 |
| 16-Oct-2022 |
Andrew Morton <akpm@linux-foundation.org> |
Merge branch 'master' into mm-hotfixes-stable
|
Revision tags: v6.0.2, v5.15.74, v5.15.73, v6.0.1 |
|
#
dfd2d876 |
| 10-Oct-2022 |
Johannes Berg <johannes.berg@intel.com> |
Merge remote-tracking branch 'wireless/main' into wireless-next
Pull in wireless/main content since some new code would otherwise conflict with it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
Revision tags: v5.15.72 |
|
#
0326074f |
| 04-Oct-2022 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'net-next-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski: "Core:
- Introduce and use a single page frag cache for allocating small skb heads, clawing back the 10-20% performance regression in UDP flood test from previous fixes.
- Run packets which already went thru HW coalescing thru SW GRO. This significantly improves TCP segment coalescing and simplifies deployments as different workloads benefit from HW or SW GRO.
- Shrink the size of the base zero-copy send structure.
- Move TCP init under a new slow / sleepable version of DO_ONCE().
BPF:
- Add BPF-specific, any-context-safe memory allocator.
- Add helpers/kfuncs for PKCS#7 signature verification from BPF programs.
- Define a new map type and related helpers for user space -> kernel communication over a ring buffer (BPF_MAP_TYPE_USER_RINGBUF).
- Allow targeting BPF iterators to loop through resources of one task/thread.
- Add ability to call selected destructive functions. Expose crash_kexec() to allow BPF to trigger a kernel dump. Use CAP_SYS_BOOT check on the loading process to judge permissions.
- Enable BPF to collect custom hierarchical cgroup stats efficiently by integrating with the rstat framework.
- Support struct arguments for trampoline based programs. Only structs with size <= 16B and x86 are supported.
- Invoke cgroup/connect{4,6} programs for unprivileged ICMP ping sockets (instead of just TCP and UDP sockets).
- Add a helper for accessing CLOCK_TAI for time sensitive network related programs.
- Support accessing network tunnel metadata's flags.
- Make TCP SYN ACK RTO tunable by BPF programs with TCP Fast Open.
- Add support for writing to Netfilter's nf_conn:mark.
Protocols:
- WiFi: more Extremely High Throughput (EHT) and Multi-Link Operation (MLO) work (802.11be, WiFi 7).
- vsock: improve support for SO_RCVLOWAT.
- SMC: support SO_REUSEPORT.
- Netlink: define and document how to use netlink in a "modern" way. Support reporting missing attributes via extended ACK.
- IPSec: support collect metadata mode for xfrm interfaces.
- TCPv6: send consistent autoflowlabel in SYN_RECV state and RST packets.
- TCP: introduce optional per-netns connection hash table to allow better isolation between namespaces (opt-in, at the cost of memory and cache pressure).
- MPTCP: support TCP_FASTOPEN_CONNECT.
- Add NEXT-C-SID support in Segment Routing (SRv6) End behavior.
- Adjust IP_UNICAST_IF sockopt behavior for connected UDP sockets.
- Open vSwitch:
  - Allow specifying ifindex of new interfaces.
  - Allow conntrack and metering in non-initial user namespace.
- TLS: support the Korean ARIA-GCM crypto algorithm.
- Remove DECnet support.
Driver API:
- Allow selecting the conduit interface used by each port in DSA switches, at runtime.
- Ethernet Power Sourcing Equipment and Power Device support.
- Add tc-taprio support for queueMaxSDU parameter, i.e. setting per traffic class max frame size for time-based packet schedules.
- Support PHY rate matching - adapting between differing host-side and link-side speeds.
- Introduce QUSGMII PHY mode and 1000BASE-KX interface mode.
- Validate OF (device tree) nodes for DSA shared ports; make phylink-related properties mandatory on DSA and CPU ports. Enforcing more uniformity should allow transitioning to phylink.
- Require that flash component name used during update matches one of the components for which version is reported by info_get().
- Remove "weight" argument from driver-facing NAPI API as much as possible. It's one of those magic knobs which seemed like a good idea at the time but is too indirect to use in practice.
- Support offload of TLS connections with 256 bit keys.
New hardware / drivers:
- Ethernet:
  - Microchip KSZ9896 6-port Gigabit Ethernet Switch
  - Renesas Ethernet AVB (EtherAVB-IF) Gen4 SoCs
  - Analog Devices ADIN1110 and ADIN2111 industrial single pair Ethernet (10BASE-T1L) MAC+PHY.
  - Rockchip RV1126 Gigabit Ethernet (a version of stmmac IP).
- Ethernet SFPs / modules:
  - RollBall / Hilink / Turris 10G copper SFPs
  - HALNy GPON module
- WiFi:
  - CYW43439 SDIO chipset (brcmfmac)
  - CYW89459 PCIe chipset (brcmfmac)
  - BCM4378 on Apple platforms (brcmfmac)
Drivers:
- CAN:
  - gs_usb: HW timestamp support
- Ethernet PHYs:
  - lan8814: cable diagnostics
- Ethernet NICs:
  - Intel (100G):
    - implement control of FCS/CRC stripping
    - port splitting via devlink
    - L2TPv3 filtering offload
  - nVidia/Mellanox:
    - tunnel offload for sub-functions
    - MACSec offload, w/ Extended packet number and replay window offload
    - significantly restructure, and optimize the AF_XDP support, align the behavior with other vendors
  - Huawei:
    - configuring DSCP map for traffic class selection
    - querying standard FEC statistics
    - querying SerDes lane number via ethtool
  - Marvell/Cavium:
    - egress priority flow control
    - MACSec offload
  - AMD/SolarFlare:
    - PTP over IPv6 and raw Ethernet
  - small / embedded:
    - ax88772: convert to phylink (to support SFP cages)
    - altera: tse: convert to phylink
    - ftgmac100: support fixed link
    - enetc: standard Ethtool counters
    - macb: ZynqMP SGMII dynamic configuration support
    - tsnep: support multi-queue and use page pool
    - lan743x: Rx IP & TCP checksum offload
    - igc: add xdp frags support to ndo_xdp_xmit
- Ethernet high-speed switches:
  - Marvell (prestera):
    - support SPAN port features (traffic mirroring)
    - nexthop object offloading
  - Microchip (sparx5):
    - multicast forwarding offload
    - QoS queuing offload (tc-mqprio, tc-tbf, tc-ets)
- Ethernet embedded switches:
  - Marvell (mv88e6xxx):
    - support RGMII cmode
  - NXP (felix):
    - standardized ethtool counters
  - Microchip (lan966x):
    - QoS queuing offload (tc-mqprio, tc-tbf, tc-cbs, tc-ets)
    - traffic policing and mirroring
    - link aggregation / bonding offload
    - QUSGMII PHY mode support
- Qualcomm 802.11ax WiFi (ath11k):
  - cold boot calibration support on WCN6750
  - support to connect to a non-transmit MBSSID AP profile
  - enable remain-on-channel support on WCN6750
  - Wake-on-WLAN support for WCN6750
  - support to provide transmit power from firmware via nl80211
  - support to get power save duration for each client
  - spectral scan support for 160 MHz
- MediaTek WiFi (mt76):
  - WiFi-to-Ethernet bridging offload for MT7986 chips
- RealTek WiFi (rtw89):
  - P2P support"
* tag 'net-next-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1864 commits) eth: pse: add missing static inlines once: rename _SLOW to _SLEEPABLE net: pse-pd: add regulator based PSE driver dt-bindings: net: pse-dt: add bindings for regulator based PoDL PSE controller ethtool: add interface to interact with Ethernet Power Equipment net: mdiobus: search for PSE nodes by parsing PHY nodes. net: mdiobus: fwnode_mdiobus_register_phy() rework error handling net: add framework to support Ethernet PSE and PDs devices dt-bindings: net: phy: add PoDL PSE property net: marvell: prestera: Propagate nh state from hw to kernel net: marvell: prestera: Add neighbour cache accounting net: marvell: prestera: add stub handler neighbour events net: marvell: prestera: Add heplers to interact with fib_notifier_info net: marvell: prestera: Add length macros for prestera_ip_addr net: marvell: prestera: add delayed wq and flush wq on deinit net: marvell: prestera: Add strict cleanup of fib arbiter net: marvell: prestera: Add cleanup of allocated fib_nodes net: marvell: prestera: Add router nexthops ABI eth: octeon: fix build after netif_napi_add() changes net/mlx5: E-Switch, Return EBUSY if can't get mode lock ...
|
#
a08d97a1 |
| 03-Oct-2022 |
Jakub Kicinski <kuba@kernel.org> |
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
==================== pull-request: bpf-next 2022-10-03
We've added 143 non-merge commits during the last 27 day(s) which contain a total of 151 files changed, 8321 insertions(+), 1402 deletions(-).
The main changes are:
1) Add kfuncs for PKCS#7 signature verification from BPF programs, from Roberto Sassu.
2) Add support for struct-based arguments for trampoline based BPF programs, from Yonghong Song.
3) Fix entry IP for kprobe-multi and trampoline probes under IBT enabled, from Jiri Olsa.
4) Batch of improvements to veristat selftest tool in particular to add CSV output, a comparison mode for CSV outputs and filtering, from Andrii Nakryiko.
5) Add preparatory changes needed for the BPF core for upcoming BPF HID support, from Benjamin Tissoires.
6) Support for direct writes to nf_conn's mark field from tc and XDP BPF program types, from Daniel Xu.
7) Initial batch of documentation improvements for BPF insn set spec, from Dave Thaler.
8) Add a new BPF_MAP_TYPE_USER_RINGBUF map which provides single-user-space-producer / single-kernel-consumer semantics for BPF ring buffer, from David Vernet.
9) Follow-up fixes to BPF allocator under RT to always use raw spinlock for the BPF hashtab's bucket lock, from Hou Tao.
10) Allow creating an iterator that loops through only the resources of one task/thread instead of all, from Kui-Feng Lee.
11) Add support for kptrs in the per-CPU arraymap, from Kumar Kartikeya Dwivedi.
12) Add a new kfunc helper for nf to set src/dst NAT IP/port in a newly allocated CT entry which is not yet inserted, from Lorenzo Bianconi.
13) Remove invalid recursion check for struct_ops for TCP congestion control BPF programs, from Martin KaFai Lau.
14) Fix W^X issue with BPF trampoline and BPF dispatcher, from Song Liu.
15) Fix percpu_counter leakage in BPF hashtab allocation error path, from Tetsuo Handa.
16) Various cleanups in BPF selftests to use preferred ASSERT_* macros, from Wang Yufen.
17) Add invocation for cgroup/connect{4,6} BPF programs for ICMP pings, from YiFei Zhu.
18) Lift blinding decision under bpf_jit_harden = 1 to bpf_capable(), from Yauheni Kaliuta.
19) Various libbpf fixes and cleanups including a libbpf NULL pointer deref, from Xin Liu.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (143 commits) net: netfilter: move bpf_ct_set_nat_info kfunc in nf_nat_bpf.c Documentation: bpf: Add implementation notes documentations to table of contents bpf, docs: Delete misformatted table. selftests/xsk: Fix double free bpftool: Fix error message of strerror libbpf: Fix overrun in netlink attribute iteration selftests/bpf: Fix spelling mistake "unpriviledged" -> "unprivileged" samples/bpf: Fix typo in xdp_router_ipv4 sample bpftool: Remove unused struct event_ring_info bpftool: Remove unused struct btf_attach_point bpf, docs: Add TOC and fix formatting. bpf, docs: Add Clang note about BPF_ALU bpf, docs: Move Clang notes to a separate file bpf, docs: Linux byteswap note bpf, docs: Move legacy packet instructions to a separate file selftests/bpf: Check -EBUSY for the recurred bpf_setsockopt(TCP_CONGESTION) bpf: tcp: Stop bpf_setsockopt(TCP_CONGESTION) in init ops to recur itself bpf: Refactor bpf_setsockopt(TCP_CONGESTION) handling into another function bpf: Move the "cdg" tcp-cc check to the common sol_tcp_sockopt() bpf: Add __bpf_prog_{enter,exit}_struct_ops for struct_ops trampoline ... ====================
Link: https://lore.kernel.org/r/20221003194915.11847-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
Revision tags: v6.0, v5.15.71, v5.15.70, v5.15.69 |
|
#
e0401dce |
| 19-Sep-2022 |
Yosry Ahmed <yosryahmed@google.com> |
selftests/bpf: Simplify cgroup_hierarchical_stats selftest
The cgroup_hierarchical_stats selftest is complicated. It has to be, because it tests an entire workflow of recording, aggregating, and dumping cgroup stats. However, some of the complexity is unnecessary. The test now enables the memory controller in a cgroup hierarchy, invokes reclaim, measures reclaim time, THEN uses that reclaim time to test the stats collection and aggregation. We don't need to use such a complicated stat, as the context in which the stat is collected is orthogonal.
Simplify the test by using a simple stat instead of reclaim time, the total number of times a process has ever entered a cgroup. This makes the test simpler and removes the dependency on the memory controller and the memory reclaim interface.
Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/bpf/20220919175330.890793-1-yosryahmed@google.com
|
Revision tags: v5.15.68, v5.15.67, v5.15.66 |
|
#
2786bcff |
| 06-Sep-2022 |
Paolo Abeni <pabeni@redhat.com> |
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
==================== pull-request: bpf-next 2022-09-05
The following pull-request contains BPF updates for your *net-next* tree.
We've added 106 non-merge commits during the last 18 day(s) which contain a total of 159 files changed, 5225 insertions(+), 1358 deletions(-).
There are two small merge conflicts, resolve them as follows:
1) tools/testing/selftests/bpf/DENYLIST.s390x
Commit 27e23836ce22 ("selftests/bpf: Add lru_bug to s390x deny list") in bpf tree was needed to get BPF CI green on s390x, but it conflicted with newly added tests on bpf-next. Resolve by adding both hunks, result:
[...]
lru_bug                   # prog 'printk': failed to auto-attach: -524
setget_sockopt            # attach unexpected error: -524 (trampoline)
cb_refs                   # expected error message unexpected error: -524 (trampoline)
cgroup_hierarchical_stats # JIT does not support calling kernel function (kfunc)
htab_update               # failed to attach: ERROR: strerror_r(-524)=22 (trampoline)
[...]
2) net/core/filter.c
Commit 1227c1771dd2 ("net: Fix data-races around sysctl_[rw]mem_(max|default).") from net tree conflicts with commit 29003875bd5b ("bpf: Change bpf_setsockopt(SOL_SOCKET) to reuse sk_setsockopt()") from bpf-next tree. Take the code as it is from bpf-next tree, result:
[...]
        if (getopt) {
                if (optname == SO_BINDTODEVICE)
                        return -EINVAL;
                return sk_getsockopt(sk, SOL_SOCKET, optname,
                                     KERNEL_SOCKPTR(optval),
                                     KERNEL_SOCKPTR(optlen));
        }
        return sk_setsockopt(sk, SOL_SOCKET, optname,
                             KERNEL_SOCKPTR(optval), *optlen);
[...]
The main changes are:
1) Add any-context BPF specific memory allocator which is useful in particular for BPF tracing with bonus of performance equal to full prealloc, from Alexei Starovoitov.
2) Big batch to remove duplicated code from bpf_{get,set}sockopt() helpers as an effort to reuse the existing core socket code as much as possible, from Martin KaFai Lau.
3) Extend BPF flow dissector for BPF programs to just augment the in-kernel dissector with custom logic. In other words, allow for partial replacement, from Shmulik Ladkani.
4) Add a new cgroup iterator to BPF with different traversal options, from Hao Luo.
5) Support for BPF to collect hierarchical cgroup statistics efficiently through BPF integration with the rstat framework, from Yosry Ahmed.
6) Support bpf_{g,s}et_retval() under more BPF cgroup hooks, from Stanislav Fomichev.
7) BPF hash table and local storages fixes under fully preemptible kernel, from Hou Tao.
8) Add various improvements to BPF selftests and libbpf for compilation with gcc BPF backend, from James Hilliard.
9) Fix verifier helper permissions and reference state management for synchronous callbacks, from Kumar Kartikeya Dwivedi.
10) Add support for BPF selftest's xskxceiver to also be used against real devices that support MAC loopback, from Maciej Fijalkowski.
11) Various fixes to the bpf-helpers(7) man page generation script, from Quentin Monnet.
12) Document BPF verifier's tnum_in(tnum_range(), ...) gotchas, from Shung-Hsi Yu.
13) Various minor misc improvements all over the place.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (106 commits) bpf: Optimize rcu_barrier usage between hash map and bpf_mem_alloc. bpf: Remove usage of kmem_cache from bpf_mem_cache. bpf: Remove prealloc-only restriction for sleepable bpf programs. bpf: Prepare bpf_mem_alloc to be used by sleepable bpf programs. bpf: Remove tracing program restriction on map types bpf: Convert percpu hash map to per-cpu bpf_mem_alloc. bpf: Add percpu allocation support to bpf_mem_alloc. bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU. bpf: Adjust low/high watermarks in bpf_mem_cache bpf: Optimize call_rcu in non-preallocated hash map. bpf: Optimize element count in non-preallocated hash map. bpf: Relax the requirement to use preallocated hash maps in tracing progs. samples/bpf: Reduce syscall overhead in map_perf_test. selftests/bpf: Improve test coverage of test_maps bpf: Convert hash map to bpf_mem_alloc. bpf: Introduce any context BPF specific memory allocator. selftest/bpf: Add test for bpf_getsockopt() bpf: Change bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt() bpf: Change bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt() bpf: Change bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt() ... ====================
Link: https://lore.kernel.org/r/20220905161136.9150-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
Revision tags: v5.15.65, v5.15.64 |
|
#
d4ffb6f3 |
| 25-Aug-2022 |
Hao Luo <haoluo@google.com> |
bpf: Add CGROUP prefix to cgroup_iter_order
bpf_cgroup_iter_order is globally visible but the entries do not have CGROUP prefix. As requested by Andrii, put a CGROUP in the names in bpf_cgroup_iter_order.
This patch fixes two previous commits: one introduced the API and the other uses the API in bpf selftest (that is, the selftest cgroup_hierarchical_stats).
I tested this patch via the following command:
test_progs -t cgroup,iter,btf_dump
Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter") Fixes: 88886309d2e8 ("selftests/bpf: add a selftest for cgroup hierarchical stats collection") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220825223936.1865810-1-haoluo@google.com Signed-off-by: Martin KaFai Lau <kafai@fb.com>
|
#
eef3c3d3 |
| 25-Aug-2022 |
Alexei Starovoitov <ast@kernel.org> |
Merge branch 'bpf: rstat: cgroup hierarchical'
Hao Luo says:
====================
This patch series allows for using bpf to collect hierarchical cgroup stats efficiently by integrating with the rstat framework. The rstat framework provides an efficient way to collect cgroup stats percpu and propagate them through the cgroup hierarchy.
The stats are exposed to userspace in textual form by reading files in bpffs, similar to cgroupfs stats by using a cgroup_iter program. cgroup_iter is a type of bpf_iter. It walks over cgroups in four modes:
- walking a cgroup's descendants in pre-order.
- walking a cgroup's descendants in post-order.
- walking a cgroup's ancestors.
- process only a single object.
When attaching cgroup_iter, one needs to set a cgroup to the iter_link created from attaching. This cgroup can be passed either as a file descriptor or a cgroup id. That cgroup serves as the starting point of the walk.
One can also terminate the walk early by returning 1 from the iter program.
Note that because walking cgroup hierarchy holds cgroup_mutex, the iter program is called with cgroup_mutex held.
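The four walk modes and the early-termination rule described above can be pictured with a small user-space sketch. Everything in it (struct toy_cgroup, toy_cgroup_walk(), the TOY_WALK_* names, toy_record()) is hypothetical and only models the described behaviour; it is not the kernel's cgroup_iter code, it takes no locks, and it assumes the ancestor walk begins at the starting cgroup itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy node; not the kernel's struct cgroup. */
struct toy_cgroup {
	int id;
	struct toy_cgroup *parent;
	struct toy_cgroup *first_child;
	struct toy_cgroup *next_sibling;
};

/* Toy equivalents of the four modes listed in the cover letter. */
enum toy_walk_order {
	TOY_WALK_SELF_ONLY,
	TOY_WALK_DESCENDANTS_PRE,
	TOY_WALK_DESCENDANTS_POST,
	TOY_WALK_ANCESTORS_UP,
};

/* Visitor contract modelled on the text: return 1 to stop the walk early. */
typedef int (*toy_visit_fn)(const struct toy_cgroup *cg, void *ctx);

/* Append child as the last child of parent so sibling order is stable. */
void toy_add_child(struct toy_cgroup *parent, struct toy_cgroup *child)
{
	struct toy_cgroup **slot = &parent->first_child;

	while (*slot)
		slot = &(*slot)->next_sibling;
	*slot = child;
	child->parent = parent;
	child->next_sibling = NULL;
}

static int toy_walk_pre(const struct toy_cgroup *cg, toy_visit_fn fn, void *ctx)
{
	const struct toy_cgroup *c;

	if (fn(cg, ctx))		/* parent first */
		return 1;
	for (c = cg->first_child; c; c = c->next_sibling)
		if (toy_walk_pre(c, fn, ctx))
			return 1;
	return 0;
}

static int toy_walk_post(const struct toy_cgroup *cg, toy_visit_fn fn, void *ctx)
{
	const struct toy_cgroup *c;

	for (c = cg->first_child; c; c = c->next_sibling)
		if (toy_walk_post(c, fn, ctx))
			return 1;
	return fn(cg, ctx) ? 1 : 0;	/* parent last */
}

/* Returns 1 if the visitor stopped the walk early, 0 if it ran to the end. */
int toy_cgroup_walk(const struct toy_cgroup *start, enum toy_walk_order order,
		    toy_visit_fn fn, void *ctx)
{
	const struct toy_cgroup *cg;

	assert(start && fn);
	switch (order) {
	case TOY_WALK_SELF_ONLY:
		return fn(start, ctx) ? 1 : 0;
	case TOY_WALK_DESCENDANTS_PRE:
		return toy_walk_pre(start, fn, ctx);
	case TOY_WALK_DESCENDANTS_POST:
		return toy_walk_post(start, fn, ctx);
	case TOY_WALK_ANCESTORS_UP:
		/* Assumption: the walk includes the starting cgroup. */
		for (cg = start; cg; cg = cg->parent)
			if (fn(cg, ctx))
				return 1;
		return 0;
	}
	return 0;
}

/* Example visitor: records visited ids and stops once stop_at is seen. */
struct toy_trace {
	int ids[16];
	int n;
	int stop_at;
};

int toy_record(const struct toy_cgroup *cg, void *ctx)
{
	struct toy_trace *t = ctx;

	assert(t->n < 16);
	t->ids[t->n++] = cg->id;
	return cg->id == t->stop_at;
}
```

A caller would link a few toy_cgroup nodes with toy_add_child() and pass toy_record with a toy_trace as the context. Setting stop_at to an id that never occurs lets the walk run to completion.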
** Background on rstat for stats collection ** (I am using a subscriber analogy that is not commonly used)
The rstat framework maintains a tree of cgroups that have updates and which cpus have updates. A subscriber to the rstat framework maintains their own stats. The framework is used to tell the subscriber when and what to flush, for the most efficient stats propagation. The workflow is as follows:
- When a subscriber updates a cgroup on a cpu, it informs the rstat framework by calling cgroup_rstat_updated(cgrp, cpu).
- When a subscriber wants to read some stats for a cgroup, it asks the rstat framework to initiate a stats flush (propagation) by calling cgroup_rstat_flush(cgrp).
- When the rstat framework initiates a flush, it makes callbacks to subscribers to aggregate stats on cpus that have updates, and propagate updates to their parent.
Currently, the main subscribers to the rstat framework are cgroup subsystems (e.g. memory, block). This patch series allows bpf programs to become subscribers as well.
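The subscriber workflow above (record, mark as updated, flush on read) can be modelled in plain user-space C. This is only a sketch under stated assumptions: struct toy_cgrp, toy_rstat_updated(), toy_rstat_flush() and toy_flush_cb() are hypothetical names that imitate the roles of cgroup_rstat_updated(), cgroup_rstat_flush() and a subscriber's flush callback, with a fixed TOY_NR_CPUS and no locking. Marking ancestors as updated is a simplification of the per-cpu updated tree the text mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4	/* hypothetical fixed cpu count for the model */

/* Hypothetical toy cgroup carrying one subscriber's stat. */
struct toy_cgrp {
	struct toy_cgrp *parent;
	struct toy_cgrp *first_child;
	struct toy_cgrp *next_sibling;
	long percpu[TOY_NR_CPUS];	/* unflushed per-cpu readings */
	bool updated[TOY_NR_CPUS];	/* (cgroup, cpu) pair has updates */
	long pending;			/* propagated by children, not yet folded */
	long total;			/* flushed hierarchical total */
};

void toy_cgrp_add_child(struct toy_cgrp *parent, struct toy_cgrp *child)
{
	child->parent = parent;
	child->next_sibling = parent->first_child;
	parent->first_child = child;
}

/*
 * Models the "inform the framework" step. Ancestors are marked too, so a
 * later flush reaches them and can fold in what the children propagated.
 */
void toy_rstat_updated(struct toy_cgrp *cg, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	for (; cg; cg = cg->parent)
		cg->updated[cpu] = true;
}

/* Subscriber side: record a reading on one cpu, then notify the framework. */
void toy_stat_add(struct toy_cgrp *cg, int cpu, long val)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	cg->percpu[cpu] += val;
	toy_rstat_updated(cg, cpu);
}

/*
 * Subscriber's flush callback, invoked once per (cgroup, cpu) pair with
 * updates: aggregate the cpu reading plus anything the children pushed up,
 * then propagate the same delta to the parent.
 */
static void toy_flush_cb(struct toy_cgrp *cg, struct toy_cgrp *parent, int cpu)
{
	long delta = cg->percpu[cpu] + cg->pending;

	cg->percpu[cpu] = 0;
	cg->pending = 0;
	cg->total += delta;
	if (parent)
		parent->pending += delta;
}

/* Models the flush: children are handled before their parent (bottom-up). */
void toy_rstat_flush(struct toy_cgrp *cg)
{
	struct toy_cgrp *c;
	int cpu;

	for (c = cg->first_child; c; c = c->next_sibling)
		toy_rstat_flush(c);
	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!cg->updated[cpu])
			continue;
		cg->updated[cpu] = false;
		toy_flush_cb(cg, cg->parent, cpu);
	}
}
```

In this model a reader calls toy_rstat_flush() on the cgroup it is interested in and then reads total. Flushing a subtree leaves the delta parked in the parent's pending field until an enclosing cgroup is flushed.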
Patches in this series are organized as follows:
* Patches 1-2 introduce cgroup_iter prog, and a selftest.
* Patches 3-5 allow bpf programs to integrate with rstat by adding the necessary hook points and kfunc. A comprehensive selftest that demonstrates the entire workflow for using bpf and rstat to efficiently collect and output cgroup stats is added.
---
Changelog:
v8 -> v9:
- Make UNSPEC (an invalid option) as the default order for cgroup_iter.
- Use enum for specifying cgroup_iter order, instead of u32.
- Add BPF_ITER_RESCHED to cgroup_iter.
- Add cgroup_hierarchical_stats to s390x denylist.
v7 -> v8:
- Removed the confusing BPF_ITER_DEFAULT (Andrii)
- s/SELF/SELF_ONLY/g
- Fixed typo (e.g. outputing) (Andrii)
- Use "descendants_pre", "descendants_post" etc. instead of "pre", "post" (Andrii)
v6 -> v7:
- Updated commit/comments in cgroup_iter for read() behavior (Yonghong)
- Extracted BPF_ITER_SELF and other options out of cgroup_iter, so that they can be used in other iters. Also renamed them. (Andrii)
- Supports both cgroup_fd and cgroup_id when specifying target cgroup. (Andrii)
- Avoided using macro for formatting expected output in cgroup_iter selftest. (Andrii)
- Applied 'static' on all vars and functions in cgroup_iter selftest. (Andrii)
- Fixed broken buf reading in cgroup_iter selftest. (Andrii)
- Switched to use bpf_link__destroy() unconditionally. (Andrii)
- Removed 'volatile' for non-const global vars in selftests. (Andrii)
- Started using bpf_core_enum_value() to get memory_cgrp_id. (Andrii)
v5 -> v6:
- Rebased on bpf-next
- Tidy up cgroup_hierarchical_stats test (Andrii)
  * 'static' and 'inline'
  * avoid using libbpf_get_error()
  * string literals of cgroup paths.
- Rename patch 8/8 to 'selftests/bpf' (Yonghong)
- Fix cgroup_iter comments (e.g. PAGE_SIZE and uapi) (Yonghong)
- Make sure further read() returns OK after previous read() finished properly (Yonghong)
- Release cgroup_mutex before the last call of show() (Kumar)
v4 -> v5:
- Rebased on top of new kfunc flags infrastructure, updated patch 1 and patch 6 accordingly.
- Added docs for sleepable kfuncs.
v3 -> v4:
- cgroup_iter:
  * reorder fields in bpf_link_info to avoid break uapi (Yonghong)
  * comment the behavior when cgroup_fd=0 (Yonghong)
  * comment on the limit of number of cgroups supported by cgroup_iter. (Yonghong)
- cgroup_hierarchical_stats selftest:
  * Do not return -1 if stats are not found (causes overflow in userspace).
  * Check if child process failed to join cgroup.
  * Make buf and path arrays in get_cgroup_vmscan_delay() static.
  * Increase the test map sizes to accommodate cgroups that are not created by the test.
v2 -> v3:
- cgroup_iter:
  * Added conditional compilation of cgroup_iter.c in kernel/bpf/Makefile (kernel test) and dropped the !CONFIG_CGROUP patch.
  * Added validation of traversal_order when attaching (Yonghong).
  * Fixed previous wording "two modes" to "three modes" (Yonghong).
  * Fixed the btf_dump selftest broken by this patch (Yonghong).
  * Fixed ctx_arg_info[0] to use "PTR_TO_BTF_ID_OR_NULL" instead of "PTR_TO_BTF_ID", because the "cgroup" pointer passed to iter prog can be null.
- Use __diag_push to eliminate __weak noinline warning in bpf_rstat_flush().
- cgroup_hierarchical_stats selftest:
  * Added write_cgroup_file_parent() helper.
  * Added error handling for failed map updates.
  * Added null check for cgroup in vmscan_flush.
  * Fixed the signature of vmscan_[start/end].
  * Correctly return error code when attaching trace programs fail.
  * Make sure all links are destroyed correctly and not leaking in cgroup_hierarchical_stats selftest.
  * Use memory.reclaim instead of memory.high as a more reliable way to invoke reclaim.
  * Eliminated sleeps, the test now runs faster.
v1 -> v2:
- Redesign of cgroup_iter from v1, based on Alexei's idea [1]:
  * supports walking cgroup subtree.
  * supports walking ancestors of a cgroup. (Andrii)
  * supports terminating the walk early.
  * uses fd instead of cgroup_id as parameter for iter_link. Using fd is a convention in bpf.
  * gets cgroup's ref at attach time and deref at detach.
  * brought back cgroup1 support for cgroup_iter.
- Squashed the patches adding the rstat flush hook points and kfuncs (Tejun).
- Added a comment explaining why bpf_rstat_flush() needs to be weak (Tejun).
- Updated the final selftest with the new cgroup_iter design.
- Changed CHECKs in the selftest with ASSERTs (Yonghong, Andrii).
- Removed empty line at the end of the selftest (Yonghong).
- Renamed test files to cgroup_hierarchical_stats.c.
- Reordered CGROUP_PATH params order to match struct declaration in the selftest (Michal).
- Removed memory_subsys_enabled() and made sure memcg controller enablement checks make sense and are documented (Michal).
RFC v2 -> v1:
- Instead of introducing a new program type for rstat flushing, add an empty hook point, bpf_rstat_flush(), and use fentry bpf programs to attach to it and flush bpf stats.
- Instead of using helpers, use kfuncs for rstat functions.
- These changes simplify the patchset greatly, with minimal changes to uapi.
RFC v1 -> RFC v2:
- Instead of rstat flush programs attach to subsystems, they now attach to rstat (global flushers, not per-subsystem), based on discussions with Tejun. The first patch is entirely rewritten.
- Pass cgroup pointers to rstat flushers instead of cgroup ids. This is much more flexible and less likely to need a uapi update later.
- rstat helpers are now only defined if CGROUP_CONFIG.
- Most of the code is now only defined if CGROUP_CONFIG and CONFIG_BPF_SYSCALL.
- Move rstat helper protos from bpf_base_func_proto() to tracing_prog_func_proto().
- rstat helpers argument (cgroup pointer) is now ARG_PTR_TO_BTF_ID, not ARG_ANYTHING.
- Rewrote the selftest to use the cgroup helpers.
- Dropped bpf_map_lookup_percpu_elem (already added by Feng).
- Dropped patch to support cgroup v1 for cgroup_iter.
- Dropped patch to define some cgroup_put() when !CONFIG_CGROUP. The code that calls it is no longer compiled when !CONFIG_CGROUP.
cgroup_iter was originally introduced in a different patch series[2]. Hao and I agreed that it fits better as part of this series. RFC v1 of this patch series had the following changes from [2]:
- Getting the cgroup's reference at the time at attaching, instead of at the time when iterating. (Yonghong)
- Remove .init_seq_private and .fini_seq_private callbacks for cgroup_iter. They are not needed now. (Yonghong)
[1] https://lore.kernel.org/bpf/20220520221919.jnqgv52k4ajlgzcl@MBP-98dd607d3435.dhcp.thefacebook.com/ [2] https://lore.kernel.org/lkml/20220225234339.2386398-9-haoluo@google.com/
Hao Luo (2):
  bpf: Introduce cgroup iter
  selftests/bpf: Test cgroup_iter.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
Revision tags: v5.15.63 |
|
#
88886309 |
| 24-Aug-2022 |
Yosry Ahmed <yosryahmed@google.com> |
selftests/bpf: add a selftest for cgroup hierarchical stats collection
Add a selftest that tests the whole workflow for collecting, aggregating (flushing), and displaying cgroup hierarchical stats.
TL;DR:
- Userspace program creates a cgroup hierarchy and induces memcg reclaim in parts of it.
- Whenever reclaim happens, vmscan_start and vmscan_end update per-cgroup percpu readings, and tell rstat which (cgroup, cpu) pairs have updates.
- When userspace tries to read the stats, vmscan_dump calls rstat to flush the stats, and outputs the stats in text format to userspace (similar to cgroupfs stats).
- rstat calls vmscan_flush once for every (cgroup, cpu) pair that has updates, vmscan_flush aggregates cpu readings and propagates updates to parents.
- Userspace program makes sure the stats are aggregated and read correctly.
Detailed explanation:
- The test loads tracing bpf programs, vmscan_start and vmscan_end, to measure the latency of cgroup reclaim. Per-cgroup readings are stored in percpu maps for efficiency. When a cgroup reading is updated on a cpu, cgroup_rstat_updated(cgroup, cpu) is called to add the cgroup to the rstat updated tree on that cpu.
- A cgroup_iter program, vmscan_dump, is loaded and pinned to a file, for each cgroup. Reading this file invokes the program, which calls cgroup_rstat_flush(cgroup) to ask rstat to propagate the updates for all cpus and cgroups that have updates in this cgroup's subtree. Afterwards, the stats are exposed to the user. vmscan_dump returns 1 to terminate iteration early, so that we only expose stats for one cgroup per read.
- An ftrace program, vmscan_flush, is also loaded and attached to bpf_rstat_flush. When rstat flushing is ongoing, vmscan_flush is invoked once for each (cgroup, cpu) pair that has updates. cgroups are popped from the rstat tree in a bottom-up fashion, so calls will always be made for cgroups that have updates before their parents. The program aggregates percpu readings to a total per-cgroup reading, and also propagates them to the parent cgroup. After rstat flushing is over, all cgroups will have correct updated hierarchical readings (including all cpus and all their descendants).
- Finally, the test creates a cgroup hierarchy and induces memcg reclaim in parts of it, and makes sure that the stats collection, aggregation, and reading workflow works as expected.
Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220824233117.1312810-6-haoluo@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|