History log of /openbmc/linux/kernel/bpf/helpers.c (Results 1 – 25 of 728)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.6.67, v6.6.66, v6.6.65, v6.6.64, v6.6.63, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58, v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14
# 1a931707 16-Dec-2022 Arnaldo Carvalho de Melo <acme@redhat.com>

Merge remote-tracking branch 'torvalds/master' into perf/core

To resolve a trivial merge conflict with c302378bc157f6a7 ("libbpf:
Hashmap interface update to allow both long and void* keys/values"),

Merge remote-tracking branch 'torvalds/master' into perf/core

To resolve a trivial merge conflict with c302378bc157f6a7 ("libbpf:
Hashmap interface update to allow both long and void* keys/values"),
where a function present upstream was removed in the perf tools
development tree.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

show more ...


Revision tags: v6.0.13
# 4f2c0a4a 13-Dec-2022 Nick Terrell <terrelln@fb.com>

Merge branch 'main' into zstd-linus


# 7e68dd7d 13-Dec-2022 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
"Core:

- Allow live renaming when an interface is up

- Ad

Merge tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
"Core:

- Allow live renaming when an interface is up

- Add retpoline wrappers for tc, improving considerably the
performances of complex queue discipline configurations

- Add inet drop monitor support

- A few GRO performance improvements

- Add infrastructure for atomic dev stats, addressing long standing
data races

- De-duplicate common code between OVS and conntrack offloading
infrastructure

- A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements

- Netfilter: introduce packet parser for tunneled packets

- Replace IPVS timer-based estimators with kthreads to scale up the
workload with the number of available CPUs

- Add the helper support for connection-tracking OVS offload

BPF:

- Support for user defined BPF objects: the use case is to allocate
own objects, build own object hierarchies and use the building
blocks to build own data structures flexibly, for example, linked
lists in BPF

- Make cgroup local storage available to non-cgroup attached BPF
programs

- Avoid unnecessary deadlock detection and failures wrt BPF task
storage helpers

- A relevant bunch of BPF verifier fixes and improvements

- Veristat tool improvements to support custom filtering, sorting,
and replay of results

- Add LLVM disassembler as default library for dumping JITed code

- Lots of new BPF documentation for various BPF maps

- Add bpf_rcu_read_{,un}lock() support for sleepable programs

- Add RCU grace period chaining to BPF to wait for the completion of
access from both sleepable and non-sleepable BPF programs

- Add support storing struct task_struct objects as kptrs in maps

- Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
values

- Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions

Protocols:

- TCP: implement Protective Load Balancing across switch links

- TCP: allow dynamically disabling TCP-MD5 static key, reverting back
to fast[er]-path

- UDP: Introduce optional per-netns hash lookup table

- IPv6: simplify and cleanup sockets disposal

- Netlink: support different type policies for each generic netlink
operation

- MPTCP: add MSG_FASTOPEN and FastOpen listener side support

- MPTCP: add netlink notification support for listener sockets events

- SCTP: add VRF support, allowing sctp sockets binding to VRF devices

- Add bridging MAC Authentication Bypass (MAB) support

- Extensions for Ethernet VPN bridging implementation to better
support multicast scenarios

- More work for Wi-Fi 7 support, comprising conversion of all the
existing drivers to internal TX queue usage

- IPSec: introduce a new offload type (packet offload) allowing
complete header processing and crypto offloading

- IPSec: extended ack support for more descriptive XFRM error
reporting

- RXRPC: increase SACK table size and move processing into a
per-local endpoint kernel thread, reducing considerably the
required locking

- IEEE 802154: synchronous send frame and extended filtering support,
initial support for scanning available 15.4 networks

- Tun: bump the link speed from 10Mbps to 10Gbps

- Tun/VirtioNet: implement UDP segmentation offload support

Driver API:

- PHY/SFP: improve power level switching between standard level 1 and
the higher power levels

- New API for netdev <-> devlink_port linkage

- PTP: convert existing drivers to new frequency adjustment
implementation

- DSA: add support for rx offloading

- Autoload DSA tagging driver when dynamically changing protocol

- Add new PCP and APPTRUST attributes to Data Center Bridging

- Add configuration support for 800Gbps link speed

- Add devlink port function attribute to enable/disable RoCE and
migratable

- Extend devlink-rate to support strict prioriry and weighted fair
queuing

- Add devlink support to directly reading from region memory

- New device tree helper to fetch MAC address from nvmem

- New big TCP helper to simplify temporary header stripping

New hardware / drivers:

- Ethernet:
- Marvel Octeon CNF95N and CN10KB Ethernet Switches
- Marvel Prestera AC5X Ethernet Switch
- WangXun 10 Gigabit NIC
- Motorcomm yt8521 Gigabit Ethernet
- Microchip ksz9563 Gigabit Ethernet Switch
- Microsoft Azure Network Adapter
- Linux Automation 10Base-T1L adapter

- PHY:
- Aquantia AQR112 and AQR412
- Motorcomm YT8531S

- PTP:
- Orolia ART-CARD

- WiFi:
- MediaTek Wi-Fi 7 (802.11be) devices
- RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB
devices

- Bluetooth:
- Broadcom BCM4377/4378/4387 Bluetooth chipsets
- Realtek RTL8852BE and RTL8723DS
- Cypress.CYW4373A0 WiFi + Bluetooth combo device

Drivers:

- CAN:
- gs_usb: bus error reporting support
- kvaser_usb: listen only and bus error reporting support

- Ethernet NICs:
- Intel (100G):
- extend action skbedit to RX queue mapping
- implement devlink-rate support
- support direct read from memory
- nVidia/Mellanox (mlx5):
- SW steering improvements, increasing rules update rate
- Support for enhanced events compression
- extend H/W offload packet manipulation capabilities
- implement IPSec packet offload mode
- nVidia/Mellanox (mlx4):
- better big TCP support
- Netronome Ethernet NICs (nfp):
- IPsec offload support
- add support for multicast filter
- Broadcom:
- RSS and PTP support improvements
- AMD/SolarFlare:
- netlink extened ack improvements
- add basic flower matches to offload, and related stats
- Virtual NICs:
- ibmvnic: introduce affinity hint support
- small / embedded:
- FreeScale fec: add initial XDP support
- Marvel mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood
- TI am65-cpsw: add suspend/resume support
- Mediatek MT7986: add RX wireless wthernet dispatch support
- Realtek 8169: enable GRO software interrupt coalescing per
default

- Ethernet high-speed switches:
- Microchip (sparx5):
- add support for Sparx5 TC/flower H/W offload via VCAP
- Mellanox mlxsw:
- add 802.1X and MAC Authentication Bypass offload support
- add ip6gre support

- Embedded Ethernet switches:
- Mediatek (mtk_eth_soc):
- improve PCS implementation, add DSA untag support
- enable flow offload support
- Renesas:
- add rswitch R-Car Gen4 gPTP support
- Microchip (lan966x):
- add full XDP support
- add TC H/W offload via VCAP
- enable PTP on bridge interfaces
- Microchip (ksz8):
- add MTU support for KSZ8 series

- Qualcomm 802.11ax WiFi (ath11k):
- support configuring channel dwell time during scan

- MediaTek WiFi (mt76):
- enable Wireless Ethernet Dispatch (WED) offload support
- add ack signal support
- enable coredump support
- remain_on_channel support

- Intel WiFi (iwlwifi):
- enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities
- 320 MHz channels support

- RealTek WiFi (rtw89):
- new dynamic header firmware format support
- wake-over-WLAN support"

* tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2002 commits)
ipvs: fix type warning in do_div() on 32 bit
net: lan966x: Remove a useless test in lan966x_ptp_add_trap()
net: ipa: add IPA v4.7 support
dt-bindings: net: qcom,ipa: Add SM6350 compatible
bnxt: Use generic HBH removal helper in tx path
IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver
selftests: forwarding: Add bridge MDB test
selftests: forwarding: Rename bridge_mdb test
bridge: mcast: Support replacement of MDB port group entries
bridge: mcast: Allow user space to specify MDB entry routing protocol
bridge: mcast: Allow user space to add (*, G) with a source list and filter mode
bridge: mcast: Add support for (*, G) with a source list and filter mode
bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source
bridge: mcast: Add a flag for user installed source entries
bridge: mcast: Expose __br_multicast_del_group_src()
bridge: mcast: Expose br_multicast_new_group_src()
bridge: mcast: Add a centralized error path
bridge: mcast: Place netlink policy before validation functions
bridge: mcast: Split (*, G) and (S, G) addition into different functions
bridge: mcast: Do not derive entry type from its filter mode
...

show more ...


# 26f708a2 12-Dec-2022 Jakub Kicinski <kuba@kernel.org>

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2022-12-11

We've added 74 non-merge comm

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2022-12-11

We've added 74 non-merge commits during the last 11 day(s) which contain
a total of 88 files changed, 3362 insertions(+), 789 deletions(-).

The main changes are:

1) Decouple prune and jump points handling in the verifier, from Andrii.

2) Do not rely on ALLOW_ERROR_INJECTION for fmod_ret, from Benjamin.
Merged from hid tree.

3) Do not zero-extend kfunc return values. Necessary fix for 32-bit archs,
from Björn.

4) Don't use rcu_users to refcount in task kfuncs, from David.

5) Three reg_state->id fixes in the verifier, from Eduard.

6) Optimize bpf_mem_alloc by reusing elements from free_by_rcu, from Hou.

7) Refactor dynptr handling in the verifier, from Kumar.

8) Remove the "/sys" mount and umount dance in {open,close}_netns
in bpf selftests, from Martin.

9) Enable sleepable support for cgrp local storage, from Yonghong.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (74 commits)
selftests/bpf: test case for relaxed prunning of active_lock.id
selftests/bpf: Add pruning test case for bpf_spin_lock
bpf: use check_ids() for active_lock comparison
selftests/bpf: verify states_equal() maintains idmap across all frames
bpf: states_equal() must build idmap for all function frames
selftests/bpf: test cases for regsafe() bug skipping check_id()
bpf: regsafe() must not skip check_ids()
docs/bpf: Add documentation for BPF_MAP_TYPE_SK_STORAGE
selftests/bpf: Add test for dynptr reinit in user_ringbuf callback
bpf: Use memmove for bpf_dynptr_{read,write}
bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
bpf: Rework check_func_arg_reg_off
bpf: Rework process_dynptr_func
bpf: Propagate errors from process_* checks in check_func_arg
bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
bpf: Skip rcu_barrier() if rcu_trace_implies_rcu_gp() is true
bpf: Reuse freed element in free_by_rcu during allocation
selftests/bpf: Bring test_offload.py back to life
bpf: Fix comment error in fixup_kfunc_call function
bpf: Do not zero-extend kfunc return values
...
====================

Link: https://lore.kernel.org/r/20221212024701.73809-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# e291c116 12-Dec-2022 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 6.2 merge window.


Revision tags: v6.1
# 26d6506a 08-Dec-2022 Alexei Starovoitov <ast@kernel.org>

Merge branch 'Dynptr refactorings'

Kumar Kartikeya Dwivedi says:

====================

This is part 1 of https://lore.kernel.org/bpf/20221018135920.726360-1-memxor@gmail.com.
This thread also gives

Merge branch 'Dynptr refactorings'

Kumar Kartikeya Dwivedi says:

====================

This is part 1 of https://lore.kernel.org/bpf/20221018135920.726360-1-memxor@gmail.com.
This thread also gives some background on why the refactor is being done:
https://lore.kernel.org/bpf/CAEf4Bzb4beTHgVo+G+jehSj8oCeAjRbRcm6MRe=Gr+cajRBwEw@mail.gmail.com

As requested in patch 6 by Alexei, it only includes patches which
refactors the code, on top of which further fixes will be made in part
2. The refactor itself fixes another issue as a side effect. No
functional change is intended (except a few modified log messages).

Changelog:
----------
v1 -> v2
v1: https://lore.kernel.org/bpf/20221115000130.1967465-1-memxor@gmail.com

* Address feedback from Joanne and David, add acks

Fixes v1 -> v1
Fixes v1: https://lore.kernel.org/bpf/20221018135920.726360-1-memxor@gmail.com

* Collect acks from Joanne and David
* Fix misc nits pointed out by Joanne, David
* Split move of reg->off alignment check for dynptr into separate
change (Alexei)
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


Revision tags: v6.0.12
# 76d16077 07-Dec-2022 Kumar Kartikeya Dwivedi <memxor@gmail.com>

bpf: Use memmove for bpf_dynptr_{read,write}

It may happen that destination buffer memory overlaps with memory dynptr
points to. Hence, we must use memmove to correctly copy from dynptr to
destinati

bpf: Use memmove for bpf_dynptr_{read,write}

It may happen that destination buffer memory overlaps with memory dynptr
points to. Hence, we must use memmove to correctly copy from dynptr to
destination buffer, or source buffer to dynptr.

This actually isn't a problem right now, as memcpy implementation falls
back to memmove on detecting overlap and warns about it, but we
shouldn't be relying on that.

Acked-by: Joanne Koong <joannelkoong@gmail.com>
Acked-by: David Vernet <void@manifault.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20221207204141.308952-7-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 27060531 07-Dec-2022 Kumar Kartikeya Dwivedi <memxor@gmail.com>

bpf: Rework process_dynptr_func

Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
for use in callback state, because in case of user ringbuf helpers,
there is no dynptr on the

bpf: Rework process_dynptr_func

Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
for use in callback state, because in case of user ringbuf helpers,
there is no dynptr on the stack that is passed into the callback. To
reflect such a state, a special register type was created.

However, some checks have been bypassed incorrectly during the addition
of this feature. First, for arg_type with MEM_UNINIT flag which
initialize a dynptr, they must be rejected for such register type.
Secondly, in the future, there are plans to add dynptr helpers that
operate on the dynptr itself and may change its offset and other
properties.

In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
to such helpers, however the current code simply returns 0.

The rejection for helpers that release the dynptr is already handled.

For fixing this, we take a step back and rework existing code in a way
that will allow fitting in all classes of helpers and have a coherent
model for dealing with the variety of use cases in which dynptr is used.

First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
with a DYNPTR_TYPE_* constant that denotes the only type it accepts.

Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
fact. To make the distinction clear, use MEM_RDONLY flag to indicate
that the helper only operates on the memory pointed to by the dynptr,
not the dynptr itself. In C parlance, it would be equivalent to taking
the dynptr as a point to const argument.

When either of these flags are not present, the helper is allowed to
mutate both the dynptr itself and also the memory it points to.
Currently, the read only status of the memory is not tracked in the
dynptr, but it would be trivial to add this support inside dynptr state
of the register.

With these changes and renaming PTR_TO_DYNPTR to CONST_PTR_TO_DYNPTR to
better reflect its usage, it can no longer be passed to helpers that
initialize a dynptr, i.e. bpf_dynptr_from_mem, bpf_ringbuf_reserve_dynptr.

A note to reviewers is that in code that does mark_stack_slots_dynptr,
and unmark_stack_slots_dynptr, we implicitly rely on the fact that
PTR_TO_STACK reg is the only case that can reach that code path, as one
cannot pass CONST_PTR_TO_DYNPTR to helpers that don't set MEM_RDONLY. In
both cases such helpers won't be setting that flag.

The next patch will add a couple of selftest cases to make sure this
doesn't break.

Fixes: 205715673844 ("bpf: Add bpf_user_ringbuf_drain() helper")
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20221207204141.308952-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 2d141236 07-Dec-2022 Alexei Starovoitov <ast@kernel.org>

Merge branch 'Document some recent core kfunc additions'

David Vernet says:

====================

A series of recent patch sets introduced kfuncs that allowed struct
task_struct and struct cgroup o

Merge branch 'Document some recent core kfunc additions'

David Vernet says:

====================

A series of recent patch sets introduced kfuncs that allowed struct
task_struct and struct cgroup objects to be used as kptrs. These were
introduced in [0], [1], and [2].

[0]: https://lore.kernel.org/lkml/20221120051004.3605026-1-void@manifault.com/
[1]: https://lore.kernel.org/lkml/20221122145300.251210-2-void@manifault.com/T/
[2]: https://lore.kernel.org/lkml/20221122055458.173143-1-void@manifault.com/

These are "core" kfuncs, in that they may be used by a wide variety of
possible BPF tracepoint or struct_ops programs, and are defined in
kernel/bpf/helpers.c. Even though as kfuncs they have no ABI stability
guarantees, they should still be properly documented. This patch set
adds that documentation.

Some other kfuncs were added recently as well, such as
bpf_rcu_read_lock() and bpf_rcu_read_unlock(). Those could and should be
added to this "Core kfuncs" section as well in subsequent patch sets.

Note that this patch set does not contain documentation for
bpf_task_acquire_not_zero(), or bpf_task_kptr_get(). As discussed in
[3], those kfuncs currently always return NULL pending resolution on how
to properly protect their arguments using RCU.

[3]: https://lore.kernel.org/all/20221206210538.597606-1-void@manifault.com/
---
Changelog:
v2 -> v3:
- Don't document bpf_task_kptr_get(), and instead provide a more
substantive example for bpf_cgroup_kptr_get().
- Further clarify expected behavior of bpf_task_from_pid() in comments
(Alexei)

v1 -> v2:
- Expand comment to specify that a map holds a reference to a task kptr
if we don't end up releasing it (Alexei)
- Just read task->pid instead of using a probed read (Alexei)
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 36aa10ff 07-Dec-2022 David Vernet <void@manifault.com>

bpf/docs: Document struct cgroup * kfuncs

bpf_cgroup_acquire(), bpf_cgroup_release(), bpf_cgroup_kptr_get(), and
bpf_cgroup_ancestor(), are kfuncs that were recently added to
kernel/bpf/helpers.c. T

bpf/docs: Document struct cgroup * kfuncs

bpf_cgroup_acquire(), bpf_cgroup_release(), bpf_cgroup_kptr_get(), and
bpf_cgroup_ancestor(), are kfuncs that were recently added to
kernel/bpf/helpers.c. These are "core" kfuncs in that they're available
for use in any tracepoint or struct_ops BPF program. Though they have no
ABI stability guarantees, we should still document them. This patch adds
a struct cgroup * subsection to the Core kfuncs section which describes
each of these kfuncs.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20221207204911.873646-3-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 25c5e92d 07-Dec-2022 David Vernet <void@manifault.com>

bpf/docs: Document struct task_struct * kfuncs

bpf_task_acquire(), bpf_task_release(), and bpf_task_from_pid() are
kfuncs that were recently added to kernel/bpf/helpers.c. These are
"core" kfuncs in

bpf/docs: Document struct task_struct * kfuncs

bpf_task_acquire(), bpf_task_release(), and bpf_task_from_pid() are
kfuncs that were recently added to kernel/bpf/helpers.c. These are
"core" kfuncs in that they're available for use for any tracepoint or
struct_ops BPF program. Though they have no ABI stability guarantees, we
should still document them. This patch adds a new Core kfuncs section to
the BPF kfuncs doc, and adds entries for all of these task kfuncs.

Note that bpf_task_kptr_get() is not documented, as it still returns
NULL while we're working to resolve how it can use RCU to ensure struct
task_struct * lifetime.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20221207204911.873646-2-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 156ed20d 06-Dec-2022 David Vernet <void@manifault.com>

bpf: Don't use rcu_users to refcount in task kfuncs

A series of prior patches added some kfuncs that allow struct
task_struct * objects to be used as kptrs. These kfuncs leveraged the
'refcount_t rc

bpf: Don't use rcu_users to refcount in task kfuncs

A series of prior patches added some kfuncs that allow struct
task_struct * objects to be used as kptrs. These kfuncs leveraged the
'refcount_t rcu_users' field of the task for performing refcounting.
This field was used instead of 'refcount_t usage', as we wanted to
leverage the safety provided by RCU for ensuring a task's lifetime.

A struct task_struct is refcounted by two different refcount_t fields:

1. p->usage: The "true" refcount field which task lifetime. The
task is freed as soon as this refcount drops to 0.

2. p->rcu_users: An "RCU users" refcount field which is statically
initialized to 2, and is co-located in a union with
a struct rcu_head field (p->rcu). p->rcu_users
essentially encapsulates a single p->usage
refcount, and when p->rcu_users goes to 0, an RCU
callback is scheduled on the struct rcu_head which
decrements the p->usage refcount.

Our logic was that by using p->rcu_users, we would be able to use RCU to
safely issue refcount_inc_not_zero() a task's rcu_users field to
determine if a task could still be acquired, or was exiting.
Unfortunately, this does not work due to p->rcu_users and p->rcu sharing
a union. When p->rcu_users goes to 0, an RCU callback is scheduled to
drop a single p->usage refcount, and because the fields share a union,
the refcount immediately becomes nonzero again after the callback is
scheduled.

If we were to split the fields out of the union, this wouldn't be a
problem. Doing so should also be rather non-controversial, as there are
a number of places in struct task_struct that have padding which we
could use to avoid growing the structure by splitting up the fields.

For now, so as to fix the kfuncs to be correct, this patch instead
updates bpf_task_acquire() and bpf_task_release() to use the p->usage
field for refcounting via the get_task_struct() and put_task_struct()
functions. Because we can no longer rely on RCU, the change also guts
the bpf_task_acquire_not_zero() and bpf_task_kptr_get() functions
pending a resolution on the above problem.

In addition, the task fixes the kfunc and rcu_read_lock selftests to
expect this new behavior.

Fixes: 90660309b0c7 ("bpf: Add kfuncs for storing struct task_struct * as a kptr")
Fixes: fca1aa75518c ("bpf: Handle MEM_RCU type properly")
Reported-by: Matus Jokay <matus.jokay@stuba.sk>
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20221206210538.597606-1-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 1910676c 04-Dec-2022 Alexei Starovoitov <ast@kernel.org>

Merge branch 'bpf: Handle MEM_RCU type properly'

Yonghong Song says:

====================

Patch set [1] added rcu support for bpf programs. In [1], a rcu
pointer is considered to be trusted and no

Merge branch 'bpf: Handle MEM_RCU type properly'

Yonghong Song says:

====================

Patch set [1] added rcu support for bpf programs. In [1], a rcu
pointer is considered to be trusted and not null. This is actually
not true in some cases. The rcu pointer could be null, and for non-null
rcu pointer, it may have reference count of 0. This small patch set
fixed this problem. Patch 1 is the kernel fix. Patch 2 adjusted
selftests properly. Patch 3 added documentation for newly-introduced
KF_RCU flag.

[1] https://lore.kernel.org/all/20221124053201.2372298-1-yhs@fb.com/

Changelogs:
v1 -> v2:
- rcu ptr could be NULL.
- non_null_rcu_ptr->rcu_field can be marked as MEM_RCU as well.
- Adjust the code to avoid existing error message change.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# fca1aa75 03-Dec-2022 Yonghong Song <yhs@fb.com>

bpf: Handle MEM_RCU type properly

Commit 9bb00b2895cb ("bpf: Add kfunc bpf_rcu_read_lock/unlock()")
introduced MEM_RCU and bpf_rcu_read_lock/unlock() support. In that
commit, a rcu pointer is tagged

bpf: Handle MEM_RCU type properly

Commit 9bb00b2895cb ("bpf: Add kfunc bpf_rcu_read_lock/unlock()")
introduced MEM_RCU and bpf_rcu_read_lock/unlock() support. In that
commit, a rcu pointer is tagged with both MEM_RCU and PTR_TRUSTED
so that it can be passed into kfuncs or helpers as an argument.

Martin raised a good question in [1] such that the rcu pointer,
although being able to accessing the object, might have reference
count of 0. This might cause a problem if the rcu pointer is passed
to a kfunc which expects trusted arguments where ref count should
be greater than 0.

This patch makes the following changes related to MEM_RCU pointer:
- MEM_RCU pointer might be NULL (PTR_MAYBE_NULL).
- Introduce KF_RCU so MEM_RCU ptr can be acquired with
a KF_RCU tagged kfunc which assumes ref count of rcu ptr
could be zero.
- For mem access 'b = ptr->a', say 'ptr' is a MEM_RCU ptr, and
'a' is tagged with __rcu as well. Let us mark 'b' as
MEM_RCU | PTR_MAYBE_NULL.

[1] https://lore.kernel.org/bpf/ac70f574-4023-664e-b711-e0d3b18117fd@linux.dev/

Fixes: 9bb00b2895cb ("bpf: Add kfunc bpf_rcu_read_lock/unlock()")
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221203184602.477272-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


Revision tags: v6.0.11
# d6dc62fc 28-Nov-2022 Jakub Kicinski <kuba@kernel.org>

Daniel Borkmann says:

====================
bpf-next 2022-11-25

We've added 101 non-merge commits during the last 11 day(s) which contain
a total of 109 files changed, 8827 insertions(+), 1129 delet

Daniel Borkmann says:

====================
bpf-next 2022-11-25

We've added 101 non-merge commits during the last 11 day(s) which contain
a total of 109 files changed, 8827 insertions(+), 1129 deletions(-).

The main changes are:

1) Support for user defined BPF objects: the use case is to allocate own
objects, build own object hierarchies and use the building blocks to
build own data structures flexibly, for example, linked lists in BPF,
from Kumar Kartikeya Dwivedi.

2) Add bpf_rcu_read_{,un}lock() support for sleepable programs,
from Yonghong Song.

3) Add support storing struct task_struct objects as kptrs in maps,
from David Vernet.

4) Batch of BPF map documentation improvements, from Maryam Tahhan
and Donald Hunter.

5) Improve BPF verifier to propagate nullness information for branches
of register to register comparisons, from Eduard Zingerman.

6) Fix cgroup BPF iter infra to hold reference on the start cgroup,
from Hou Tao.

7) Fix BPF verifier to not mark fentry/fexit program arguments as trusted
given it is not the case for them, from Alexei Starovoitov.

8) Improve BPF verifier's realloc handling to better play along with dynamic
runtime analysis tools like KASAN and friends, from Kees Cook.

9) Remove legacy libbpf mode support from bpftool,
from Sahid Orentino Ferdjaoui.

10) Rework zero-len skb redirection checks to avoid potentially breaking
existing BPF test infra users, from Stanislav Fomichev.

11) Two small refactorings which are independent and have been split out
of the XDP queueing RFC series, from Toke Høiland-Jørgensen.

12) Fix a memory leak in LSM cgroup BPF selftest, from Wang Yufen.

13) Documentation on how to run BPF CI without patch submission,
from Daniel Müller.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================

Link: https://lore.kernel.org/r/20221125012450.441-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


Revision tags: v6.0.10, v5.15.80
# 6099754a 24-Nov-2022 Alexei Starovoitov <ast@kernel.org>

Merge branch 'bpf: Add bpf_rcu_read_lock() support'

Yonghong Song says:

====================

Currently, without rcu attribute info in BTF, the verifier treats
rcu tagged pointer as a normal pointe

Merge branch 'bpf: Add bpf_rcu_read_lock() support'

Yonghong Song says:

====================

Currently, without rcu attribute info in BTF, the verifier treats
rcu tagged pointer as a normal pointer. This might be a problem
for sleepable program where rcu_read_lock()/unlock() is not available.
For example, for a sleepable fentry program, if rcu protected memory
access is interleaved with a sleepable helper/kfunc, it is possible
the memory access after the sleepable helper/kfunc might be invalid
since the object might have been freed then. Even without
a sleepable helper/kfunc, without rcu_read_lock() protection,
it is possible that the rcu protected object might be release
in the middle of bpf program execution which may cause incorrect
result.

To prevent above cases, enable btf_type_tag("rcu") attributes,
introduce new bpf_rcu_read_lock/unlock() kfuncs and add verifier support.

In the rest of patch set, Patch 1 enabled btf_type_tag for __rcu
attribute. Patche 2 added might_sleep in bpf_func_proto. Patch 3 added new
bpf_rcu_read_lock/unlock() kfuncs and verifier support.
Patch 4 added some tests for these two new kfuncs.

Changelogs:
v9 -> v10:
. if no rcu tag support in vmlinux btf, using bpf_rcu_read_lock/unlock()
will cause verification error.
. at bpf_rcu_read_unlock(), invalidate rcu ptr to PTR_UNTRUSTED
instead of SCALAR_VALUE.
. a few other comment changes and other minor changes.
v8 -> v9:
. remove sleepable prog check for ld_abs/ind checking in rcu read
lock region.
. fix a test failure with gcc-compiled kernel.
. a couple of other minor fixes.
v7 -> v8:
. add might_sleep in bpf_func_proto so we can easily identify whether
a helper is sleepable or not.
. do not enforce rcu rules for sleepable, e.g., rcu dereference must
be in a bpf_rcu_read_lock region. This is to keep old code working
fine.
. Mark 'b' in 'b = a->b' (b is tagged with __rcu) as MEM_RCU only if
'b = a->b' in rcu read region and 'a' is trusted. This adds safety
guarantee for 'b' inside the rcu read region.
v6 -> v7:
. rebase on top of bpf-next.
. remove the patch which enables sleepable program using
cgrp_local_storage map. This is orthogonal to this patch set
and will be addressed separately.
. mark the rcu pointer dereference result as UNTRUSTED if inside
a bpf_rcu_read_lock() region.
v5 -> v6:
. fix selftest prog miss_unlock which tested nested locking.
. add comments in selftest prog cgrp_succ to explain how to handle
nested memory access after rcu memory load.
v4 -> v5:
. add new test to aarch64 deny list.
v3 -> v4:
. fix selftest failures when built with gcc. gcc doesn't support
btf_type_tag yet and some tests relies on that. skip these
tests if vmlinux BTF does not have btf_type_tag("rcu").
v2 -> v3:
. went back to MEM_RCU approach with invalidate rcu ptr registers
at bpf_rcu_read_unlock() place.
. remove KF_RCU_LOCK/UNLOCK flag and compare btf_id at verification
time instead.
v1 -> v2:
. use kfunc instead of helper for bpf_rcu_read_lock/unlock.
. not use MEM_RCU bpf_type_flag, instead use active_rcu_lock
in reg state to identify rcu ptr's.
. Add more self tests.
. add new test to s390x deny list.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 9bb00b28 23-Nov-2022 Yonghong Song <yhs@fb.com>

bpf: Add kfunc bpf_rcu_read_lock/unlock()

Add two kfunc's bpf_rcu_read_lock() and bpf_rcu_read_unlock(). These two kfunc's
can be used for all program types. The following is an example about how
rc

bpf: Add kfunc bpf_rcu_read_lock/unlock()

Add two kfunc's bpf_rcu_read_lock() and bpf_rcu_read_unlock(). These two kfunc's
can be used for all program types. The following is an example about how
rcu pointer are used w.r.t. bpf_rcu_read_lock()/bpf_rcu_read_unlock().

struct task_struct {
...
struct task_struct *last_wakee;
struct task_struct __rcu *real_parent;
...
};

Let us say prog does 'task = bpf_get_current_task_btf()' to get a
'task' pointer. The basic rules are:
- 'real_parent = task->real_parent' should be inside bpf_rcu_read_lock
region. This is to simulate rcu_dereference() operation. The
'real_parent' is marked as MEM_RCU only if (1). task->real_parent is
inside bpf_rcu_read_lock region, and (2). task is a trusted ptr. So
MEM_RCU marked ptr can be 'trusted' inside the bpf_rcu_read_lock region.
- 'last_wakee = real_parent->last_wakee' should be inside bpf_rcu_read_lock
region since it tries to access rcu protected memory.
- the ptr 'last_wakee' will be marked as PTR_UNTRUSTED since in general
it is not clear whether the object pointed by 'last_wakee' is valid or
not even inside bpf_rcu_read_lock region.

The verifier will reset all rcu pointer register states to untrusted
at bpf_rcu_read_unlock() kfunc call site, so any such rcu pointer
won't be trusted any more outside the bpf_rcu_read_lock() region.

The current implementation does not support nested rcu read lock
region in the prog.

Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221124053217.2373910-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 01685c5b 23-Nov-2022 Yonghong Song <yhs@fb.com>

bpf: Introduce might_sleep field in bpf_func_proto

Introduce bpf_func_proto->might_sleep to indicate a particular helper
might sleep. This will make later check whether a helper might be
sleepable o

bpf: Introduce might_sleep field in bpf_func_proto

Introduce bpf_func_proto->might_sleep to indicate a particular helper
might sleep. This will make later check whether a helper might be
sleepable or not easier.

Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221124053211.2373553-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 3f0e6f2b 22-Nov-2022 David Vernet <void@manifault.com>

bpf: Add bpf_task_from_pid() kfunc

Callers can currently store tasks as kptrs using bpf_task_acquire(),
bpf_task_kptr_get(), and bpf_task_release(). These are useful if a
caller already has a struct

bpf: Add bpf_task_from_pid() kfunc

Callers can currently store tasks as kptrs using bpf_task_acquire(),
bpf_task_kptr_get(), and bpf_task_release(). These are useful if a
caller already has a struct task_struct *, but there may be some callers
who only have a pid, and want to look up the associated struct
task_struct * from that to e.g. find task->comm.

This patch therefore adds a new bpf_task_from_pid() kfunc which allows
BPF programs to get a struct task_struct * kptr from a pid.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20221122145300.251210-2-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 2fcc6081 23-Nov-2022 David Vernet <void@manifault.com>

bpf: Don't use idx variable when registering kfunc dtors

In commit fda01efc6160 ("bpf: Enable cgroups to be used as kptrs"), I
added an 'int idx' variable to kfunc_init() which was meant to
dynamica

bpf: Don't use idx variable when registering kfunc dtors

In commit fda01efc6160 ("bpf: Enable cgroups to be used as kptrs"), I
added an 'int idx' variable to kfunc_init() which was meant to
dynamically set the index of the btf id entries of the
'generic_dtor_ids' array. This was done to make the code slightly less
brittle as the struct cgroup * kptr kfuncs such as bpf_cgroup_aquire()
are compiled out if CONFIG_CGROUPS is not defined. This, however, causes
an lkp build warning:

>> kernel/bpf/helpers.c:2005:40: warning: multiple unsequenced
modifications to 'idx' [-Wunsequenced]
.btf_id = generic_dtor_ids[idx++],

Fix the warning by just hard-coding the indices.

Fixes: fda01efc6160 ("bpf: Enable cgroups to be used as kptrs")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221123135253.637525-1-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 8a2162a9 22-Nov-2022 Alexei Starovoitov <ast@kernel.org>

Merge branch 'Support storing struct cgroup * objects as kptrs'

David Vernet says:

====================

In [0], we added support for storing struct task_struct * objects as
kptrs. This patch set e

Merge branch 'Support storing struct cgroup * objects as kptrs'

David Vernet says:

====================

In [0], we added support for storing struct task_struct * objects as
kptrs. This patch set extends this effort to also include storing struct
cgroup * object as kptrs.

As with tasks, there are many possible use cases for storing cgroups in
maps. During tracing, for example, it could be useful to query cgroup
statistics such as PSI averages, or tracking which tasks are migrating
to and from the cgroup.

[0]: https://lore.kernel.org/all/20221120051004.3605026-1-void@manifault.com/
====================

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 5ca78670 21-Nov-2022 David Vernet <void@manifault.com>

bpf: Add bpf_cgroup_ancestor() kfunc

struct cgroup * objects have a variably sized struct cgroup *ancestors[]
field which stores pointers to their ancestor cgroups. If using a cgroup
as a kptr, it c

bpf: Add bpf_cgroup_ancestor() kfunc

struct cgroup * objects have a variably sized struct cgroup *ancestors[]
field which stores pointers to their ancestor cgroups. If using a cgroup
as a kptr, it can be useful to access these ancestors, but doing so
requires variable offset accesses for PTR_TO_BTF_ID, which is currently
unsupported.

This is a very useful field to access for cgroup kptrs, as programs may
wish to walk their ancestor cgroups when determining e.g. their
proportional cpu.weight. So as to enable this functionality with cgroup
kptrs before var_off is supported for PTR_TO_BTF_ID, this patch adds a
bpf_cgroup_ancestor() kfunc which accesses the cgroup node on behalf of
the caller, and acquires a reference on it. Once var_off is supported
for PTR_TO_BTF_ID, and fields inside a struct can be marked as trusted
so they retain the PTR_TRUSTED modifier when walked, this can be
removed.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20221122055458.173143-4-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# fda01efc 21-Nov-2022 David Vernet <void@manifault.com>

bpf: Enable cgroups to be used as kptrs

Now that tasks can be used as kfuncs, and the PTR_TRUSTED flag is
available for us to easily add basic acquire / get / release kfuncs, we
can do the same for

bpf: Enable cgroups to be used as kptrs

Now that tasks can be used as kfuncs, and the PTR_TRUSTED flag is
available for us to easily add basic acquire / get / release kfuncs, we
can do the same for cgroups. This patch set adds the following kfuncs
which enable using cgroups as kptrs:

struct cgroup *bpf_cgroup_acquire(struct cgroup *cgrp);
struct cgroup *bpf_cgroup_kptr_get(struct cgroup **cgrpp);
void bpf_cgroup_release(struct cgroup *cgrp);

A follow-on patch will add a selftest suite which validates these
kfuncs.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20221122055458.173143-2-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 29583dfc 21-Nov-2022 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-next into drm-misc-next-fixes

Backmerging to update drm-misc-next-fixes for the final phase
of the release cycle.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


# 99429b22 20-Nov-2022 Alexei Starovoitov <ast@kernel.org>

Merge branch 'bpf: Implement two type cast kfuncs'

Yonghong Song says:

====================

Currenty, a non-tracing bpf program typically has a single 'context' argument
with predefined uapi struc

Merge branch 'bpf: Implement two type cast kfuncs'

Yonghong Song says:

====================

Currenty, a non-tracing bpf program typically has a single 'context' argument
with predefined uapi struct type. Following these uapi struct, user is able
to access other fields defined in uapi header. Inside the kernel, the
user-seen 'context' argument is replaced with 'kernel context' (or 'kctx'
in short) which can access more information than what uapi header provides.
To access other info not in uapi header, people typically do two things:
(1). extend uapi to access more fields rooted from 'context'.
(2). use bpf_probe_read_kernl() helper to read particular field based on
kctx.
Using (1) needs uapi change and using (2) makes code more complex since
direct memory access is not allowed.

There are already a few instances trying to access more information from
kctx:
. trying to access some fields from perf_event kctx ([1]).
. trying to access some fields from xdp kctx ([2]).

This patch set tried to allow direct memory access for kctx fields
by introducing bpf_cast_to_kern_ctx() kfunc.

Martin mentioned a use case like type casting below:
#define skb_shinfo(SKB) ((struct skb_shared_info *)(skb_end_pointer(SKB)))
basically a 'unsigned char *" casted to 'struct skb_shared_info *'. This patch
set tries to support such a use case as well with bpf_rdonly_cast().

For the patch series, Patch 1 added support for a kfunc available to all
prog types. Patch 2 added bpf_cast_to_kern_ctx() kfunc. Patch 3 added
bpf_rdonly_cast() kfunc. Patch 4 added a few positive and negative tests.

[1] https://lore.kernel.org/bpf/ad15b398-9069-4a0e-48cb-4bb651ec3088@meta.com/
[2] https://lore.kernel.org/bpf/20221109215242.1279993-1-john.fastabend@gmail.com/

Changelog:
v3 -> v4:
- remove unnecessary bpf_ctx_convert.t error checking
- add and use meta.ret_btf_id instead of meta.arg_constant.value for
bpf_cast_to_kern_ctx().
- add PTR_TRUSTED to the return PTR_TO_BTF_ID type for bpf_cast_to_kern_ctx().
v2 -> v3:
- rebase on top of bpf-next (for merging conflicts)
- add the selftest to s390x deny list
rfcv1 -> v2:
- break original one kfunc into two.
- add missing error checks and error logs.
- adapt to the new conventions in
https://lore.kernel.org/all/20221118015614.2013203-1-memxor@gmail.com/
for example, with __ign and __k suffix.
- added support in fixup_kfunc_call() to replace kfunc calls with a single mov.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


12345678910>>...30