==============
BPF Design Q&A
==============

BPF's extensibility and applicability to networking, tracing, and security
in the Linux kernel, together with several user space implementations of the
BPF virtual machine, have led to a number of misunderstandings about what BPF
actually is. This short Q&A is an attempt to address that and to outline
where BPF is heading in the long term.

.. contents::
    :local:
    :depth: 3

Questions and Answers
=====================

Q: Is BPF a generic instruction set similar to x64 and arm64?
-------------------------------------------------------------
A: NO.

Q: Is BPF a generic virtual machine?
-------------------------------------
A: NO.

BPF is a generic instruction set *with* C calling convention.
-------------------------------------------------------------

Q: Why C calling convention was chosen?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A: Because BPF programs are designed to run in the Linux kernel,
which is written in C. Hence BPF defines an instruction set compatible
with the two most used architectures, x64 and arm64 (and takes into
consideration important quirks of other architectures), and
defines a calling convention that is compatible with the C calling
convention of the Linux kernel on those architectures.

Q: Can multiple return values be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. BPF allows only register R0 to be used as a return value.

Q: Can more than 5 function arguments be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. The BPF calling convention only allows registers R1-R5 to be used
as arguments. BPF is not a standalone instruction set
(unlike the x64 ISA, which allows msft, cdecl and other conventions).
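
A common way to live within the five-argument limit is to bundle related
values into a structure and pass a single pointer. The following plain C
sketch illustrates the idea; ``struct flow_args`` and ``flow_hash()`` are
hypothetical names used only for illustration, not kernel or BPF APIs.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical argument bundle; field names are illustrative only. */
    struct flow_args {
        uint32_t saddr;
        uint32_t daddr;
        uint16_t sport;
        uint16_t dport;
        uint8_t  proto;
        uint8_t  flags;
    };

    /* A single pointer argument (R1 in the BPF calling convention)
     * carries six logical values. */
    static uint32_t flow_hash(const struct flow_args *a)
    {
        uint32_t h = a->saddr ^ a->daddr;

        h ^= ((uint32_t)a->sport << 16) | a->dport;
        h ^= ((uint32_t)a->proto << 8) | a->flags;
        return h;
    }

A caller fills the structure on its own stack and passes its address, so
only one argument register is consumed however many fields are added.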

Q: Can BPF programs access instruction pointer or return address?
-----------------------------------------------------------------
A: NO.

Q: Can BPF programs access stack pointer?
------------------------------------------
A: NO.

Only the frame pointer (register R10) is accessible.
From the compiler's point of view it is necessary to have a stack pointer.
For example, LLVM defines register R11 as the stack pointer in its
BPF backend, but it makes sure that generated code never uses it.

Q: Does C-calling convention diminish possible use cases?
-----------------------------------------------------------
A: YES.

BPF design forces the addition of major functionality in the form
of kernel helper functions and kernel objects like BPF maps with
seamless interoperability between them. It lets the kernel call into
BPF programs and programs call kernel helpers with zero overhead,
as if all of them were native C code. That is particularly the case
for JITed BPF programs, which are indistinguishable from
native kernel C code.

Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
------------------------------------------------------------------------
A: Soft yes.

At least for now, until the BPF core has support for
bpf-to-bpf calls, indirect calls, loops, global variables,
jump tables, read-only sections, and all other normal constructs
that C code can produce.

Q: Can loops be supported in a safe way?
----------------------------------------
A: It's not clear yet.

BPF developers are trying to find a way to
support bounded loops.

Q: What are the verifier limits?
--------------------------------
A: The only limit known to user space is BPF_MAXINSNS (4096).
It is the maximum number of instructions that an unprivileged bpf
program can have. The verifier also has various internal limits,
such as the maximum number of instructions that can be explored during
program analysis. Currently, that limit is set to 1 million,
which essentially means that the largest program can consist
of 1 million NOP instructions. There is a limit to the maximum number
of subsequent branches, a limit to the number of nested bpf-to-bpf
calls, a limit to the number of verifier states per instruction, and
a limit to the number of maps used by the program.
All these limits can be hit with a sufficiently complex program.
There are also non-numerical limits that can cause the program
to be rejected. The verifier used to recognize only pointer + constant
expressions. Now it can recognize pointer + bounded_register.
bpf_map_lookup_elem(key) had a requirement that 'key' must be
a pointer to the stack. Now, 'key' can be a pointer to a map value.
The verifier is steadily getting 'smarter'. The limits are
being removed. The only way to know that the program is going to
be accepted by the verifier is to try to load it.
The bpf development process guarantees that future kernel
versions will accept all bpf programs that were accepted by
earlier versions.
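
The difference between the two pointer expressions mentioned above can be
sketched in plain C. This is a user-space illustration under simplifying
assumptions, not verifier code; ``table``, ``read_const()`` and
``read_bounded()`` are hypothetical names.

.. code-block:: c

    #include <stdint.h>

    #define TABLE_SIZE 16   /* power of two, so (TABLE_SIZE - 1) is a mask */

    /* Hypothetical lookup table; name and contents are illustrative. */
    static const uint8_t table[TABLE_SIZE] = {
        0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
    };

    /* pointer + constant: the offset is known at compile time. */
    static uint8_t read_const(void)
    {
        return *(table + 3);
    }

    /* pointer + bounded_register: idx is a run-time value, but the mask
     * limits it to [0, TABLE_SIZE - 1] before the pointer arithmetic. */
    static uint8_t read_bounded(uint32_t idx)
    {
        idx &= TABLE_SIZE - 1;
        return *(table + idx);
    }

Without the mask the range of ``idx`` would be unknown, and an access of
that form would typically be rejected when the program is loaded.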

Instruction level questions
---------------------------

Q: LD_ABS and LD_IND instructions vs C code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Q: How come LD_ABS and LD_IND instructions are present in BPF whereas
C code cannot express them and has to use builtin intrinsics?

A: This is an artifact of compatibility with classic BPF. Modern
networking code in BPF performs better without them.
See 'direct packet access'.
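
As a rough user-space model of direct packet access (a sketch under stated
assumptions, not kernel code): the program is given pointers to the start
and end of the packet data and must show that an access stays within those
bounds before reading. ``struct pkt_ctx`` and ``read_ethertype()`` are
hypothetical names.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-in for a context that exposes the start and
     * end of the linear packet data. */
    struct pkt_ctx {
        const uint8_t *data;
        const uint8_t *data_end;
    };

    /* Read the 16-bit big-endian field at byte offset 12 (the EtherType
     * position in an Ethernet header). Returns -1 when the packet is
     * shorter than 14 bytes, so no byte at or past data_end is read. */
    static int read_ethertype(const struct pkt_ctx *ctx)
    {
        size_t len = (size_t)(ctx->data_end - ctx->data);

        if (len < 14)
            return -1;
        return (ctx->data[12] << 8) | ctx->data[13];
    }

The bounds check is ordinary C, which is why this style needs no special
instruction or compiler intrinsic.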

Q: BPF instructions mapping not one-to-one to native CPU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: It seems not all BPF instructions are one-to-one to native CPU.
For example, why are BPF_JNE and other compare-and-jump instructions
not CPU-like?

A: This was necessary to avoid introducing flags into the ISA, which are
impossible to make generic and efficient across CPU architectures.

Q: Why BPF_DIV instruction doesn't map to x64 div?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because if we picked a one-to-one relationship to x64 it would have made
it more complicated to support on arm64 and other archs. Also, it
needs a div-by-zero runtime check.
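
The effect of such a runtime check can be modelled in plain C. The sketch
below assumes the semantics given in the BPF instruction set documentation
of recent kernels, where unsigned division by zero yields zero and modulo
by zero leaves the destination unchanged; ``model_div64()`` and
``model_mod64()`` are hypothetical helper names, not kernel functions.

.. code-block:: c

    #include <stdint.h>

    /* Model of 64-bit unsigned BPF_DIV: a zero divisor yields 0 instead
     * of faulting the way a native x64 div instruction would. */
    static uint64_t model_div64(uint64_t dst, uint64_t src)
    {
        return src != 0 ? dst / src : 0;
    }

    /* Model of 64-bit unsigned BPF_MOD: a zero divisor leaves dst as is. */
    static uint64_t model_mod64(uint64_t dst, uint64_t src)
    {
        return src != 0 ? dst % src : dst;
    }

A JIT therefore cannot emit a bare native divide; it has to pair it with
the zero test shown in the conditional expressions.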

Q: Why BPF has implicit prologue and epilogue?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because architectures like sparc have register windows and in general
there are enough subtle differences between architectures that naively
storing the return address on the stack won't work. Another reason is that
BPF has to be safe from division by zero (and the legacy exception path
of the LD_ABS insn). Those instructions need to invoke the epilogue and
return implicitly.

Q: Why BPF_JLT and BPF_JLE instructions were not introduced in the beginning?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because classic BPF didn't have them and the BPF authors felt that a
compiler workaround would be acceptable. It turned out that programs lose
performance due to the lack of these compare instructions, so they were added.
These two instructions are a perfect example of what kind of new BPF
instructions are acceptable and can be added in the future.
These two already had equivalent instructions in native CPUs.
New instructions that don't have a one-to-one mapping to HW instructions
will not be accepted.

Q: BPF 32-bit subregister requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: BPF 32-bit subregisters have a requirement to zero the upper 32 bits of BPF
registers, which makes BPF an inefficient virtual machine for 32-bit
CPU architectures and 32-bit HW accelerators. Can true 32-bit registers
be added to BPF in the future?

A: NO.

But some optimizations on zero-ing the upper 32 bits for BPF registers are
available, and can be leveraged to improve the performance of JITed BPF
programs for 32-bit architectures.

Starting with version 7, LLVM is able to generate instructions that operate
on 32-bit subregisters, provided the option -mattr=+alu32 is passed for
compiling a program. Furthermore, the verifier can now mark the
instructions for which zero-ing the upper bits of the destination register
is required, and insert an explicit zero-extension (zext) instruction
(a mov32 variant). This means that for architectures without zext hardware
support, the JIT back-ends do not need to clear the upper bits for
subregisters written by alu32 instructions or narrow loads. Instead, the
back-ends simply need to support code generation for that mov32 variant,
and to override bpf_jit_needs_zext() to make it return "true" (in order to
enable zext insertion in the verifier).

Note that it is possible for a JIT back-end to have partial hardware
support for zext. In that case, if verifier zext insertion is enabled,
it could lead to the insertion of unnecessary zext instructions. Such
instructions could be removed by creating a simple peephole inside the JIT
back-end: if one instruction has hardware support for zext and if the next
instruction is an explicit zext, then the latter can be skipped when doing
the code generation.
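
The peephole described above can be sketched as a simple pass over the
instruction stream. This is a simplified model, not code from any JIT:
``struct sketch_insn`` and ``count_emitted()`` are hypothetical, and a real
back-end would also have to confirm that the explicit zext targets the
register written by the preceding instruction.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical, simplified instruction record for a JIT pass. */
    struct sketch_insn {
        bool hw_zext;   /* native code for this insn already clears the upper 32 bits */
        bool is_zext;   /* verifier-inserted explicit zext (mov32 variant) */
    };

    /* Count the instructions the back-end would emit, skipping an explicit
     * zext that directly follows an insn with hardware zext support. */
    static size_t count_emitted(const struct sketch_insn *insns, size_t n)
    {
        size_t emitted = 0;

        for (size_t i = 0; i < n; i++) {
            if (insns[i].is_zext && i > 0 && insns[i - 1].hw_zext)
                continue;   /* redundant: upper bits are already zero */
            emitted++;
        }
        return emitted;
    }

An explicit zext that follows an instruction without hardware support is
still emitted, which keeps the upper 32 bits cleared in every case.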

Q: Does BPF have a stable ABI?
------------------------------
A: YES. BPF instructions, arguments to BPF programs, the set of helper
functions and their arguments, and recognized return codes are all part
of the ABI. However, there is one specific exception for tracing programs
which use helpers like bpf_probe_read() to walk kernel internal
data structures and are compiled with kernel internal headers. Both of these
kernel internals are subject to change and can break with newer kernels
such that the program needs to be adapted accordingly.

New BPF functionality is generally added through the use of kfuncs instead of
new helpers. Kfuncs are not considered part of the stable API, and have their own
lifecycle expectations as described in :ref:`BPF_kfunc_lifecycle_expectations`.

Q: Are tracepoints part of the stable ABI?
------------------------------------------
A: NO. Tracepoints are tied to internal implementation details, hence they are
subject to change and can break with newer kernels. BPF programs need to change
accordingly when this happens.

Q: Are places where kprobes can attach part of the stable ABI?
--------------------------------------------------------------
A: NO. The places to which kprobes can attach are internal implementation
details, which means that they are subject to change and can break with
newer kernels. BPF programs need to change accordingly when this happens.

Q: How much stack space a BPF program uses?
-------------------------------------------
A: Currently all program types are limited to 512 bytes of stack
space, but the verifier computes the actual amount of stack used
and both the interpreter and most JITed code consume only the necessary amount.

Q: Can BPF be offloaded to HW?
------------------------------
A: YES. BPF HW offload is supported by the NFP driver.

Q: Does classic BPF interpreter still exist?
--------------------------------------------
A: NO. Classic BPF programs are converted into extended BPF instructions.

Q: Can BPF call arbitrary kernel functions?
-------------------------------------------
A: NO. BPF programs can only call specific functions exposed as BPF helpers or
kfuncs. The set of available functions is defined for every program type.

Q: Can BPF overwrite arbitrary kernel memory?
---------------------------------------------
A: NO.

Tracing bpf programs can *read* arbitrary memory with the bpf_probe_read()
and bpf_probe_read_str() helpers. Networking programs cannot read
arbitrary memory, since they don't have access to these helpers.
Programs can never read or write arbitrary memory directly.

Q: Can BPF overwrite arbitrary user memory?
-------------------------------------------
A: Sort-of.

Tracing BPF programs can overwrite the user memory
of the current task with bpf_probe_write_user(). Every time such a
program is loaded the kernel will print a warning message, so
this helper is only useful for experiments and prototypes.
Tracing BPF programs are root only.

Q: New functionality via kernel modules?
----------------------------------------
Q: Can BPF functionality such as new program or map types, new
helpers, etc. be added out of kernel module code?

A: Yes, through kfuncs and kptrs.

The core BPF functionality such as program types, maps and helpers cannot be
added to by modules. However, modules can expose functionality to BPF programs
by exporting kfuncs (which may return pointers to module-internal data
structures as kptrs).

Q: Directly calling kernel function is an ABI?
----------------------------------------------
Q: Some kernel functions (e.g. tcp_slow_start) can be called
by BPF programs.  Do these kernel functions become an ABI?

A: NO.

The kernel function protos will change and the bpf programs will be
rejected by the verifier.  Also, for example, some of the bpf-callable
kernel functions have already been used by other kernel tcp
cc (congestion-control) implementations.  If any of these kernel
functions changes, both the in-tree and out-of-tree kernel tcp cc
implementations have to be changed.  The same goes for the bpf
programs, which have to be adjusted accordingly. See
:ref:`BPF_kfunc_lifecycle_expectations` for details.

Q: Attaching to arbitrary kernel functions is an ABI?
-----------------------------------------------------
Q: BPF programs can be attached to many kernel functions.  Do these
kernel functions become part of the ABI?

A: NO.

The kernel function prototypes will change, and BPF programs attaching to
them will need to change.  The BPF compile-once-run-everywhere (CO-RE)
approach should be used in order to make it easier to adapt your BPF
programs to different versions of the kernel.

Q: Marking a function with BTF_ID makes that function an ABI?
-------------------------------------------------------------
A: NO.

The BTF_ID macro does not cause a function to become part of the ABI
any more than does the EXPORT_SYMBOL_GPL macro.

Q: What is the compatibility story for special BPF types in map values?
-----------------------------------------------------------------------
Q: Users are allowed to embed bpf_spin_lock, bpf_timer fields in their BPF map
values (when using BTF support for BPF maps). This allows the use of helpers for
such objects on these fields inside map values. Users are also allowed to embed
pointers to some kernel types (with __kptr_untrusted and __kptr BTF tags). Will the
kernel preserve backwards compatibility for these features?

A: It depends. For bpf_spin_lock, bpf_timer: YES, for kptr and everything else:
NO, but see below.

For struct types that have been added already, like bpf_spin_lock and bpf_timer,
the kernel will preserve backwards compatibility, as they are part of UAPI.

For kptrs, they are also part of UAPI, but only with respect to the kptr
mechanism. The types that you can use with a __kptr_untrusted and __kptr tagged
pointer in your struct are NOT part of the UAPI contract. The supported types can
and will change across kernel releases. However, operations like accessing kptr
fields and the bpf_kptr_xchg() helper will continue to be supported across kernel
releases for the supported types.

For any other supported struct type, unless explicitly stated in this document
and added to the bpf.h UAPI header, such types can and will arbitrarily change their
size, type, and alignment, or any other user visible API or ABI detail across
kernel releases. The users must adapt their BPF programs to the new changes and
update them to make sure their programs continue to work correctly.

NOTE: BPF subsystem specially reserves the 'bpf\_' prefix for type names, in
order to introduce more special fields in the future. Hence, user programs must
avoid defining types with 'bpf\_' prefix to not be broken in future releases.
In other words, no backwards compatibility is guaranteed if one uses a type
in BTF with 'bpf\_' prefix.

Q: What is the compatibility story for special BPF types in allocated objects?
------------------------------------------------------------------------------
Q: Same as above, but for allocated objects (i.e. objects allocated using
bpf_obj_new for user defined types). Will the kernel preserve backwards
compatibility for these features?

A: NO.

Unlike map value types, the API to work with allocated objects and any support
for special fields inside them is exposed through kfuncs, and thus has the same
lifecycle expectations as the kfuncs themselves. See
:ref:`BPF_kfunc_lifecycle_expectations` for details.