==============
BPF Design Q&A
==============

BPF's extensibility and applicability to networking, tracing, and security
in the Linux kernel, together with several user space implementations of the
BPF virtual machine, have led to a number of misunderstandings about what BPF
actually is. This short Q&A is an attempt to address that and to outline the
direction in which BPF is heading long term.

.. contents::
    :local:
    :depth: 3

Questions and Answers
=====================

Q: Is BPF a generic instruction set similar to x64 and arm64?
-------------------------------------------------------------
A: NO.

Q: Is BPF a generic virtual machine?
-------------------------------------
A: NO.

BPF is generic instruction set *with* C calling convention.
-----------------------------------------------------------

Q: Why C calling convention was chosen?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A: Because BPF programs are designed to run in the Linux kernel,
which is written in C. Hence BPF defines an instruction set compatible
with the two most used architectures, x64 and arm64 (and takes into
consideration important quirks of other architectures), and
defines a calling convention that is compatible with the C calling
convention of the Linux kernel on those architectures.

Q: Can multiple return values be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. BPF allows only register R0 to be used as the return value.

Q: Can more than 5 function arguments be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. The BPF calling convention only allows registers R1-R5 to be used
as arguments. BPF is not a standalone instruction set
(unlike the x64 ISA, which allows msft, cdecl and other conventions).
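
When a function needs more than five inputs, one common workaround is to pack
the values into a struct and pass a single pointer, which occupies only one
argument register. This workaround is not described in this document, so treat
it as an assumption. The sketch below is plain user-space C, and
``struct flow_args`` and ``score_flow()`` are hypothetical names:

```c
#include <stdint.h>

/* Hypothetical bundle of inputs. One pointer to it fits in a single
 * argument register, however many fields it carries. */
struct flow_args {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
	uint8_t proto;
	uint32_t mark;
};

/* Two arguments (think R1 = args, R2 = weight), one return value (R0). */
uint64_t score_flow(const struct flow_args *args, uint64_t weight)
{
	uint64_t sum = (uint64_t)args->saddr + args->daddr + args->sport +
		       args->dport + args->proto + args->mark;

	return sum * weight;
}
```

Six values reach the callee here, yet only two argument slots are used.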

Q: Can BPF programs access instruction pointer or return address?
-----------------------------------------------------------------
A: NO.

Q: Can BPF programs access stack pointer?
------------------------------------------
A: NO.

Only the frame pointer (register R10) is accessible.
From the compiler's point of view, it is necessary to have a stack pointer.
For example, LLVM defines register R11 as the stack pointer in its
BPF backend, but it makes sure that generated code never uses it.

Q: Does C-calling convention diminish possible use cases?
-----------------------------------------------------------
A: YES.

The BPF design forces the addition of major functionality in the form
of kernel helper functions and kernel objects like BPF maps, with
seamless interoperability between them. It lets the kernel call into
BPF programs and programs call kernel helpers with zero overhead,
as if all of them were native C code. That is particularly the case
for JITed BPF programs, which are indistinguishable from
native kernel C code.

Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
------------------------------------------------------------------------
A: Soft yes.

At least for now, until the BPF core has support for
bpf-to-bpf calls, indirect calls, loops, global variables,
jump tables, read-only sections, and all other normal constructs
that C code can produce.

Q: Can loops be supported in a safe way?
----------------------------------------
A: It's not clear yet.

BPF developers are trying to find a way to
support bounded loops.

Q: What are the verifier limits?
--------------------------------
A: The only limit known to the user space is BPF_MAXINSNS (4096).
It's the maximum number of instructions that an unprivileged BPF
program can have. The verifier has various internal limits,
such as the maximum number of instructions that can be explored during
program analysis. Currently, that limit is set to 1 million,
which essentially means that the largest program can consist
of 1 million NOP instructions. There are also limits on the maximum number
of subsequent branches, the number of nested bpf-to-bpf
calls, the number of verifier states per instruction, and
the number of maps used by the program.
All these limits can be hit with a sufficiently complex program.
There are also non-numerical limits that can cause the program
to be rejected. The verifier used to recognize only pointer + constant
expressions. Now it can recognize pointer + bounded_register.
bpf_map_lookup_elem(key) had a requirement that 'key' must be
a pointer to the stack. Now, 'key' can be a pointer to a map value.
The verifier is steadily getting 'smarter'. The limits are
being removed. The only way to know that the program is going to
be accepted by the verifier is to try to load it.
The BPF development process guarantees that future kernel
versions will accept all BPF programs that were accepted by
earlier versions.
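
The 'pointer + bounded_register' case mentioned above typically shows up as an
index that is range-checked before it is used. The sketch below is plain
user-space C, not verifier code, and ``TABLE_LEN``, ``table`` and ``lookup()``
are hypothetical names chosen for illustration. It only shows the shape of the
check that gives the index a known upper bound:

```c
#include <stdint.h>

#define TABLE_LEN 16u

static const uint32_t table[TABLE_LEN] = {
	0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225
};

/* Returns table[idx], or 0 when idx is out of range. */
uint32_t lookup(uint32_t idx)
{
	if (idx >= TABLE_LEN)
		return 0;
	/* From here on idx is known to lie in [0, TABLE_LEN - 1], so
	 * table + idx is a pointer plus a bounded value. */
	return table[idx];
}
```

Without the range check, the access would be a pointer plus an unbounded value,
which is the kind of expression a bounds-tracking analysis has to reject.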

Instruction level questions
---------------------------

Q: LD_ABS and LD_IND instructions vs C code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Q: How come LD_ABS and LD_IND instructions are present in BPF whereas
C code cannot express them and has to use builtin intrinsics?

A: This is an artifact of compatibility with classic BPF. Modern
networking code in BPF performs better without them.
See 'direct packet access'.

Q: BPF instructions mapping not one-to-one to native CPU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: It seems not all BPF instructions are one-to-one to native CPU.
For example, why are BPF_JNE and other compare-and-jump instructions
not CPU-like?

A: This was necessary to avoid introducing flags into the ISA, which are
impossible to make generic and efficient across CPU architectures.
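
To picture what a flag-free compare-and-jump means, here is a toy interpreter
step in user-space C. It is only a sketch: ``struct toy_jne`` and
``toy_exec_jne()`` are hypothetical and do not reflect the real BPF
instruction encoding.

```c
#include <stdint.h>

/* Hypothetical, simplified instruction:
 * "if (regs[dst] != regs[src]) skip off instructions". */
struct toy_jne {
	uint8_t dst;
	uint8_t src;
	int16_t off;
};

/* Returns the next program counter. The comparison and the branch form
 * one step; no condition flags are stored anywhere between them. */
uint32_t toy_exec_jne(const uint64_t *regs, struct toy_jne insn, uint32_t pc)
{
	if (regs[insn.dst] != regs[insn.src])
		return pc + 1 + insn.off;
	return pc + 1;
}
```

Because the outcome of the comparison never lives in a flags register, a JIT is
free to use whatever compare and branch sequence suits the target CPU.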

Q: Why BPF_DIV instruction doesn't map to x64 div?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because if we had picked a one-to-one relationship to x64, it would have
made it more complicated to support on arm64 and other archs. Also, it
needs a div-by-zero runtime check.
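
The runtime check can be pictured as follows. This is a user-space sketch, not
the kernel's implementation, and ``bpf_div64_sketch()`` is a hypothetical
name. Returning zero for a zero divisor is an assumption made for the sketch;
the point is only that the result is defined, whereas a native x64 div would
trap.

```c
#include <stdint.h>

/* Unsigned 64-bit divide with an explicit zero-divisor check. */
uint64_t bpf_div64_sketch(uint64_t dst, uint64_t src)
{
	if (src == 0)
		return 0; /* assumption: defined result instead of a CPU trap */
	return dst / src;
}
```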

Q: Why there is no BPF_SDIV for signed divide operation?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because it would be rarely used. LLVM errors out in such a case and
prints a suggestion to use unsigned divide instead.

Q: Why BPF has implicit prologue and epilogue?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because architectures like sparc have register windows, and in general
there are enough subtle differences between architectures that naively
storing the return address on the stack won't work. Another reason is that
BPF has to be safe from division by zero (and the legacy exception path
of the LD_ABS insn). Those instructions need to invoke the epilogue and
return implicitly.

Q: Why BPF_JLT and BPF_JLE instructions were not introduced in the beginning?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because classic BPF didn't have them, and the BPF authors felt that a
compiler workaround would be acceptable. It turned out that programs lose
performance due to the lack of these compare instructions, so they were added.
These two instructions are a perfect example of what kind of new BPF
instructions are acceptable and can be added in the future.
These two already had equivalent instructions in native CPUs.
New instructions that don't have a one-to-one mapping to HW instructions
will not be accepted.
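
One plausible reading of the compiler workaround mentioned above is operand
swapping: without a 'less than' jump, ``a < b`` has to be expressed as
``b > a``. The sketch below is plain C with hypothetical function names, and
it only demonstrates that the two forms decide the same branch:

```c
#include <stdbool.h>
#include <stdint.h>

/* What a direct "jump if less than" (BPF_JLT) tests. */
bool taken_jlt(uint64_t a, uint64_t b)
{
	return a < b;
}

/* The same test using only "jump if greater than" (BPF_JGT),
 * with the operands swapped. */
bool taken_jgt_swapped(uint64_t a, uint64_t b)
{
	return b > a;
}
```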

Q: BPF 32-bit subregister requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: BPF 32-bit subregisters have a requirement to zero the upper 32 bits of BPF
registers, which makes BPF an inefficient virtual machine for 32-bit
CPU architectures and 32-bit HW accelerators. Can true 32-bit registers
be added to BPF in the future?

A: NO.

But some optimizations on zeroing the upper 32 bits of BPF registers are
available, and can be leveraged to improve the performance of JITed BPF
programs for 32-bit architectures.

Starting with version 7, LLVM is able to generate instructions that operate
on 32-bit subregisters, provided the option -mattr=+alu32 is passed for
compiling a program. Furthermore, the verifier can now mark the
instructions for which zeroing the upper bits of the destination register
is required, and insert an explicit zero-extension (zext) instruction
(a mov32 variant). This means that for architectures without zext hardware
support, the JIT back-ends do not need to clear the upper bits for
subregisters written by alu32 instructions or narrow loads. Instead, the
back-ends simply need to support code generation for that mov32 variant,
and to override bpf_jit_needs_zext() to make it return "true" (in order to
enable zext insertion in the verifier).
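
The zero-extension requirement discussed above can be modelled in a few lines
of user-space C. This is an illustrative sketch only; ``alu32_add()`` and
``zext32()`` are hypothetical names, not kernel functions.

```c
#include <stdint.h>

/* Models a 32-bit ALU add (alu32) on a 64-bit BPF register: the low
 * 32 bits receive the wrapped sum and the upper 32 bits become zero. */
uint64_t alu32_add(uint64_t dst, uint64_t src)
{
	uint32_t lo = (uint32_t)dst + (uint32_t)src;

	return (uint64_t)lo; /* zero-extension into the full register */
}

/* Models the explicit zext (mov32 variant): keep the low 32 bits,
 * clear the upper 32 bits. */
uint64_t zext32(uint64_t reg)
{
	return (uint64_t)(uint32_t)reg;
}
```

On a 32-bit target the upper half lives in a separate native register, so every
such clearing costs an extra instruction. That is the cost the verifier-driven
zext insertion tries to avoid where the upper bits are never read.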

Note that it is possible for a JIT back-end to have partial hardware
support for zext. In that case, if verifier zext insertion is enabled,
it could lead to the insertion of unnecessary zext instructions. Such
instructions could be removed by creating a simple peephole inside the JIT
back-end: if one instruction has hardware support for zext and if the next
instruction is an explicit zext, then the latter can be skipped when doing
the code generation.
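
The peephole described above might look roughly like this. It is a sketch in
user-space C under simplifying assumptions: ``struct toy_insn`` and
``count_emitted()`` are hypothetical, and a real JIT back-end would inspect
actual BPF instructions rather than two flags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction record for a JIT back-end. */
struct toy_insn {
	/* Native code for this insn already clears the upper 32 bits. */
	bool hw_zext;
	/* This insn is a verifier-inserted zext (mov32 variant). */
	bool is_explicit_zext;
};

/* Returns how many instructions need code generated once redundant
 * explicit zexts are skipped. */
size_t count_emitted(const struct toy_insn *prog, size_t len)
{
	size_t emitted = 0;

	for (size_t i = 0; i < len; i++) {
		/* Peephole: the previous insn zero-extends in hardware and
		 * this one is an explicit zext, so skip it. */
		if (i > 0 && prog[i].is_explicit_zext && prog[i - 1].hw_zext)
			continue;
		emitted++;
	}
	return emitted;
}
```

An explicit zext that follows an instruction without hardware zext support is
still emitted, so correctness does not depend on the peephole firing.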

Q: Does BPF have a stable ABI?
------------------------------
A: YES. BPF instructions, arguments to BPF programs, the set of helper
functions and their arguments, and recognized return codes are all part
of the ABI. However, there is one specific exception for tracing programs
that use helpers like bpf_probe_read() to walk kernel internal
data structures and are compiled with kernel internal headers. Both of these
kernel internals are subject to change and can break with newer kernels,
such that the program needs to be adapted accordingly.

Q: Are tracepoints part of the stable ABI?
------------------------------------------
A: NO. Tracepoints are tied to internal implementation details, hence they are
subject to change and can break with newer kernels. BPF programs need to change
accordingly when this happens.

Q: Are places where kprobes can attach part of the stable ABI?
--------------------------------------------------------------
A: NO. The places to which kprobes can attach are internal implementation
details, which means that they are subject to change and can break with
newer kernels. BPF programs need to change accordingly when this happens.

Q: How much stack space a BPF program uses?
-------------------------------------------
A: Currently all program types are limited to 512 bytes of stack
space, but the verifier computes the actual amount of stack used,
and both the interpreter and most JITed code consume only the necessary amount.

Q: Can BPF be offloaded to HW?
------------------------------
A: YES. BPF HW offload is supported by the NFP driver.

Q: Does classic BPF interpreter still exist?
--------------------------------------------
A: NO. Classic BPF programs are converted into extended BPF instructions.

Q: Can BPF call arbitrary kernel functions?
-------------------------------------------
A: NO. BPF programs can only call helper functions from the set that
is defined for their program type.

Q: Can BPF overwrite arbitrary kernel memory?
---------------------------------------------
A: NO.

Tracing BPF programs can *read* arbitrary memory with the bpf_probe_read()
and bpf_probe_read_str() helpers. Networking programs cannot read
arbitrary memory, since they don't have access to these helpers.
Programs can never read or write arbitrary memory directly.

Q: Can BPF overwrite arbitrary user memory?
-------------------------------------------
A: Sort-of.

Tracing BPF programs can overwrite the user memory
of the current task with bpf_probe_write_user(). Every time such a
program is loaded, the kernel will print a warning message, so
this helper is only useful for experiments and prototypes.
Tracing BPF programs are root only.

Q: New functionality via kernel modules?
----------------------------------------
Q: Can BPF functionality such as new program or map types, new
helpers, etc. be added out of kernel module code?

A: NO.

Q: Directly calling kernel function is an ABI?
----------------------------------------------
Q: Some kernel functions (e.g. tcp_slow_start) can be called
by BPF programs.  Do these kernel functions become an ABI?

A: NO.

The kernel function prototypes will change and the BPF programs will be
rejected by the verifier.  Also, for example, some of the bpf-callable
kernel functions have already been used by other kernel tcp
cc (congestion-control) implementations.  If any of these kernel
functions changes, both the in-tree and out-of-tree kernel tcp cc
implementations have to be changed.  The same goes for the BPF
programs, which have to be adjusted accordingly.

Q: Attaching to arbitrary kernel functions is an ABI?
-----------------------------------------------------
Q: BPF programs can be attached to many kernel functions.  Do these
kernel functions become part of the ABI?

A: NO.

The kernel function prototypes will change, and BPF programs attaching to
them will need to change.  BPF compile-once-run-everywhere (CO-RE)
should be used in order to make it easier to adapt your BPF programs to
different versions of the kernel.

Q: Marking a function with BTF_ID makes that function an ABI?
-------------------------------------------------------------
A: NO.

The BTF_ID macro does not cause a function to become part of the ABI
any more than does the EXPORT_SYMBOL_GPL macro.

Q: What is the compatibility story for special BPF types in map values?
-----------------------------------------------------------------------
Q: Users are allowed to embed bpf_spin_lock, bpf_timer fields in their BPF map
values (when using BTF support for BPF maps). This allows helpers for
such objects to be used on these fields inside map values. Users are also
allowed to embed pointers to some kernel types (with __kptr and __kptr_ref
BTF tags). Will the kernel preserve backwards compatibility for these features?

A: It depends. For bpf_spin_lock, bpf_timer: YES, for kptr and everything else:
NO, but see below.

For struct types that have been added already, like bpf_spin_lock and bpf_timer,
the kernel will preserve backwards compatibility, as they are part of UAPI.

For kptrs, they are also part of UAPI, but only with respect to the kptr
mechanism. The types that you can use with a __kptr and __kptr_ref tagged
pointer in your struct are NOT part of the UAPI contract. The supported types can
and will change across kernel releases. However, operations like accessing kptr
fields and the bpf_kptr_xchg() helper will continue to be supported across kernel
releases for the supported types.

For any other supported struct type, unless explicitly stated in this document
and added to the bpf.h UAPI header, such types can and will arbitrarily change
their size, type, and alignment, or any other user visible API or ABI detail
across kernel releases. Users must adapt their BPF programs to the new changes
and update them to make sure their programs continue to work correctly.

NOTE: The BPF subsystem specially reserves the 'bpf\_' prefix for type names, in
order to introduce more special fields in the future. Hence, user programs must
avoid defining types with the 'bpf\_' prefix to not be broken in future releases.
In other words, no backwards compatibility is guaranteed if one uses a type
in BTF with the 'bpf\_' prefix.

Q: What is the compatibility story for special BPF types in allocated objects?
------------------------------------------------------------------------------
Q: Same as above, but for allocated objects (i.e. objects allocated using
bpf_obj_new for user defined types). Will the kernel preserve backwards
compatibility for these features?

A: NO.

Unlike map value types, there are no stability guarantees for this case. The
whole API to work with allocated objects and any support for special fields
inside them is unstable (since it is exposed through kfuncs).