==============
BPF Design Q&A
==============

BPF extensibility and applicability to networking, tracing, security
in the Linux kernel and several user space implementations of the BPF
virtual machine led to a number of misunderstandings about what BPF
actually is. This short Q&A is an attempt to address that and outline a
direction of where BPF is heading long term.

.. contents::
    :local:
    :depth: 3

Questions and Answers
=====================

Q: Is BPF a generic instruction set similar to x64 and arm64?
-------------------------------------------------------------
A: NO.

Q: Is BPF a generic virtual machine?
-------------------------------------
A: NO.

BPF is a generic instruction set *with* the C calling convention.
-----------------------------------------------------------------

Q: Why was the C calling convention chosen?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A: Because BPF programs are designed to run in the Linux kernel,
which is written in C. Hence BPF defines an instruction set compatible
with the two most used architectures, x64 and arm64 (and takes into
consideration important quirks of other architectures), and
defines a calling convention that is compatible with the C calling
convention of the Linux kernel on those architectures.

Q: Can multiple return values be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. BPF allows only register R0 to be used as the return value.

Q: Can more than 5 function arguments be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. The BPF calling convention only allows registers R1-R5 to be used
as arguments. BPF is not a standalone instruction set
(unlike the x64 ISA, which allows msft, cdecl and other conventions).

Q: Can BPF programs access instruction pointer or return address?
-----------------------------------------------------------------
A: NO.

Q: Can BPF programs access the stack pointer?
---------------------------------------------
A: NO.

Only the frame pointer (register R10) is accessible.
From the compiler's point of view it is necessary to have a stack pointer.
For example, LLVM defines register R11 as the stack pointer in its
BPF backend, but it makes sure that generated code never uses it.

Q: Does the C calling convention diminish possible use cases?
-------------------------------------------------------------
A: YES.

The BPF design forces addition of major functionality in the form
of kernel helper functions and kernel objects like BPF maps with
seamless interoperability between them. It lets the kernel call into
BPF programs and programs call kernel helpers with zero overhead,
as if all of them were native C code. That is particularly the case
for JITed BPF programs that are indistinguishable from
native kernel C code.

Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
------------------------------------------------------------------------
A: Soft yes.

At least for now, until BPF core has support for
bpf-to-bpf calls, indirect calls, loops, global variables,
jump tables, read-only sections, and all other normal constructs
that C code can produce.
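
As a rough illustration of the calling-convention answers above (arguments
in R1-R5, a single return value in R0), here is a minimal userspace sketch
in C. It is a toy model, not kernel code: ``toy_regs``, ``toy_helper_fn``,
``toy_helper_add3`` and ``toy_call`` are hypothetical names introduced only
for this example.

```c
#include <stdint.h>

/* Toy register file: r[0] is R0 (return value), r[1]..r[5] are the
 * argument registers R1-R5, r[10] stands in for the frame pointer R10.
 * This layout is an assumption made for illustration. */
struct toy_regs {
    uint64_t r[11];
};

/* A "helper" takes at most five arguments and yields one value,
 * matching the R1-R5 / R0 convention described in the text. */
typedef uint64_t (*toy_helper_fn)(uint64_t, uint64_t, uint64_t,
                                  uint64_t, uint64_t);

/* Hypothetical helper: adds its first three arguments. */
static uint64_t toy_helper_add3(uint64_t a, uint64_t b, uint64_t c,
                                uint64_t d, uint64_t e)
{
    (void)d;
    (void)e;
    return a + b + c;
}

/* Model of a call instruction: pass R1-R5, store the result in R0. */
static void toy_call(struct toy_regs *regs, toy_helper_fn fn)
{
    regs->r[0] = fn(regs->r[1], regs->r[2], regs->r[3],
                    regs->r[4], regs->r[5]);
}
```

Because the convention is fixed, the call site is an ordinary C function
call with no marshalling step, which is the point of the "zero overhead"
remark above.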

Q: Can loops be supported in a safe way?
----------------------------------------
A: It's not clear yet.

BPF developers are trying to find a way to
support bounded loops.

Q: What are the verifier limits?
--------------------------------
A: The only limit known to the user space is BPF_MAXINSNS (4096).
It's the maximum number of instructions that an unprivileged bpf
program can have. The verifier has various internal limits,
such as the maximum number of instructions that can be explored during
program analysis. Currently, that limit is set to 1 million,
which essentially means that the largest program can consist
of 1 million NOP instructions. There is a limit to the maximum number
of subsequent branches, a limit to the number of nested bpf-to-bpf
calls, a limit to the number of the verifier states per instruction,
and a limit to the number of maps used by the program.
All these limits can be hit with a sufficiently complex program.
There are also non-numerical limits that can cause the program
to be rejected. The verifier used to recognize only pointer + constant
expressions. Now it can recognize pointer + bounded_register.
bpf_map_lookup_elem(key) had a requirement that 'key' must be
a pointer to the stack. Now, 'key' can be a pointer to a map value.
The verifier is steadily getting 'smarter'. The limits are
being removed. The only way to know that the program is going to
be accepted by the verifier is to try to load it.
The bpf development process guarantees that the future kernel
versions will accept all bpf programs that were accepted by
the earlier versions.


Instruction level questions
---------------------------

Q: LD_ABS and LD_IND instructions vs C code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Q: How come LD_ABS and LD_IND instructions are present in BPF whereas
C code cannot express them and has to use builtin intrinsics?

A: This is an artifact of compatibility with classic BPF. Modern
networking code in BPF performs better without them.
See 'direct packet access'.

Q: BPF instructions mapping not one-to-one to native CPU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: It seems not all BPF instructions are one-to-one to native CPU.
For example, why are BPF_JNE and other compare-and-jump instructions
not CPU-like?

A: This was necessary to avoid introducing flags into the ISA, which are
impossible to make generic and efficient across CPU architectures.

Q: Why doesn't the BPF_DIV instruction map to x64 div?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because if we picked a one-to-one relationship to x64 it would have made
it more complicated to support on arm64 and other archs. Also it
needs a div-by-zero runtime check.

Q: Why is there no BPF_SDIV for signed divide operation?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because it would be rarely used. LLVM errors in such a case and
prints a suggestion to use unsigned divide instead.

Q: Why does BPF have an implicit prologue and epilogue?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because architectures like sparc have register windows and in general
there are enough subtle differences between architectures, so a naive
"store return address into stack" approach won't work. Another reason is
that BPF has to be safe from division by zero (and the legacy exception
path of the LD_ABS insn). Those instructions need to invoke the epilogue
and return implicitly.
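
The div-by-zero runtime check and the implicit return mentioned in the
answers above can be pictured with a small userspace sketch in C. This is
a toy model of the behaviour as described in this document, not the
kernel's implementation: ``toy_div64`` and ``toy_prog`` are hypothetical
names, and the return value of 0 on the implicit exit path is an
assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of an unsigned 64-bit divide with a runtime zero check.
 * Returns false when the divisor is zero, meaning the caller must
 * leave the program through its epilogue instead of dividing. */
static bool toy_div64(uint64_t *dst, uint64_t src)
{
    if (src == 0)
        return false;
    *dst /= src;
    return true;
}

/* Toy "program": computes a / b, or exits early when b is zero. */
static uint64_t toy_prog(uint64_t a, uint64_t b)
{
    uint64_t r0 = a;

    if (!toy_div64(&r0, b))
        return 0; /* implicit exit path; 0 is an assumed return value */
    return r0;
}
```

The check lives next to the divide rather than in the program text, which
is why the exit path has to be something the runtime can always reach.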

Q: Why were the BPF_JLT and BPF_JLE instructions not introduced in the beginning?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because classic BPF didn't have them and BPF authors felt that a compiler
workaround would be acceptable. It turned out that programs lose performance
due to the lack of these compare instructions, and they were added.
These two instructions are a perfect example of what kind of new BPF
instructions are acceptable and can be added in the future.
These two already had equivalent instructions in native CPUs.
New instructions that don't have a one-to-one mapping to HW instructions
will not be accepted.

Q: BPF 32-bit subregister requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: BPF 32-bit subregisters have a requirement to zero the upper 32 bits of
BPF registers, which makes BPF an inefficient virtual machine for 32-bit
CPU architectures and 32-bit HW accelerators. Can true 32-bit registers
be added to BPF in the future?

A: NO.

But some optimizations on zero-ing the upper 32 bits for BPF registers are
available, and can be leveraged to improve the performance of JITed BPF
programs for 32-bit architectures.
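
The zero-extension requirement named in the question above can be shown
with a short userspace sketch in C. It models a 32-bit add on a 64-bit
register; ``toy_add32`` is a hypothetical name used only here, and the
sketch says nothing about how any particular JIT emits the operation.

```c
#include <stdint.h>

/* Toy model of a 32-bit ALU add that writes a 64-bit register:
 * only the low 32 bits take part in the arithmetic, and the upper
 * 32 bits of the result are zero. */
static uint64_t toy_add32(uint64_t dst, uint64_t src)
{
    uint32_t lo = (uint32_t)dst + (uint32_t)src; /* wraps modulo 2^32 */

    return (uint64_t)lo; /* widening zero-extends the result */
}
```

On a 64-bit host the widening is free; on a 32-bit host the upper half is
a second register that has to be cleared explicitly, which is the cost the
question refers to.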

Starting with version 7, LLVM is able to generate instructions that operate
on 32-bit subregisters, provided the option -mattr=+alu32 is passed for
compiling a program. Furthermore, the verifier can now mark the
instructions for which zero-ing the upper bits of the destination register
is required, and insert an explicit zero-extension (zext) instruction
(a mov32 variant). This means that for architectures without zext hardware
support, the JIT back-ends do not need to clear the upper bits for
subregisters written by alu32 instructions or narrow loads. Instead, the
back-ends simply need to support code generation for that mov32 variant,
and to override bpf_jit_needs_zext() to make it return "true" (in order to
enable zext insertion in the verifier).

Note that it is possible for a JIT back-end to have partial hardware
support for zext. In that case, if verifier zext insertion is enabled,
it could lead to the insertion of unnecessary zext instructions. Such
instructions could be removed by creating a simple peephole inside the JIT
back-end: if one instruction has hardware support for zext and if the next
instruction is an explicit zext, then the latter can be skipped when doing
the code generation.

Q: Does BPF have a stable ABI?
------------------------------
A: YES. BPF instructions, arguments to BPF programs, the set of helper
functions and their arguments, and recognized return codes are all part
of the ABI. However, there is one specific exception for tracing programs
which use helpers like bpf_probe_read() to walk kernel internal
data structures and compile with kernel internal headers. Both of these
kernel internals are subject to change and can break with newer kernels
such that the program needs to be adapted accordingly.

Q: Are tracepoints part of the stable ABI?
------------------------------------------
A: NO. Tracepoints are tied to internal implementation details, hence they
are subject to change and can break with newer kernels. BPF programs need to
change accordingly when this happens.

Q: Are places where kprobes can attach part of the stable ABI?
--------------------------------------------------------------
A: NO. The places to which kprobes can attach are internal implementation
details, which means that they are subject to change and can break with
newer kernels. BPF programs need to change accordingly when this happens.

Q: How much stack space does a BPF program use?
-----------------------------------------------
A: Currently all program types are limited to 512 bytes of stack
space, but the verifier computes the actual amount of stack used
and both the interpreter and most JITed code consume the necessary amount.

Q: Can BPF be offloaded to HW?
------------------------------
A: YES. BPF HW offload is supported by the NFP driver.

Q: Does classic BPF interpreter still exist?
--------------------------------------------
A: NO. Classic BPF programs are converted into extended BPF instructions.

Q: Can BPF call arbitrary kernel functions?
-------------------------------------------
A: NO. BPF programs can only call a set of helper functions which
is defined for every program type.

Q: Can BPF overwrite arbitrary kernel memory?
---------------------------------------------
A: NO.

Tracing bpf programs can *read* arbitrary memory with the bpf_probe_read()
and bpf_probe_read_str() helpers. Networking programs cannot read
arbitrary memory, since they don't have access to these helpers.
Programs can never read or write arbitrary memory directly.

Q: Can BPF overwrite arbitrary user memory?
-------------------------------------------
A: Sort-of.

Tracing BPF programs can overwrite the user memory
of the current task with bpf_probe_write_user(). Every time such a
program is loaded the kernel will print a warning message, so
this helper is only useful for experiments and prototypes.
Tracing BPF programs are root only.

Q: New functionality via kernel modules?
----------------------------------------
Q: Can BPF functionality such as new program or map types, new
helpers, etc. be added out of kernel module code?

A: NO.

Q: Directly calling kernel function is an ABI?
----------------------------------------------
Q: Some kernel functions (e.g. tcp_slow_start) can be called
by BPF programs. Do these kernel functions become an ABI?

A: NO.

The kernel function protos will change and the bpf programs will be
rejected by the verifier. Also, for example, some of the bpf-callable
kernel functions have already been used by other kernel tcp
cc (congestion-control) implementations. If any of these kernel
functions has changed, both the in-tree and out-of-tree kernel tcp cc
implementations have to be changed. The same goes for the bpf
programs and they have to be adjusted accordingly.

Q: Attaching to arbitrary kernel functions is an ABI?
-----------------------------------------------------
Q: BPF programs can be attached to many kernel functions. Do these
kernel functions become part of the ABI?

A: NO.

The kernel function prototypes will change, and BPF programs attaching to
them will need to change. The BPF compile-once-run-everywhere (CO-RE)
should be used in order to make it easier to adapt your BPF programs to
different versions of the kernel.

Q: Marking a function with BTF_ID makes that function an ABI?
-------------------------------------------------------------
A: NO.

The BTF_ID macro does not cause a function to become part of the ABI
any more than does the EXPORT_SYMBOL_GPL macro.

Q: What is the compatibility story for special BPF types in map values?
-----------------------------------------------------------------------
Q: Users are allowed to embed bpf_spin_lock, bpf_timer fields in their BPF map
values (when using BTF support for BPF maps). This allows the use of helpers
for such objects on these fields inside map values. Users are also allowed to
embed pointers to some kernel types (with __kptr and __kptr_ref BTF tags).
Will the kernel preserve backwards compatibility for these features?

A: It depends. For bpf_spin_lock, bpf_timer: YES, for kptr and everything else:
NO, but see below.

For struct types that have been added already, like bpf_spin_lock and bpf_timer,
the kernel will preserve backwards compatibility, as they are part of UAPI.

For kptrs, they are also part of UAPI, but only with respect to the kptr
mechanism. The types that you can use with a __kptr and __kptr_ref tagged
pointer in your struct are NOT part of the UAPI contract. The supported types can
and will change across kernel releases. However, operations like accessing kptr
fields and bpf_kptr_xchg() helper will continue to be supported across kernel
releases for the supported types.

For any other supported struct type, unless explicitly stated in this document
and added to bpf.h UAPI header, such types can and will arbitrarily change their
size, type, and alignment, or any other user visible API or ABI detail across
kernel releases. The users must adapt their BPF programs to the new changes and
update them to make sure their programs continue to work correctly.

NOTE: BPF subsystem specially reserves the 'bpf\_' prefix for type names, in
order to introduce more special fields in the future. Hence, user programs must
avoid defining types with 'bpf\_' prefix to not be broken in future releases.
In other words, no backwards compatibility is guaranteed if one uses a type
in BTF with 'bpf\_' prefix.

Q: What is the compatibility story for special BPF types in allocated objects?
------------------------------------------------------------------------------
Q: Same as above, but for allocated objects (i.e. objects allocated using
bpf_obj_new for user defined types). Will the kernel preserve backwards
compatibility for these features?

A: NO.

Unlike map value types, there are no stability guarantees for this case. The
whole API to work with allocated objects and any support for special fields
inside them is unstable (since it is exposed through kfuncs).