=============================
BPF Kernel Functions (kfuncs)
=============================

1. Introduction
===============

BPF Kernel Functions, more commonly known as kfuncs, are functions in the Linux
kernel which are exposed for use by BPF programs. Unlike normal BPF helpers,
kfuncs do not have a stable interface and can change from one kernel release to
another. Hence, BPF programs need to be updated in response to changes in the
kernel.

2. Defining a kfunc
===================

There are two ways to expose a kernel function to BPF programs: either make an
existing function in the kernel visible, or add a new wrapper for BPF. In both
cases, care must be taken that a BPF program can only call such a function in a
valid context. To enforce this, visibility of a kfunc can be per program type.

If you are not creating a BPF wrapper for an existing kernel function, skip
ahead to :ref:`BPF_kfunc_nodef`.

2.1 Creating a wrapper kfunc
----------------------------

When defining a wrapper kfunc, the wrapper function should have extern linkage.
This prevents the compiler from optimizing away dead code, as this wrapper kfunc
is not invoked anywhere in the kernel itself. It is not necessary to provide a
prototype in a header for the wrapper kfunc.

An example is given below::

        /* Disables missing prototype warnings */
        __diag_push();
        __diag_ignore_all("-Wmissing-prototypes",
                          "Global kfuncs as their definitions will be in BTF");

        struct task_struct *bpf_find_get_task_by_vpid(pid_t nr)
        {
                return find_get_task_by_vpid(nr);
        }

        __diag_pop();

A wrapper kfunc is often needed when we need to annotate parameters of the
kfunc. Otherwise one may directly make the kfunc visible to the BPF program by
registering it with the BPF subsystem. See :ref:`BPF_kfunc_nodef`.

2.2 Annotating kfunc parameters
-------------------------------

Similar to BPF helpers, the verifier sometimes needs additional context to make
the usage of kernel functions safer and more useful. Hence, we can annotate a
parameter by suffixing the name of the argument of the kfunc with a __tag, where
tag may be one of the supported annotations.

2.2.1 __sz Annotation
---------------------

This annotation is used to indicate a memory and size pair in the argument list.
An example is given below::

        void bpf_memzero(void *mem, int mem__sz)
        {
        ...
        }

Here, the verifier will treat the first argument as a PTR_TO_MEM, and the second
argument as its size. By default, without the __sz annotation, the size of the
type of the pointer is used. Without the __sz annotation, a kfunc cannot accept
a void pointer.

2.2.2 __k Annotation
--------------------

This annotation is only understood for scalar arguments, where it indicates that
the verifier must check the scalar argument to be a known constant, which does
not indicate a size parameter, and the value of the constant is relevant to the
safety of the program.

An example is given below::

        void *bpf_obj_new(u32 local_type_id__k, ...)
        {
        ...
        }

Here, bpf_obj_new uses the local_type_id argument to find out the size of that
type ID in the program's BTF and return a sized pointer to it. Each type ID will
have a distinct size, hence it is crucial to treat each such call as distinct
when values don't match during verifier state pruning checks.

Hence, whenever a constant scalar argument is accepted by a kfunc which is not a
size parameter, and the value of the constant matters for program safety, the
__k suffix should be used.

.. _BPF_kfunc_nodef:

2.3 Using an existing kernel function
-------------------------------------

When an existing function in the kernel is fit for consumption by BPF programs,
it can be directly registered with the BPF subsystem. However, care must still
be taken to review the context in which it will be invoked by the BPF program
and whether it is safe to do so.

2.4 Annotating kfuncs
---------------------

In addition to kfuncs' arguments, the verifier may need more information about
the type of kfunc(s) being registered with the BPF subsystem. To do so, we
define flags on a set of kfuncs as follows::

        BTF_SET8_START(bpf_task_set)
        BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
        BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
        BTF_SET8_END(bpf_task_set)

This set encodes the BTF ID of each kfunc listed above, and encodes the flags
along with it. Of course, it is also allowed to specify no flags.

2.4.1 KF_ACQUIRE flag
---------------------

The KF_ACQUIRE flag is used to indicate that the kfunc returns a pointer to a
refcounted object. The verifier will then ensure that the pointer to the object
is eventually released using a release kfunc, or transferred to a map using a
referenced kptr (by invoking bpf_kptr_xchg). If not, the verifier fails the
loading of the BPF program until no lingering references remain in all possible
explored states of the program.

2.4.2 KF_RET_NULL flag
----------------------

The KF_RET_NULL flag is used to indicate that the pointer returned by the kfunc
may be NULL. Hence, it forces the user to do a NULL check on the pointer
returned from the kfunc before making use of it (dereferencing or passing to
another helper). This flag is often paired with the KF_ACQUIRE flag, but the two
are orthogonal to each other.
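
As a rough illustration of how the flags in a set combine, the following
userspace C sketch models the ``bpf_task_set`` entries as a plain table. It is
not kernel code: ``struct kfunc_entry``, ``kfunc_flags_lookup()`` and the two
predicate helpers are hypothetical names, and the flag bit values are
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative flag bits. The real definitions live in the kernel headers;
 * the values used here are an assumption of this sketch. */
#define KF_ACQUIRE  (1u << 0)
#define KF_RELEASE  (1u << 1)
#define KF_RET_NULL (1u << 2)

/* Hypothetical userspace stand-in for one entry of a kfunc ID set. */
struct kfunc_entry {
	const char *name;
	uint32_t flags;
};

/* Mirrors the bpf_task_set example: each kfunc name with its flags. */
static const struct kfunc_entry bpf_task_set_model[] = {
	{ "bpf_get_task_pid", KF_ACQUIRE | KF_RET_NULL },
	{ "bpf_put_pid",      KF_RELEASE },
};

/* Return a pointer to the flags of @name, or NULL if it is not in the set. */
static const uint32_t *kfunc_flags_lookup(const char *name)
{
	size_t n = sizeof(bpf_task_set_model) / sizeof(bpf_task_set_model[0]);
	size_t i;

	assert(name != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(bpf_task_set_model[i].name, name) == 0)
			return &bpf_task_set_model[i].flags;
	}
	return NULL;
}

/* The returned pointer must be NULL-checked only if KF_RET_NULL is set. */
static int kfunc_needs_null_check(uint32_t flags)
{
	return (flags & KF_RET_NULL) != 0;
}

/* The returned pointer is a reference to release only if KF_ACQUIRE is set. */
static int kfunc_returns_reference(uint32_t flags)
{
	return (flags & KF_ACQUIRE) != 0;
}
```

In this model, looking up ``bpf_get_task_pid`` yields both KF_ACQUIRE and
KF_RET_NULL, so the result has to be NULL-checked and later released, while
``bpf_put_pid`` carries only KF_RELEASE. The two predicates test independent
bits, which is what "orthogonal" means here.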

2.4.3 KF_RELEASE flag
---------------------

The KF_RELEASE flag is used to indicate that the kfunc releases the pointer
passed in to it. Only one referenced pointer can be passed in. All copies of the
pointer being released are invalidated as a result of invoking a kfunc with this
flag.

2.4.4 KF_KPTR_GET flag
----------------------

The KF_KPTR_GET flag is used to indicate that the kfunc takes the first argument
as a pointer to kptr, safely increments the refcount of the object it points to,
and returns a reference to the user. The rest of the arguments may be normal
arguments of a kfunc. The KF_KPTR_GET flag should be used in conjunction with
the KF_ACQUIRE and KF_RET_NULL flags.

2.4.5 KF_TRUSTED_ARGS flag
--------------------------

The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
indicates that all pointer arguments are valid, and that all pointers to BTF
objects have been passed in their unmodified form (that is, at a zero offset,
and without having been obtained from walking another pointer).

There are two types of pointers to kernel objects which are considered "valid":

1. Pointers which are passed as tracepoint or struct_ops callback arguments.
2. Pointers which were returned from a KF_ACQUIRE or KF_KPTR_GET kfunc.

Pointers to non-BTF objects (e.g. scalar pointers) may also be passed to
KF_TRUSTED_ARGS kfuncs, and may have a non-zero offset.

The definition of "valid" pointers is subject to change at any time, and has
absolutely no ABI stability guarantees.

2.4.6 KF_SLEEPABLE flag
-----------------------

The KF_SLEEPABLE flag is used for kfuncs that may sleep. Such kfuncs can only
be called by sleepable BPF programs (BPF_F_SLEEPABLE).
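
The KF_SLEEPABLE rule can be sketched as a single predicate. The following is a
userspace model only, not the verifier's actual check: ``struct prog_model`` and
``kfunc_call_allowed()`` are hypothetical names, and the bit value chosen for
KF_SLEEPABLE is an assumption of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bit; the value is an assumption of this sketch. */
#define KF_SLEEPABLE (1u << 5)

/* Hypothetical stand-in for a loaded program: only records whether it was
 * loaded as sleepable (BPF_F_SLEEPABLE). */
struct prog_model {
	bool sleepable;
};

/* A kfunc that may sleep is callable only from a sleepable program.
 * Kfuncs without KF_SLEEPABLE are not restricted by this rule. */
static bool kfunc_call_allowed(const struct prog_model *prog,
			       uint32_t kfunc_flags)
{
	assert(prog != NULL);
	if ((kfunc_flags & KF_SLEEPABLE) && !prog->sleepable)
		return false;
	return true;
}
```

Under this model, a non-sleepable program is rejected only when the kfunc it
calls carries KF_SLEEPABLE; a sleepable program may call either kind.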

2.4.7 KF_DESTRUCTIVE flag
--------------------------

The KF_DESTRUCTIVE flag is used to indicate that calling the function is
destructive to the system. For example, such a call can result in the system
rebooting or panicking. Because of this, additional restrictions apply to these
calls. At the moment they only require the CAP_SYS_BOOT capability, but more can
be added later.

2.5 Registering the kfuncs
--------------------------

Once the kfunc is prepared for use, the final step to making it visible is
registering it with the BPF subsystem. Registration is done per BPF program
type. An example is shown below::

        BTF_SET8_START(bpf_task_set)
        BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
        BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
        BTF_SET8_END(bpf_task_set)

        static const struct btf_kfunc_id_set bpf_task_kfunc_set = {
                .owner = THIS_MODULE,
                .set   = &bpf_task_set,
        };

        static int init_subsystem(void)
        {
                return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_task_kfunc_set);
        }
        late_initcall(init_subsystem);
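
To make the effect of per-program-type registration concrete, here is a small
userspace model. It is a sketch under simplifying assumptions, not the kernel
implementation: the ``*_model`` types and functions are hypothetical, the list
of program types is reduced to two, and the model keeps at most one set per
program type for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced list of program types for this sketch. */
enum prog_type_model { PROG_TRACING, PROG_XDP, PROG_TYPE_MAX };

/* Hypothetical stand-in for a kfunc ID set: just a list of names. */
struct kfunc_set_model {
	const char *const *names;
	size_t count;
};

/* One registered set per program type (a simplification of this model). */
static const struct kfunc_set_model *registered[PROG_TYPE_MAX];

/* Register @set for @type. Returns 0 on success, -1 if the type is out of
 * range or already has a set in this simplified model. */
static int register_kfunc_set_model(enum prog_type_model type,
				    const struct kfunc_set_model *set)
{
	assert(set != NULL);
	if ((unsigned int)type >= PROG_TYPE_MAX || registered[type])
		return -1;
	registered[type] = set;
	return 0;
}

/* A kfunc is visible to a program type only if a set registered for that
 * type contains it. */
static bool kfunc_visible(enum prog_type_model type, const char *name)
{
	const struct kfunc_set_model *set;
	size_t i;

	assert(name != NULL);
	if ((unsigned int)type >= PROG_TYPE_MAX)
		return false;
	set = registered[type];
	if (!set)
		return false;
	for (i = 0; i < set->count; i++) {
		if (strcmp(set->names[i], name) == 0)
			return true;
	}
	return false;
}
```

Registering a set containing ``bpf_get_task_pid`` and ``bpf_put_pid`` for
``PROG_TRACING`` makes both visible to that type only; a lookup for
``PROG_XDP`` fails because nothing was registered for it.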