=====================
BPF Type Format (BTF)
=====================

1. Introduction
***************

BTF (BPF Type Format) is the metadata format which encodes the debug info
related to BPF programs and maps. The name BTF was used initially to describe
data types. BTF was later extended to include function info for defined
subroutines, and line info for source/line information.

The debug info is used for map pretty print, function signatures, etc. The
function signature enables better bpf program/function kernel symbols. The
line info helps generate source-annotated translated byte code, jited code and
verifier log.

The BTF specification contains two parts,
  * BTF kernel API
  * BTF ELF file format

The kernel API is the contract between user space and kernel. The kernel
verifies the BTF info before using it. The ELF file format is a user space
contract between ELF file and libbpf loader.

The type and string sections are part of the BTF kernel API, describing the
debug info (mostly types related) referenced by the bpf program. These two
sections are discussed in detail in :ref:`BTF_Type_String`.

.. _BTF_Type_String:

2. BTF Type and String Encoding
*******************************

The file ``include/uapi/linux/btf.h`` provides a high-level definition of how
types/strings are encoded.

The beginning of the data blob must be::

    struct btf_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   type_off;       /* offset of type section       */
        __u32   type_len;       /* length of type section       */
        __u32   str_off;        /* offset of string section     */
        __u32   str_len;        /* length of string section     */
    };

The magic is ``0xeB9F``, which has different encoding for big and little
endian systems, and can be used to test whether BTF is generated for a big- or
little-endian target. The ``btf_header`` is designed to be extensible with
``hdr_len`` equal to ``sizeof(struct btf_header)`` when a data blob is
generated.
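
The endianness test described above can be sketched as follows. This is a
minimal illustration, not kernel or libbpf code: the ``btf_sketch_*`` names,
the ``BTF_MAGIC_*`` constants and the local copy of the header layout are
assumptions made for this example::

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define BTF_MAGIC_NATIVE   0xeB9F  /* magic read on a same-endian host */
    #define BTF_MAGIC_SWAPPED  0x9FeB  /* magic read on an opposite-endian host */

    /* Local stand-in for struct btf_header, using fixed-width types. */
    struct btf_header_sketch {
        uint16_t magic;
        uint8_t  version;
        uint8_t  flags;
        uint32_t hdr_len;
        uint32_t type_off;
        uint32_t type_len;
        uint32_t str_off;
        uint32_t str_len;
    };

    /* Hypothetical helper: returns 0 if the blob uses the host byte order,
     * 1 if it was generated for the opposite byte order, and -1 if the
     * blob is too short or does not start with the BTF magic.
     */
    static int btf_sketch_check_magic(const void *blob, size_t len)
    {
        uint16_t magic;

        if (len < sizeof(struct btf_header_sketch))
            return -1;
        memcpy(&magic, blob, sizeof(magic)); /* avoid unaligned access */
        if (magic == BTF_MAGIC_NATIVE)
            return 0;
        if (magic == BTF_MAGIC_SWAPPED)
            return 1;
        return -1;
    }

A result of ``1`` means that every multi-byte field in the blob would have to
be byte-swapped before it is interpreted on this host.
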

2.1 String Encoding
===================

The first string in the string section must be a null string. The rest of
the string table is a concatenation of other null-terminated strings.
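
A lookup into such a string section might be sketched as below. The helper
name ``btf_sketch_str`` is hypothetical, and the requirement that the section
ends with a NUL byte is an assumption that follows from every string being
null-terminated::

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical helper: return the string stored at byte offset 'off' of
     * the string section, or NULL if the section or the offset is invalid.
     */
    static const char *btf_sketch_str(const char *strs, uint32_t str_len,
                                      uint32_t off)
    {
        /* The section starts with the null string and ends with a NUL. */
        if (str_len == 0 || strs[0] != '\0' || strs[str_len - 1] != '\0')
            return NULL;
        if (off >= str_len)
            return NULL;
        return strs + off;
    }

Offset ``0`` always yields the empty string, which is why a ``name_off`` of 0
is used for anonymous types.
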

2.2 Type Encoding
=================

The type id ``0`` is reserved for ``void`` type. The type section is parsed
sequentially and a type id is assigned to each recognized type starting from
id ``1``. Currently, the following types are supported::

    #define BTF_KIND_INT            1       /* Integer      */
    #define BTF_KIND_PTR            2       /* Pointer      */
    #define BTF_KIND_ARRAY          3       /* Array        */
    #define BTF_KIND_STRUCT         4       /* Struct       */
    #define BTF_KIND_UNION          5       /* Union        */
    #define BTF_KIND_ENUM           6       /* Enumeration  */
    #define BTF_KIND_FWD            7       /* Forward      */
    #define BTF_KIND_TYPEDEF        8       /* Typedef      */
    #define BTF_KIND_VOLATILE       9       /* Volatile     */
    #define BTF_KIND_CONST          10      /* Const        */
    #define BTF_KIND_RESTRICT       11      /* Restrict     */
    #define BTF_KIND_FUNC           12      /* Function     */
    #define BTF_KIND_FUNC_PROTO     13      /* Function Proto       */
    #define BTF_KIND_VAR            14      /* Variable     */
    #define BTF_KIND_DATASEC        15      /* Section      */
    #define BTF_KIND_FLOAT          16      /* Floating point       */
    #define BTF_KIND_DECL_TAG       17      /* Decl Tag     */

Note that the type section encodes debug info, not just pure types.
``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.

Each type contains the following common data::

    struct btf_type {
        __u32 name_off;
        /* "info" bits arrangement
         * bits  0-15: vlen (e.g. # of struct's members)
         * bits 16-23: unused
         * bits 24-28: kind (e.g. int, ptr, array...etc)
         * bits 29-30: unused
         * bit     31: kind_flag, currently used by
         *             struct, union and fwd
         */
        __u32 info;
        /* "size" is used by INT, ENUM, STRUCT, UNION, DATASEC and FLOAT.
         * "size" tells the size of the type it is describing.
         *
         * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
         * FUNC, FUNC_PROTO, VAR and DECL_TAG.
         * "type" is a type_id referring to another type.
         */
        union {
                __u32 size;
                __u32 type;
        };
    };

For certain kinds, the common data are followed by kind-specific data. The
``name_off`` in ``struct btf_type`` specifies the offset in the string table.
The following sections detail encoding of each kind.
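
The ``info`` bit arrangement above can be unpacked with a few shifts and
masks. The following is a sketch derived only from the documented bit
positions; the ``btf_sketch_*`` helper names are hypothetical::

    #include <stdint.h>

    /* bits 0-15: vlen */
    static inline uint32_t btf_sketch_vlen(uint32_t info)
    {
        return info & 0xffff;
    }

    /* bits 24-28: kind */
    static inline uint32_t btf_sketch_kind(uint32_t info)
    {
        return (info >> 24) & 0x1f;
    }

    /* bit 31: kind_flag */
    static inline uint32_t btf_sketch_kflag(uint32_t info)
    {
        return info >> 31;
    }

    /* Inverse operation: pack kind_flag, kind and vlen into "info". */
    static inline uint32_t btf_sketch_info(uint32_t kflag, uint32_t kind,
                                           uint32_t vlen)
    {
        return ((kflag & 1u) << 31) | ((kind & 0x1f) << 24) | (vlen & 0xffff);
    }

For instance, a struct (kind 4) with three members and ``kind_flag`` set is
encoded as ``0x84000003``.
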

2.2.1 BTF_KIND_INT
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: any valid offset
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_INT
 * ``info.vlen``: 0
 * ``size``: the size of the int type in bytes.

``btf_type`` is followed by a ``u32`` with the following bits arrangement::

  #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
  #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
  #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

The ``BTF_INT_ENCODING`` has the following attributes::

  #define BTF_INT_SIGNED  (1 << 0)
  #define BTF_INT_CHAR    (1 << 1)
  #define BTF_INT_BOOL    (1 << 2)

The ``BTF_INT_ENCODING()`` provides extra information: signedness, char, or
bool, for the int type. The char and bool encoding are mostly useful for
pretty print. At most one encoding can be specified for the int type.

The ``BTF_INT_BITS()`` specifies the number of actual bits held by this int
type. For example, a 4-bit bitfield is encoded with ``BTF_INT_BITS()`` equal
to 4. The ``btf_type.size * 8`` must be equal to or greater than
``BTF_INT_BITS()`` for the type. The maximum value of ``BTF_INT_BITS()`` is
128.

The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
for this int. For example, a bitfield struct member has:

 * btf member bit offset 100 from the start of the structure,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``

Then in the struct memory layout, this member will occupy ``4`` bits starting
from bits ``100 + 2 = 102``.

Alternatively, the bitfield struct member can be the following to access the
same bits as the above:

 * btf member bit offset 102,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``

The original intention of ``BTF_INT_OFFSET()`` is to provide flexibility of
bitfield encoding. Currently, both llvm and pahole generate
``BTF_INT_OFFSET() = 0`` for all int types.
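
To make the two equivalent bitfield encodings above concrete, the sketch
below packs the ``u32`` that follows ``btf_type`` and computes the first bit a
member occupies. The macros are copied from this section; the two
``btf_sketch_*`` helpers are hypothetical names used only for illustration::

    #include <stdint.h>

    #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
    #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
    #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)
    #define BTF_INT_SIGNED  (1 << 0)

    /* Pack encoding, bit offset and bit count into the trailing u32. */
    static inline uint32_t btf_sketch_int_data(uint32_t encoding,
                                               uint32_t offset, uint32_t bits)
    {
        return ((encoding & 0x0f) << 24) | ((offset & 0xff) << 16) |
               (bits & 0xff);
    }

    /* First bit, counted from the start of the struct, that a member with
     * the given bit offset occupies when it points to this int type.
     */
    static inline uint32_t btf_sketch_first_bit(uint32_t member_bit_off,
                                                uint32_t int_data)
    {
        return member_bit_off + BTF_INT_OFFSET(int_data);
    }

With these helpers, a member at bit offset 100 whose int type has
``BTF_INT_OFFSET() = 2`` and a member at bit offset 102 whose int type has
``BTF_INT_OFFSET() = 0`` both start at bit 102.
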

2.2.2 BTF_KIND_PTR
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_PTR
  * ``info.vlen``: 0
  * ``type``: the pointee type of the pointer

No additional type data follow ``btf_type``.

2.2.3 BTF_KIND_ARRAY
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_ARRAY
  * ``info.vlen``: 0
  * ``size/type``: 0, not used

``btf_type`` is followed by one ``struct btf_array``::

    struct btf_array {
        __u32   type;
        __u32   index_type;
        __u32   nelems;
    };

The ``struct btf_array`` encoding:
  * ``type``: the element type
  * ``index_type``: the index type
  * ``nelems``: the number of elements for this array (``0`` is also allowed).

The ``index_type`` can be any regular int type (``u8``, ``u16``, ``u32``,
``u64``, ``unsigned __int128``). The original design of including
``index_type`` follows DWARF, which has an ``index_type`` for its array type.
Currently in BTF, beyond type verification, the ``index_type`` is not used.

The ``struct btf_array`` allows chaining through element type to represent
multidimensional arrays. For example, for ``int a[5][6]``, the following type
information illustrates the chaining:

  * [1]: int
  * [2]: array, ``btf_array.type = [1]``, ``btf_array.nelems = 6``
  * [3]: array, ``btf_array.type = [2]``, ``btf_array.nelems = 5``

Currently, both pahole and llvm collapse a multidimensional array into a
one-dimensional array, e.g., for ``a[5][6]``, the ``btf_array.nelems`` is
equal to ``30``. This is because the original use case is map pretty print
where the whole array is dumped out so a one-dimensional array is enough. As
more BTF usage is explored, pahole and llvm can be changed to generate proper
chained representation for multidimensional arrays.

2.2.4 BTF_KIND_STRUCT
~~~~~~~~~~~~~~~~~~~~~
2.2.5 BTF_KIND_UNION
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 or 1
  * ``info.kind``: BTF_KIND_STRUCT or BTF_KIND_UNION
  * ``info.vlen``: the number of struct/union members
  * ``size``: the size of the struct/union in bytes

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_member``::

    struct btf_member {
        __u32   name_off;
        __u32   type;
        __u32   offset;
    };

``struct btf_member`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``type``: the member type
  * ``offset``: <see below>

If the type info ``kind_flag`` is not set, the offset contains only the bit
offset of the member. Note that the base type of the bitfield can only be int
or enum type. If the bitfield size is 32, the base type can be either int or
enum type. If the bitfield size is not 32, the base type must be int, and int
type ``BTF_INT_BITS()`` encodes the bitfield size.

If the ``kind_flag`` is set, the ``btf_member.offset`` contains both member
bitfield size and bit offset. The bitfield size and bit offset are calculated
as below::

  #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
  #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

In this case, if the base type is an int type, it must be a regular int type:

  * ``BTF_INT_OFFSET()`` must be 0.
  * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.
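
The two interpretations of ``btf_member.offset`` can be sketched with one
decoding function. The macros are those shown above; ``struct
member_pos_sketch`` and ``btf_sketch_member_pos()`` are hypothetical names
introduced for this example::

    #include <stdint.h>

    #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
    #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

    struct member_pos_sketch {
        uint32_t bit_offset;    /* bit offset from the start of the struct */
        uint32_t bitfield_size; /* 0 if not encoded in btf_member.offset */
    };

    /* Decode btf_member.offset according to the kind_flag of the enclosing
     * struct/union type.
     */
    static struct member_pos_sketch
    btf_sketch_member_pos(uint32_t kind_flag, uint32_t offset)
    {
        struct member_pos_sketch pos;

        if (kind_flag) {
            pos.bit_offset = BTF_MEMBER_BIT_OFFSET(offset);
            pos.bitfield_size = BTF_MEMBER_BITFIELD_SIZE(offset);
        } else {
            /* The whole field is the bit offset; a bitfield size, if any,
             * comes from BTF_INT_BITS() of the member's int type.
             */
            pos.bit_offset = offset;
            pos.bitfield_size = 0;
        }
        return pos;
    }

For example, with ``kind_flag`` set, an ``offset`` value of ``0x03000002``
describes a 3-bit bitfield starting at bit 2.
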

The following kernel patch introduced ``kind_flag`` and explained why both
modes exist:

  https://github.com/torvalds/linux/commit/9d5f9f701b1891466fb3dbb1806ad97716f95cc3#diff-fa650a64fdd3968396883d2fe8215ff3

2.2.6 BTF_KIND_ENUM
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_ENUM
  * ``info.vlen``: number of enum values
  * ``size``: 4

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``::

    struct btf_enum {
        __u32   name_off;
        __s32   val;
    };

The ``btf_enum`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``val``: any value

2.2.7 BTF_KIND_FWD
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0 for struct, 1 for union
  * ``info.kind``: BTF_KIND_FWD
  * ``info.vlen``: 0
  * ``type``: 0

No additional type data follow ``btf_type``.

2.2.8 BTF_KIND_TYPEDEF
~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_TYPEDEF
  * ``info.vlen``: 0
  * ``type``: the type which can be referred by name at ``name_off``

No additional type data follow ``btf_type``.

2.2.9 BTF_KIND_VOLATILE
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VOLATILE
  * ``info.vlen``: 0
  * ``type``: the type with ``volatile`` qualifier

No additional type data follow ``btf_type``.

2.2.10 BTF_KIND_CONST
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_CONST
  * ``info.vlen``: 0
  * ``type``: the type with ``const`` qualifier

No additional type data follow ``btf_type``.

2.2.11 BTF_KIND_RESTRICT
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_RESTRICT
  * ``info.vlen``: 0
  * ``type``: the type with ``restrict`` qualifier

No additional type data follow ``btf_type``.

2.2.12 BTF_KIND_FUNC
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC
  * ``info.vlen``: 0
  * ``type``: a BTF_KIND_FUNC_PROTO type

No additional type data follow ``btf_type``.

A BTF_KIND_FUNC defines not a type, but a subprogram (function) whose
signature is defined by ``type``. The subprogram is thus an instance of that
type. The BTF_KIND_FUNC may in turn be referenced by a func_info in the
:ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
(ABI).

2.2.13 BTF_KIND_FUNC_PROTO
~~~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC_PROTO
  * ``info.vlen``: # of parameters
  * ``type``: the return type

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_param``::

    struct btf_param {
        __u32   name_off;
        __u32   type;
    };

If a BTF_KIND_FUNC_PROTO type is referred by a BTF_KIND_FUNC type, then
``btf_param.name_off`` must point to a valid C identifier except for the
possible last argument representing the variable argument. The
``btf_param.type`` refers to the parameter type.

If the function has variable arguments, the last parameter is encoded with
``name_off = 0`` and ``type = 0``.

2.2.14 BTF_KIND_VAR
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VAR
  * ``info.vlen``: 0
  * ``type``: the type of the variable

``btf_type`` is followed by a single ``struct btf_var`` with the
following data::

    struct btf_var {
        __u32   linkage;
    };

``struct btf_var`` encoding:
  * ``linkage``: currently either 0 (static variable) or 1 (globally
    allocated variable in ELF sections)

Not all types of global variables are supported by LLVM at this point.
The following is currently available:

  * static variables with or without section attributes
  * global variables with section attributes

The latter is for future extraction of map key/value type id's from a
map definition.

2.2.15 BTF_KIND_DATASEC
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid name associated with a variable or
    one of .data/.bss/.rodata
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_DATASEC
  * ``info.vlen``: # of variables
  * ``size``: total section size in bytes (0 at compilation time, patched
    to actual size by BPF loaders such as libbpf)

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_var_secinfo``::

    struct btf_var_secinfo {
        __u32   type;
        __u32   offset;
        __u32   size;
    };

``struct btf_var_secinfo`` encoding:
  * ``type``: the type of the BTF_KIND_VAR variable
  * ``offset``: the in-section offset of the variable
  * ``size``: the size of the variable in bytes

2.2.16 BTF_KIND_FLOAT
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: any valid offset
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_FLOAT
 * ``info.vlen``: 0
 * ``size``: the size of the float type in bytes: 2, 4, 8, 12 or 16.

No additional type data follow ``btf_type``.

2.2.17 BTF_KIND_DECL_TAG
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a non-empty string
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_DECL_TAG
 * ``info.vlen``: 0
 * ``type``: ``struct``, ``union``, ``func``, ``var`` or ``typedef``

``btf_type`` is followed by ``struct btf_decl_tag``::

    struct btf_decl_tag {
        __u32   component_idx;
    };

The ``name_off`` encodes the ``btf_decl_tag`` attribute string.
The ``type`` should be ``struct``, ``union``, ``func``, ``var`` or ``typedef``.
For ``var`` or ``typedef`` type, ``btf_decl_tag.component_idx`` must be ``-1``.
For the other three types, if the ``btf_decl_tag`` attribute is
applied to the ``struct``, ``union`` or ``func`` itself,
``btf_decl_tag.component_idx`` must be ``-1``. Otherwise,
the attribute is applied to a ``struct``/``union`` member or
a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a
valid index (starting from 0) pointing to a member or an argument.
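
Because the type section is parsed sequentially, a parser has to know how many
bytes of kind-specific data follow each ``struct btf_type``. The sketch below
summarizes the per-kind layouts described in the preceding subsections; the
function name ``btf_sketch_trailing_size`` is hypothetical, and the byte
counts assume the structure definitions shown in this document::

    #include <stdint.h>

    #define BTF_KIND_INT            1
    #define BTF_KIND_PTR            2
    #define BTF_KIND_ARRAY          3
    #define BTF_KIND_STRUCT         4
    #define BTF_KIND_UNION          5
    #define BTF_KIND_ENUM           6
    #define BTF_KIND_FWD            7
    #define BTF_KIND_TYPEDEF        8
    #define BTF_KIND_VOLATILE       9
    #define BTF_KIND_CONST          10
    #define BTF_KIND_RESTRICT       11
    #define BTF_KIND_FUNC           12
    #define BTF_KIND_FUNC_PROTO     13
    #define BTF_KIND_VAR            14
    #define BTF_KIND_DATASEC        15
    #define BTF_KIND_FLOAT          16
    #define BTF_KIND_DECL_TAG       17

    /* Bytes of kind-specific data after struct btf_type, or -1 if the kind
     * is not one of the kinds listed in this document.
     */
    static int btf_sketch_trailing_size(uint32_t kind, uint32_t vlen)
    {
        vlen &= 0xffff; /* vlen is a 16-bit field */

        switch (kind) {
        case BTF_KIND_INT:        return 4;              /* one u32 */
        case BTF_KIND_ARRAY:      return 12;             /* btf_array */
        case BTF_KIND_STRUCT:
        case BTF_KIND_UNION:      return 12 * (int)vlen; /* btf_member[] */
        case BTF_KIND_ENUM:       return 8 * (int)vlen;  /* btf_enum[] */
        case BTF_KIND_FUNC_PROTO: return 8 * (int)vlen;  /* btf_param[] */
        case BTF_KIND_VAR:        return 4;              /* btf_var */
        case BTF_KIND_DATASEC:    return 12 * (int)vlen; /* btf_var_secinfo[] */
        case BTF_KIND_DECL_TAG:   return 4;              /* btf_decl_tag */
        case BTF_KIND_PTR:
        case BTF_KIND_FWD:
        case BTF_KIND_TYPEDEF:
        case BTF_KIND_VOLATILE:
        case BTF_KIND_CONST:
        case BTF_KIND_RESTRICT:
        case BTF_KIND_FUNC:
        case BTF_KIND_FLOAT:      return 0;              /* no extra data */
        default:                  return -1;
        }
    }

A walker over the type section would advance by ``sizeof(struct btf_type)``
(12 bytes) plus this value for every type it visits, assigning type ids
starting from ``1``.
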

3. BTF Kernel API
*****************

The following bpf syscall commands involve BTF:
   * BPF_BTF_LOAD: load a blob of BTF data into kernel
   * BPF_MAP_CREATE: map creation with btf key and value type info.
   * BPF_PROG_LOAD: prog load with btf function and line info.
   * BPF_BTF_GET_FD_BY_ID: get a btf fd
   * BPF_OBJ_GET_INFO_BY_FD: btf, func_info, line_info
     and other btf related info are returned.

The workflow typically looks like:
::

  Application:
      BPF_BTF_LOAD
          |
          V
      BPF_MAP_CREATE and BPF_PROG_LOAD
          |
          V
      ......

  Introspection tool:
      ......
      BPF_{PROG,MAP}_GET_NEXT_ID (get prog/map id's)
          |
          V
      BPF_{PROG,MAP}_GET_FD_BY_ID (get a prog/map fd)
          |
          V
      BPF_OBJ_GET_INFO_BY_FD (get bpf_prog_info/bpf_map_info with btf_id)
          |                                     |
          V                                     |
      BPF_BTF_GET_FD_BY_ID (get btf_fd)         |
          |                                     |
          V                                     |
      BPF_OBJ_GET_INFO_BY_FD (get btf)          |
          |                                     |
          V                                     V
      pretty print types, dump func signatures and line info, etc.


3.1 BPF_BTF_LOAD
================

Load a blob of BTF data into kernel. A blob of data, described in
:ref:`BTF_Type_String`, can be directly loaded into the kernel. A ``btf_fd``
is returned to userspace.

3.2 BPF_MAP_CREATE
==================

A map can be created with ``btf_fd`` and specified key/value type ids::

    __u32   btf_fd;         /* fd pointing to a BTF type data */
    __u32   btf_key_type_id;        /* BTF type_id of the key */
    __u32   btf_value_type_id;      /* BTF type_id of the value */

In libbpf, the map can be defined with extra annotation like below:
::

    struct bpf_map_def SEC("maps") btf_map = {
        .type = BPF_MAP_TYPE_ARRAY,
        .key_size = sizeof(int),
        .value_size = sizeof(struct ipv_counts),
        .max_entries = 4,
    };
    BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts);

Here, the parameters for macro BPF_ANNOTATE_KV_PAIR are map name, key and
value types for the map. During ELF parsing, libbpf is able to extract
key/value type_id's and assign them to BPF_MAP_CREATE attributes
automatically.

.. _BPF_Prog_Load:

3.3 BPF_PROG_LOAD
=================

During prog_load, func_info and line_info can be passed to kernel with proper
values for the following attributes:
::

    __u32           insn_cnt;
    __aligned_u64   insns;
    ......
    __u32           prog_btf_fd;    /* fd pointing to BTF type data */
    __u32           func_info_rec_size;     /* userspace bpf_func_info size */
    __aligned_u64   func_info;      /* func info */
    __u32           func_info_cnt;  /* number of bpf_func_info records */
    __u32           line_info_rec_size;     /* userspace bpf_line_info size */
    __aligned_u64   line_info;      /* line info */
    __u32           line_info_cnt;  /* number of bpf_line_info records */

The func_info and line_info are arrays of the structures below, respectively::

    struct bpf_func_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   type_id;  /* pointing to a BTF_KIND_FUNC type */
    };
    struct bpf_line_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   file_name_off; /* offset to string table for the filename */
        __u32   line_off; /* offset to string table for the source line */
        __u32   line_col; /* line number and column number */
    };

func_info_rec_size is the size of each func_info record, and
line_info_rec_size is the size of each line_info record. Passing the record
size to kernel makes it possible to extend the record itself in the future.

Below are requirements for func_info:
  * func_info[0].insn_off must be 0.
  * the func_info insn_off is in strictly increasing order and matches
    bpf func boundaries.

Below are requirements for line_info:
  * the first insn in each func must have a line_info record pointing to it.
  * the line_info insn_off is in strictly increasing order.

For line_info, the line number and column number are defined as below:
::

    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)
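
As an illustration of this packing, the sketch below builds a ``line_col``
value and reads it back with the two macros. The packing helper
``bpf_sketch_line_col()`` is a hypothetical name and is simply the inverse of
the macros above::

    #include <stdint.h>

    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)

    /* Pack a line number (upper 22 bits) and a column (lower 10 bits). */
    static inline uint32_t bpf_sketch_line_col(uint32_t line, uint32_t col)
    {
        return (line << 10) | (col & 0x3ff);
    }

Line 42, column 7 is therefore stored as ``42 * 1024 + 7 = 43015``. The layout
leaves 10 bits for the column and 22 bits for the line number, so larger
values cannot be represented.
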

3.4 BPF_{PROG,MAP}_GET_NEXT_ID
==============================

In kernel, every loaded program, map or btf has a unique id. The id won't
change during the lifetime of a program, map, or btf.

The bpf syscall commands BPF_{PROG,MAP}_GET_NEXT_ID return the ids of bpf
programs or maps, respectively, to user space, one id per call, so an
inspection tool can iterate over all programs and maps.

3.5 BPF_{PROG,MAP}_GET_FD_BY_ID
===============================

An introspection tool cannot use an id directly to get details about a program
or map. A file descriptor needs to be obtained first for reference-counting
purposes.

3.6 BPF_OBJ_GET_INFO_BY_FD
==========================

Once a program/map fd is acquired, an introspection tool can get the detailed
information from kernel about this fd, some of which is BTF-related. For
example, ``bpf_map_info`` returns ``btf_id`` and key/value type ids.
``bpf_prog_info`` returns ``btf_id``, func_info, and line info for translated
bpf byte codes, and jited_line_info.

3.7 BPF_BTF_GET_FD_BY_ID
========================

With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``, bpf
syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd. Then, with
command BPF_OBJ_GET_INFO_BY_FD, the btf blob, originally loaded into the
kernel with BPF_BTF_LOAD, can be retrieved.

With the btf blob, ``bpf_map_info``, and ``bpf_prog_info``, an introspection
tool has full btf knowledge and is able to pretty print map key/values, dump
func signatures and line info, along with byte/jit codes.

4. ELF File Format Interface
****************************

4.1 .BTF section
================

The .BTF section contains type and string data. The format of this section is
the same as the one described in :ref:`BTF_Type_String`.

.. _BTF_Ext_Section:

4.2 .BTF.ext section
====================

The .BTF.ext section encodes func_info and line_info which need loader
manipulation before loading into the kernel.

The specification for .BTF.ext section is defined at ``tools/lib/bpf/btf.h``
and ``tools/lib/bpf/btf.c``.

The current header of .BTF.ext section::

    struct btf_ext_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   func_info_off;
        __u32   func_info_len;
        __u32   line_info_off;
        __u32   line_info_len;
    };

It is very similar to the .BTF section. Instead of type/string sections, it
contains func_info and line_info sections. See :ref:`BPF_Prog_Load` for
details about func_info and line_info record format.

The func_info is organized as below::

     func_info_rec_size
     btf_ext_info_sec for section #1 /* func_info for section #1 */
     btf_ext_info_sec for section #2 /* func_info for section #2 */
     ...

``func_info_rec_size`` specifies the size of ``bpf_func_info`` structure when
.BTF.ext is generated. ``btf_ext_info_sec``, defined below, is a collection of
func_info for each specific ELF section::

     struct btf_ext_info_sec {
        __u32   sec_name_off; /* offset to section name */
        __u32   num_info;
        /* Followed by num_info * record_size number of bytes */
        __u8    data[0];
     };

Here, num_info must be greater than 0.
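
The layout above can be walked with a small loop. The following sketch is not
libbpf code: ``btf_ext_sketch_count_records()`` is a hypothetical helper, and
it assumes the caller passes the ``func_info_len`` (or ``line_info_len``)
bytes that start at the record size field, in host byte order::

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Count the records in a func_info or line_info area. Returns the total
     * number of records, or -1 if the area is malformed.
     */
    static long btf_ext_sketch_count_records(const uint8_t *data, size_t len)
    {
        uint32_t rec_size, num_info;
        size_t pos = 0;
        long total = 0;

        if (len < sizeof(rec_size))
            return -1;
        memcpy(&rec_size, data, sizeof(rec_size)); /* *_info_rec_size */
        pos += sizeof(rec_size);
        if (rec_size == 0)
            return -1;

        while (pos < len) {
            /* btf_ext_info_sec: sec_name_off, num_info, then the records */
            if (len - pos < 8)
                return -1;
            memcpy(&num_info, data + pos + 4, sizeof(num_info));
            pos += 8;
            if (num_info == 0)
                return -1; /* num_info must be greater than 0 */
            if ((uint64_t)num_info * rec_size > len - pos)
                return -1; /* records would run past the area */
            pos += (size_t)num_info * rec_size;
            total += num_info;
        }
        return total;
    }

Using the record size read from the data, rather than a compile-time
``sizeof``, is what allows the record format to grow in the future.
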

The line_info is organized as below::

     line_info_rec_size
     btf_ext_info_sec for section #1 /* line_info for section #1 */
     btf_ext_info_sec for section #2 /* line_info for section #2 */
     ...

``line_info_rec_size`` specifies the size of ``bpf_line_info`` structure when
.BTF.ext is generated.

The interpretation of ``bpf_func_info->insn_off`` and
``bpf_line_info->insn_off`` is different between kernel API and ELF API. For
kernel API, the ``insn_off`` is the instruction offset in the unit of ``struct
bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the
beginning of the section named by ``btf_ext_info_sec->sec_name_off``.
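
The unit conversion between the two interpretations can be sketched as below.
This assumes that ``struct bpf_insn`` is 8 bytes and that the section is
loaded as a single program starting at instruction 0; a real loader such as
libbpf may need further adjustments. The names are hypothetical::

    #include <stdint.h>

    #define BPF_INSN_SIZE_SKETCH 8 /* assumed sizeof(struct bpf_insn) */

    /* Convert an ELF-style byte offset into a kernel-style instruction
     * index. Returns 0 on success, -1 if the offset is not a multiple of
     * the instruction size.
     */
    static int btf_ext_sketch_insn_index(uint32_t byte_off, uint32_t *insn_idx)
    {
        if (byte_off % BPF_INSN_SIZE_SKETCH)
            return -1;
        *insn_idx = byte_off / BPF_INSN_SIZE_SKETCH;
        return 0;
    }

For example, a record with an ELF ``insn_off`` of 16 bytes refers to the third
instruction of its section, that is, instruction index 2.
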

4.3 .BTF_ids section
====================

The .BTF_ids section encodes BTF ID values that are used within the kernel.

This section is created during the kernel compilation with the help of
macros defined in ``include/linux/btf_ids.h`` header file. Kernel code can
use them to create lists and sets (sorted lists) of BTF ID values.

The ``BTF_ID_LIST`` and ``BTF_ID`` macros define an unsorted list of BTF ID
values, with the following syntax::

  BTF_ID_LIST(list)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)

resulting in the following layout in .BTF_ids section::

  __BTF_ID__type1__name1__1:
  .zero 4
  __BTF_ID__type2__name2__2:
  .zero 4

The ``u32 list[];`` variable is defined to access the list.

The ``BTF_ID_UNUSED`` macro defines 4 zero bytes. It's used when we
want to define an unused entry in BTF_ID_LIST, like::

      BTF_ID_LIST(bpf_skb_output_btf_ids)
      BTF_ID(struct, sk_buff)
      BTF_ID_UNUSED
      BTF_ID(struct, task_struct)

The ``BTF_SET_START/END`` macro pair defines a sorted list of BTF ID values
and their count, with the following syntax::

  BTF_SET_START(set)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)
  BTF_SET_END(set)

resulting in the following layout in .BTF_ids section::

  __BTF_ID__set__set:
  .zero 4
  __BTF_ID__type1__name1__3:
  .zero 4
  __BTF_ID__type2__name2__4:
  .zero 4

The ``struct btf_id_set set;`` variable is defined to access the set.

The ``typeX`` name can be one of the following::

   struct, union, typedef, func

and is used as a filter when resolving the BTF ID value.

All the BTF ID lists and sets are compiled in the .BTF_ids section and
resolved during the linking phase of kernel build by ``resolve_btfids`` tool.

5. Using BTF
************

5.1 bpftool map pretty print
============================

With BTF, the map key/value can be printed based on fields rather than simply
raw bytes. This is especially valuable for large structures or if your data
structure has bitfields. For example, for the following map::

      enum A { A1, A2, A3, A4, A5 };
      typedef enum A ___A;
      struct tmp_t {
           char a1:4;
           int  a2:4;
           int  :4;
           __u32 a3:4;
           int b;
           ___A b1:4;
           enum A b2:4;
      };
      struct bpf_map_def SEC("maps") tmpmap = {
           .type = BPF_MAP_TYPE_ARRAY,
           .key_size = sizeof(__u32),
           .value_size = sizeof(struct tmp_t),
           .max_entries = 1,
      };
      BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t);

bpftool is able to pretty print like below:
::

      [{
            "key": 0,
            "value": {
                "a1": 0x2,
                "a2": 0x4,
                "a3": 0x6,
                "b": 7,
                "b1": 0x8,
                "b2": 0xa
            }
        }
      ]

5.2 bpftool prog dump
=====================

The following is an example showing how func_info and line_info can help prog
dump with better kernel symbol names, function prototypes and line
information::

    $ bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv
    [...]
    int test_long_fname_2(struct dummy_tracepoint_args * arg):
    bpf_prog_44a040bf25481309_test_long_fname_2:
    ; static int test_long_fname_2(struct dummy_tracepoint_args *arg)
       0:   push   %rbp
       1:   mov    %rsp,%rbp
       4:   sub    $0x30,%rsp
       b:   sub    $0x28,%rbp
       f:   mov    %rbx,0x0(%rbp)
      13:   mov    %r13,0x8(%rbp)
      17:   mov    %r14,0x10(%rbp)
      1b:   mov    %r15,0x18(%rbp)
      1f:   xor    %eax,%eax
      21:   mov    %rax,0x20(%rbp)
      25:   xor    %esi,%esi
    ; int key = 0;
      27:   mov    %esi,-0x4(%rbp)
    ; if (!arg->sock)
      2a:   mov    0x8(%rdi),%rdi
    ; if (!arg->sock)
      2e:   cmp    $0x0,%rdi
      32:   je     0x0000000000000070
      34:   mov    %rbp,%rsi
    ; counts = bpf_map_lookup_elem(&btf_map, &key);
    [...]

5.3 Verifier Log
================

The following is an example of how line_info can help debug a verification
failure::

       /* The code at tools/testing/selftests/bpf/test_xdp_noinline.c
        * is modified as below.
        */
       data = (void *)(long)xdp->data;
       data_end = (void *)(long)xdp->data_end;
       /*
       if (data + 4 > data_end)
               return XDP_DROP;
       */
       *(u32 *)data = dst->dst;

    $ bpftool prog load ./test_xdp_noinline.o /sys/fs/bpf/test_xdp_noinline type xdp
        ; data = (void *)(long)xdp->data;
        224: (79) r2 = *(u64 *)(r10 -112)
        225: (61) r2 = *(u32 *)(r2 +0)
        ; *(u32 *)data = dst->dst;
        226: (63) *(u32 *)(r2 +0) = r1
        invalid access to packet, off=0 size=4, R2(id=0,off=0,r=0)
        R2 offset is outside of the packet
899
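The store is rejected because the bounds check was commented out, so the
verifier cannot prove that the 4-byte write stays inside the packet. The
following is a minimal user-space sketch of that check, not the selftest code
itself: ``struct pkt_sketch``, ``store_dst_checked()`` and the ``SKETCH_XDP_*``
constants are hypothetical stand-ins so that the pattern can be compiled
outside the kernel::

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical stand-ins; values mirror XDP_DROP and XDP_PASS. */
    enum { SKETCH_XDP_DROP = 1, SKETCH_XDP_PASS = 2 };

    /* Hypothetical model of the window described by xdp->data and
     * xdp->data_end.
     */
    struct pkt_sketch {
        unsigned char *data;
        unsigned char *data_end;
    };

    /* Write a 32-bit value at the start of the packet, but only after
     * checking that 4 bytes are available.
     */
    static int store_dst_checked(struct pkt_sketch *pkt, uint32_t dst)
    {
        unsigned char *data = pkt->data;
        unsigned char *data_end = pkt->data_end;

        /* Same shape as the check that was commented out above. */
        if (data + 4 > data_end)
            return SKETCH_XDP_DROP;

        memcpy(data, &dst, sizeof(dst));
        return SKETCH_XDP_PASS;
    }

In an actual XDP program the same comparison against ``data_end`` would
presumably be restored in place, before the ``*(u32 *)data`` store.
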
6. BTF Generation
*****************

You need the latest pahole

  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/

or llvm (8.0 or later). pahole acts as a dwarf2btf converter. It doesn't
support .BTF.ext or the BTF_KIND_FUNC type yet. For example::

      -bash-4.4$ cat t.c
      struct t {
        int a:2;
        int b:3;
        int c:2;
      } g;
      -bash-4.4$ gcc -c -O2 -g t.c
      -bash-4.4$ pahole -JV t.o
      File t.o:
      [1] STRUCT t kind_flag=1 size=4 vlen=3
              a type_id=2 bitfield_size=2 bits_offset=0
              b type_id=2 bitfield_size=3 bits_offset=2
              c type_id=2 bitfield_size=2 bits_offset=5
      [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

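With ``kind_flag=1``, each member's ``offset`` word packs the bitfield size
into its upper 8 bits and the bit offset into its lower 24 bits, which is
where the ``bitfield_size`` and ``bits_offset`` values printed above come
from. A minimal sketch of that packing follows. The two macros mirror
``BTF_MEMBER_BITFIELD_SIZE`` and ``BTF_MEMBER_BIT_OFFSET`` from
``include/uapi/linux/btf.h`` but are redefined locally under ``SKETCH_`` names,
and ``member_offset_encode()`` is a hypothetical helper added only for
illustration::

    #include <stdint.h>

    /* Local copies of the uapi decoding macros. */
    #define SKETCH_MEMBER_BITFIELD_SIZE(val)  ((val) >> 24)
    #define SKETCH_MEMBER_BIT_OFFSET(val)     ((val) & 0xffffff)

    /* Hypothetical helper: pack both fields into one offset word. */
    static uint32_t member_offset_encode(uint32_t bitfield_size,
                                         uint32_t bit_offset)
    {
        return (bitfield_size << 24) | (bit_offset & 0xffffff);
    }

For member ``b`` above, ``member_offset_encode(3, 2)`` gives ``0x03000002``,
and the two macros recover 3 and 2 from it.
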
llvm can generate .BTF and .BTF.ext directly with -g, for the bpf target
only. The assembly output (-S) shows the BTF encoding in assembly
format::

    -bash-4.4$ cat t2.c
    typedef int __int32;
    struct t2 {
      int a2;
      int (*f2)(char q1, __int32 q2, ...);
      int (*f3)();
    } g2;
    int main() { return 0; }
    int test() { return 0; }
    -bash-4.4$ clang -c -g -O2 -target bpf t2.c
    -bash-4.4$ readelf -S t2.o
      ......
      [ 8] .BTF              PROGBITS         0000000000000000  00000247
           000000000000016e  0000000000000000           0     0     1
      [ 9] .BTF.ext          PROGBITS         0000000000000000  000003b5
           0000000000000060  0000000000000000           0     0     1
      [10] .rel.BTF.ext      REL              0000000000000000  000007e0
           0000000000000040  0000000000000010          16     9     8
      ......
    -bash-4.4$ clang -S -g -O2 -target bpf t2.c
    -bash-4.4$ cat t2.s
      ......
            .section        .BTF,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   220
            .long   220
            .long   122
            .long   0                       # BTF_KIND_FUNC_PROTO(id = 1)
            .long   218103808               # 0xd000000
            .long   2
            .long   83                      # BTF_KIND_INT(id = 2)
            .long   16777216                # 0x1000000
            .long   4
            .long   16777248                # 0x1000020
      ......
            .byte   0                       # string offset=0
            .ascii  ".text"                 # string offset=1
            .byte   0
            .ascii  "/home/yhs/tmp-pahole/t2.c" # string offset=7
            .byte   0
            .ascii  "int main() { return 0; }" # string offset=33
            .byte   0
            .ascii  "int test() { return 0; }" # string offset=58
            .byte   0
            .ascii  "int"                   # string offset=83
      ......
            .section        .BTF.ext,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   28
            .long   28
            .long   44
            .long   8                       # FuncInfo
            .long   1                       # FuncInfo section string offset=1
            .long   2
            .long   .Lfunc_begin0
            .long   3
            .long   .Lfunc_begin1
            .long   5
            .long   16                      # LineInfo
            .long   1                       # LineInfo section string offset=1
            .long   2
            .long   .Ltmp0
            .long   7
            .long   33
            .long   7182                    # Line 7 Col 14
            .long   .Ltmp3
            .long   7
            .long   58
            .long   8206                    # Line 8 Col 14

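The numbers in the assembly above can be checked by hand. The sketch below
assumes the layouts are exactly as printed: it adds up the header fields to
get the section size reported by readelf, and unpacks the ``info`` and
``line_col`` words. All helper names are hypothetical. The shifts and masks
follow the macros in ``include/uapi/linux/btf.h`` and
``include/uapi/linux/bpf.h``, with the kind mask assumed to be ``0x1f`` (older
headers use ``0x0f``, which gives the same result for these values)::

    #include <stdint.h>

    /* Size of a section whose two data areas are packed back to back
     * after the header, as in the example above.
     */
    static uint32_t btf_section_size(uint32_t hdr_len, uint32_t first_len,
                                     uint32_t second_len)
    {
        return hdr_len + first_len + second_len;
    }

    /* Kind and vlen fields of a btf_type info word. */
    static uint32_t info_kind(uint32_t info) { return (info >> 24) & 0x1f; }
    static uint32_t info_vlen(uint32_t info) { return info & 0xffff; }

    /* Line and column packed into a line_info line_col word. */
    static uint32_t line_num(uint32_t line_col) { return line_col >> 10; }
    static uint32_t line_col_num(uint32_t line_col) { return line_col & 0x3ff; }

``btf_section_size(24, 220, 122)`` is 366 (0x16e), the .BTF size in the readelf
output, and ``btf_section_size(24, 28, 44)`` is 96 (0x60), the .BTF.ext size.
``info_kind(0xd000000)`` is 13 (BTF_KIND_FUNC_PROTO) with ``info_vlen()`` 0, and
``line_num(7182)`` and ``line_col_num(7182)`` give 7 and 14.
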
7. Testing
**********

The kernel bpf selftest ``test_btf.c`` provides an extensive set of BTF-related
tests.
