.. contents::
.. sectnum::

========================================
eBPF Instruction Set Specification, v1.0
========================================

This document specifies version 1.0 of the eBPF instruction set.

Documentation conventions
=========================

For brevity, this document uses the type notation "u64", "u32", etc.
to mean an unsigned integer whose width is the specified number of bits,
and "s32", etc. to mean a signed integer of the specified number of bits.

Registers and calling convention
================================

eBPF has 10 general purpose registers and a read-only frame pointer register,
all of which are 64 bits wide.

The eBPF calling convention is defined as:

* R0: return value from function calls, and exit value for eBPF programs
* R1 - R5: arguments for function calls
* R6 - R9: callee saved registers that function calls will preserve
* R10: read-only frame pointer to access stack

R0 - R5 are scratch registers and eBPF programs need to spill/fill them if
necessary across calls.

Instruction encoding
====================

eBPF has two instruction encodings:

* the basic instruction encoding, which uses 64 bits to encode an instruction
* the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
  constant) value after the basic instruction for a total of 128 bits.

The fields that make up an encoded basic instruction are stored in the
following order::

  opcode:8 src_reg:4 dst_reg:4 offset:16 imm:32 // In little-endian BPF.
  opcode:8 dst_reg:4 src_reg:4 offset:16 imm:32 // In big-endian BPF.

**imm**
  signed integer immediate value

**offset**
  signed integer offset used with pointer arithmetic

**src_reg**
  the source register number (0-10), except where otherwise specified
  (`64-bit immediate instructions`_ reuse this field for other purposes)

**dst_reg**
  destination register number (0-10)

**opcode**
  operation to perform

Note that the contents of multi-byte fields ('imm' and 'offset') are
stored using big-endian byte ordering in big-endian BPF and
little-endian byte ordering in little-endian BPF.

For example::

  opcode                  offset imm          assembly
         src_reg dst_reg
  07     0       1        00 00  44 33 22 11  r1 += 0x11223344 // little
         dst_reg src_reg
  07     1       0        00 00  11 22 33 44  r1 += 0x11223344 // big

Note that most instructions do not use all of the fields.
Unused fields shall be cleared to zero.
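
As an illustration, the little-endian layout above could be decoded in C
roughly as follows.  This is a minimal sketch and not part of the
specification: ``struct insn`` and ``decode_le()`` are hypothetical names, and
the code assumes that in little-endian BPF the low four bits of the register
byte hold 'dst_reg' and the high four bits hold 'src_reg', which is consistent
with the example bytes shown above.

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical in-memory view of one decoded basic instruction. */
  struct insn {
      uint8_t opcode;
      uint8_t dst_reg;
      uint8_t src_reg;
      int16_t offset;
      int32_t imm;
  };

  /* Decode 8 bytes stored in little-endian BPF order. */
  static struct insn decode_le(const uint8_t b[8])
  {
      struct insn i;

      i.opcode  = b[0];
      i.dst_reg = b[1] & 0x0f;   /* assumed: low nibble is dst_reg  */
      i.src_reg = b[1] >> 4;     /* assumed: high nibble is src_reg */
      i.offset  = (int16_t)(uint16_t)(b[2] | (b[3] << 8));
      i.imm     = (int32_t)((uint32_t)b[4] | ((uint32_t)b[5] << 8) |
                            ((uint32_t)b[6] << 16) | ((uint32_t)b[7] << 24));
      return i;
  }

Applied to the bytes ``07 01 00 00 44 33 22 11``, this sketch yields opcode
0x07, dst_reg 1, src_reg 0, offset 0 and imm 0x11223344, matching the
little-endian row of the example.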

As discussed below in `64-bit immediate instructions`_, a 64-bit immediate
instruction uses a 64-bit immediate value that is constructed as follows.
The 64 bits following the basic instruction contain a pseudo instruction
using the same format but with opcode, dst_reg, src_reg, and offset all set to zero,
and imm containing the high 32 bits of the immediate value.

This is depicted in the following figure::

        basic_instruction
  .-----------------------------.
  |                             |
  code:8 regs:8 offset:16 imm:32 unused:32 imm:32
                                 |              |
                                 '--------------'
                                pseudo instruction

Thus the 64-bit immediate value is constructed as follows::

  imm64 = (next_imm << 32) | imm

where 'next_imm' refers to the imm value of the pseudo instruction
following the basic instruction.  The unused bytes in the pseudo
instruction are reserved and shall be cleared to zero.
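
The construction of the 64-bit immediate can be sketched in C as shown here.
The function name ``make_imm64()`` is hypothetical, and the sketch assumes
that both halves are treated as unsigned 32-bit quantities when they are
combined, so that a negative 'imm' does not sign-extend into the upper half.

.. code-block:: c

  #include <stdint.h>

  /* Combine the 'imm' fields of the basic and pseudo instructions. */
  static uint64_t make_imm64(int32_t imm, int32_t next_imm)
  {
      /* Casting through uint32_t keeps the low half from sign-extending. */
      return ((uint64_t)(uint32_t)next_imm << 32) | (uint32_t)imm;
  }

For instance, an 'imm' of 0x55667788 and a 'next_imm' of 0x11223344 combine
into 0x1122334455667788.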

Instruction classes
-------------------

The three least significant bits (LSB) of the 'opcode' field store the
instruction class:

=========  =====  ===============================  ===================================
class      value  description                      reference
=========  =====  ===============================  ===================================
BPF_LD     0x00   non-standard load operations     `Load and store instructions`_
BPF_LDX    0x01   load into register operations    `Load and store instructions`_
BPF_ST     0x02   store from immediate operations  `Load and store instructions`_
BPF_STX    0x03   store from register operations   `Load and store instructions`_
BPF_ALU    0x04   32-bit arithmetic operations     `Arithmetic and jump instructions`_
BPF_JMP    0x05   64-bit jump operations           `Arithmetic and jump instructions`_
BPF_JMP32  0x06   32-bit jump operations           `Arithmetic and jump instructions`_
BPF_ALU64  0x07   64-bit arithmetic operations     `Arithmetic and jump instructions`_
=========  =====  ===============================  ===================================

Arithmetic and jump instructions
================================

For arithmetic and jump instructions (``BPF_ALU``, ``BPF_ALU64``, ``BPF_JMP`` and
``BPF_JMP32``), the 8-bit 'opcode' field is divided into three parts:

==============  ======  =================
4 bits (MSB)    1 bit   3 bits (LSB)
==============  ======  =================
code            source  instruction class
==============  ======  =================

**code**
  the operation code, whose meaning varies by instruction class

**source**
  the source operand location, which unless otherwise specified is one of:

  ======  =====  ==============================================
  source  value  description
  ======  =====  ==============================================
  BPF_K   0x00   use 32-bit 'imm' value as source operand
  BPF_X   0x08   use 'src_reg' register value as source operand
  ======  =====  ==============================================

**instruction class**
  the instruction class (see `Instruction classes`_)
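
The three parts of the 'opcode' field can be extracted with simple masks.
The following is only a sketch; the helper names are hypothetical, and each
part is kept at its bit position within the opcode (not shifted down), which
matches values such as ``BPF_X`` = 0x08.

.. code-block:: c

  #include <stdint.h>

  /* 3 LSB: instruction class, e.g. 0x07 for BPF_ALU64. */
  static uint8_t op_class(uint8_t opcode)
  {
      return opcode & 0x07;
  }

  /* 1 bit: source operand location, 0x00 (BPF_K) or 0x08 (BPF_X). */
  static uint8_t op_source(uint8_t opcode)
  {
      return opcode & 0x08;
  }

  /* 4 MSB: operation code, left in the high nibble. */
  static uint8_t op_code(uint8_t opcode)
  {
      return opcode & 0xf0;
  }

For example, opcode 0x0f splits into a class of 0x07 (``BPF_ALU64``), a
source of 0x08 (``BPF_X``) and a code of 0x00.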

Arithmetic instructions
-----------------------

``BPF_ALU`` uses 32-bit wide operands while ``BPF_ALU64`` uses 64-bit wide operands for
otherwise identical operations.
The 'code' field encodes the operation as below, where 'src' and 'dst' refer
to the values of the source and destination registers, respectively.

========  =====  ==========================================================
code      value  description
========  =====  ==========================================================
BPF_ADD   0x00   dst += src
BPF_SUB   0x10   dst -= src
BPF_MUL   0x20   dst \*= src
BPF_DIV   0x30   dst = (src != 0) ? (dst / src) : 0
BPF_OR    0x40   dst \|= src
BPF_AND   0x50   dst &= src
BPF_LSH   0x60   dst <<= (src & mask)
BPF_RSH   0x70   dst >>= (src & mask)
BPF_NEG   0x80   dst = -dst
BPF_MOD   0x90   dst = (src != 0) ? (dst % src) : dst
BPF_XOR   0xa0   dst ^= src
BPF_MOV   0xb0   dst = src
BPF_ARSH  0xc0   sign extending dst >>= (src & mask)
BPF_END   0xd0   byte swap operations (see `Byte swap instructions`_ below)
========  =====  ==========================================================

Underflow and overflow are allowed during arithmetic operations, meaning
the 64-bit or 32-bit value will wrap. If eBPF program execution would
result in division by zero, the destination register is instead set to zero.
If execution would result in modulo by zero, for ``BPF_ALU64`` the value of
the destination register is unchanged whereas for ``BPF_ALU`` the upper
32 bits of the destination register are zeroed.

``BPF_ADD | BPF_X | BPF_ALU`` means::

  dst = (u32) ((u32) dst + (u32) src)

where '(u32)' indicates that the upper 32 bits are zeroed.

``BPF_ADD | BPF_X | BPF_ALU64`` means::

  dst = dst + src

``BPF_XOR | BPF_K | BPF_ALU`` means::

  dst = (u32) dst ^ (u32) imm32

``BPF_XOR | BPF_K | BPF_ALU64`` means::

  dst = dst ^ imm32

Also note that the division and modulo operations are unsigned. Thus, for
``BPF_ALU``, 'imm' is first interpreted as an unsigned 32-bit value, whereas
for ``BPF_ALU64``, 'imm' is first sign extended to 64 bits and the result
interpreted as an unsigned 64-bit value. There are no instructions for
signed division or modulo.

Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31)
for 32-bit operations.
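
The division, modulo and shift rules above can be illustrated with a few C
helpers.  This is a sketch under stated assumptions, not a reference
implementation: the function names are hypothetical, and each function takes
and returns a full 64-bit register value.

.. code-block:: c

  #include <stdint.h>

  /* BPF_DIV and BPF_MOD for BPF_ALU64: unsigned 64-bit operands. */
  static uint64_t alu64_div(uint64_t dst, uint64_t src)
  {
      return src != 0 ? dst / src : 0;
  }

  static uint64_t alu64_mod(uint64_t dst, uint64_t src)
  {
      return src != 0 ? dst % src : dst;
  }

  /* BPF_ALU variants: use the low 32 bits; the upper 32 bits end up zero. */
  static uint64_t alu32_div(uint64_t dst, uint64_t src)
  {
      uint32_t d = (uint32_t)dst, s = (uint32_t)src;

      return s != 0 ? d / s : 0;
  }

  static uint64_t alu32_mod(uint64_t dst, uint64_t src)
  {
      uint32_t d = (uint32_t)dst, s = (uint32_t)src;

      return s != 0 ? d % s : d;
  }

  /* BPF_K operand for BPF_ALU64: sign extend 'imm', then treat as unsigned. */
  static uint64_t alu64_imm_operand(int32_t imm)
  {
      return (uint64_t)(int64_t)imm;
  }

  /* BPF_LSH with the shift masks 0x3F and 0x1F. */
  static uint64_t alu64_lsh(uint64_t dst, uint64_t src)
  {
      return dst << (src & 0x3f);
  }

  static uint64_t alu32_lsh(uint64_t dst, uint64_t src)
  {
      return (uint32_t)((uint32_t)dst << (src & 0x1f));
  }

Note how a 32-bit modulo by zero leaves the low 32 bits of the destination
intact but still clears the upper half, and how a 64-bit shift count of 64
is masked to 0.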

Byte swap instructions
~~~~~~~~~~~~~~~~~~~~~~

The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
'code' field of ``BPF_END``.

The byte swap instructions operate on the destination register
only and do not use a separate source register or immediate value.

The 1-bit source operand field in the opcode is used to select what byte
order the operation converts from or to:

=========  =====  =================================================
source     value  description
=========  =====  =================================================
BPF_TO_LE  0x00   convert between host byte order and little endian
BPF_TO_BE  0x08   convert between host byte order and big endian
=========  =====  =================================================

The 'imm' field encodes the width of the swap operations.  The following widths
are supported: 16, 32 and 64.

Examples:

``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16 means::

  dst = htole16(dst)

``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 64 means::

  dst = htobe64(dst)
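
A portable way to picture these conversions, without relying on the
non-standard ``htole16()`` family, is sketched below.  The names ``bswap()``,
``host_is_little_endian()`` and ``bpf_end()`` are hypothetical, and the sketch
assumes that bits of the destination above the selected width are cleared,
as the return types of the ``hto*()`` functions suggest.

.. code-block:: c

  #include <stdint.h>

  /* Reverse the order of the low width/8 bytes of v. */
  static uint64_t bswap(uint64_t v, int width)
  {
      uint64_t out = 0;

      for (int i = 0; i < width / 8; i++) {
          out = (out << 8) | (v & 0xff);
          v >>= 8;
      }
      return out;
  }

  /* Detect the host byte order at run time. */
  static int host_is_little_endian(void)
  {
      const uint16_t probe = 1;

      return *(const uint8_t *)&probe == 1;
  }

  /* to_be != 0 selects BPF_TO_BE, otherwise BPF_TO_LE; width is 16, 32 or 64. */
  static uint64_t bpf_end(uint64_t dst, int to_be, int width)
  {
      uint64_t mask = width == 64 ? UINT64_MAX : (UINT64_C(1) << width) - 1;

      dst &= mask;  /* assumption: bits above 'width' are cleared */
      if (host_is_little_endian() == (to_be != 0))
          dst = bswap(dst, width);  /* host and target orders differ */
      return dst;
  }

When the target byte order equals the host byte order, the operation in this
sketch only truncates the value to the requested width.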

Jump instructions
-----------------

``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for
otherwise identical operations.
The 'code' field encodes the operation as below:

========  =====  ===  ===========================================  =========================================
code      value  src  description                                  notes
========  =====  ===  ===========================================  =========================================
BPF_JA    0x00   0x0  PC += offset                                 BPF_JMP only
BPF_JEQ   0x10   any  PC += offset if dst == src
BPF_JGT   0x20   any  PC += offset if dst > src                    unsigned
BPF_JGE   0x30   any  PC += offset if dst >= src                   unsigned
BPF_JSET  0x40   any  PC += offset if dst & src
BPF_JNE   0x50   any  PC += offset if dst != src
BPF_JSGT  0x60   any  PC += offset if dst > src                    signed
BPF_JSGE  0x70   any  PC += offset if dst >= src                   signed
BPF_CALL  0x80   0x0  call helper function by address              see `Helper functions`_
BPF_CALL  0x80   0x1  call PC += offset                            see `Program-local functions`_
BPF_CALL  0x80   0x2  call helper function by BTF ID               see `Helper functions`_
BPF_EXIT  0x90   0x0  return                                       BPF_JMP only
BPF_JLT   0xa0   any  PC += offset if dst < src                    unsigned
BPF_JLE   0xb0   any  PC += offset if dst <= src                   unsigned
BPF_JSLT  0xc0   any  PC += offset if dst < src                    signed
BPF_JSLE  0xd0   any  PC += offset if dst <= src                   signed
========  =====  ===  ===========================================  =========================================

The eBPF program needs to store the return value into register R0 before doing a
``BPF_EXIT``.

Example:

``BPF_JSGE | BPF_X | BPF_JMP32`` (0x7e) means::

  if (s32)dst s>= (s32)src goto +offset

where 's>=' indicates a signed '>=' comparison.
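
The difference between the signed and unsigned comparisons, and between the
32-bit and 64-bit operand widths, can be sketched in C as follows.  The
function names are hypothetical; each one only evaluates the branch
condition and leaves the update of the program counter to the caller.

.. code-block:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* BPF_JSGE with BPF_JMP32: compare the low 32 bits as signed values. */
  static bool jsge32(uint64_t dst, uint64_t src)
  {
      return (int32_t)(uint32_t)dst >= (int32_t)(uint32_t)src;
  }

  /* BPF_JSGE with BPF_JMP: compare all 64 bits as signed values. */
  static bool jsge64(uint64_t dst, uint64_t src)
  {
      return (int64_t)dst >= (int64_t)src;
  }

  /* BPF_JGE with BPF_JMP32: compare the low 32 bits as unsigned values. */
  static bool jge32(uint64_t dst, uint64_t src)
  {
      return (uint32_t)dst >= (uint32_t)src;
  }

With dst = 0xffffffff and src = 0, the 32-bit signed comparison sees -1 and
is false, while the 32-bit unsigned and the 64-bit signed comparisons are
both true.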

Helper functions
~~~~~~~~~~~~~~~~

Helper functions are a concept whereby BPF programs can call into a
set of functions exposed by the underlying platform.

Historically, each helper function was identified by an address
encoded in the imm field.  The available helper functions may differ
for each program type, but address values are unique across all program types.

Platforms that support the BPF Type Format (BTF) support identifying
a helper function by a BTF ID encoded in the imm field, where the BTF ID
identifies the helper name and type.

Program-local functions
~~~~~~~~~~~~~~~~~~~~~~~

Program-local functions are functions exposed by the same BPF program as the
caller, and are referenced by offset from the call instruction, similar to
``BPF_JA``.  A ``BPF_EXIT`` within the program-local function will return to
the caller.

Load and store instructions
===========================

For load and store instructions (``BPF_LD``, ``BPF_LDX``, ``BPF_ST``, and ``BPF_STX``), the
8-bit 'opcode' field is divided as:

============  ======  =================
3 bits (MSB)  2 bits  3 bits (LSB)
============  ======  =================
mode          size    instruction class
============  ======  =================

The mode modifier is one of:

  =============  =====  ====================================  =============
  mode modifier  value  description                           reference
  =============  =====  ====================================  =============
  BPF_IMM        0x00   64-bit immediate instructions         `64-bit immediate instructions`_
  BPF_ABS        0x20   legacy BPF packet access (absolute)   `Legacy BPF Packet access instructions`_
  BPF_IND        0x40   legacy BPF packet access (indirect)   `Legacy BPF Packet access instructions`_
  BPF_MEM        0x60   regular load and store operations     `Regular load and store operations`_
  BPF_ATOMIC     0xc0   atomic operations                     `Atomic operations`_
  =============  =====  ====================================  =============

The size modifier is one of:

  =============  =====  =====================
  size modifier  value  description
  =============  =====  =====================
  BPF_W          0x00   word        (4 bytes)
  BPF_H          0x08   half word   (2 bytes)
  BPF_B          0x10   byte
  BPF_DW         0x18   double word (8 bytes)
  =============  =====  =====================
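
As a sketch of how a decoder might read these two modifiers, the helpers
below mask out the mode and translate the size modifier into a byte count.
The names ``ls_mode()`` and ``ls_size_bytes()`` are hypothetical.

.. code-block:: c

  #include <stdint.h>

  /* 3 MSB: mode modifier, e.g. 0x60 for BPF_MEM. */
  static uint8_t ls_mode(uint8_t opcode)
  {
      return opcode & 0xe0;
  }

  /* 2 bits: size modifier, translated to the access width in bytes. */
  static int ls_size_bytes(uint8_t opcode)
  {
      switch (opcode & 0x18) {
      case 0x00: return 4;  /* BPF_W  */
      case 0x08: return 2;  /* BPF_H  */
      case 0x10: return 1;  /* BPF_B  */
      default:   return 8;  /* BPF_DW */
      }
  }

For example, opcode 0x61 (``BPF_MEM | BPF_W | BPF_LDX``) has a mode of 0x60
and a size of 4 bytes.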

Regular load and store operations
---------------------------------

The ``BPF_MEM`` mode modifier is used to encode regular load and store
instructions that transfer data between a register and memory.

``BPF_MEM | <size> | BPF_STX`` means::

  *(size *) (dst + offset) = src

``BPF_MEM | <size> | BPF_ST`` means::

  *(size *) (dst + offset) = imm32

``BPF_MEM | <size> | BPF_LDX`` means::

  dst = *(size *) (src + offset)

Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
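
The register-based forms can be sketched in C against a flat byte array that
stands in for memory.  Everything here is illustrative: ``ea()``,
``mem_store()`` and ``mem_load()`` are hypothetical names, register values are
used as indexes into the array, and loads narrower than 64 bits are assumed
to be zero-extended into the destination register.

.. code-block:: c

  #include <stdint.h>
  #include <string.h>

  /* Resolve 'register + offset' to a location inside the byte array. */
  static uint8_t *ea(uint8_t *mem, uint64_t base, int16_t offset)
  {
      return mem + ((int64_t)base + offset);
  }

  /* BPF_MEM | <size> | BPF_STX: store the low 'size' bytes of src. */
  static void mem_store(uint8_t *mem, uint64_t dst, int16_t offset,
                        int size, uint64_t src)
  {
      uint8_t b = (uint8_t)src;
      uint16_t h = (uint16_t)src;
      uint32_t w = (uint32_t)src;

      switch (size) {
      case 1:  memcpy(ea(mem, dst, offset), &b, 1); break;
      case 2:  memcpy(ea(mem, dst, offset), &h, 2); break;
      case 4:  memcpy(ea(mem, dst, offset), &w, 4); break;
      default: memcpy(ea(mem, dst, offset), &src, 8); break;
      }
  }

  /* BPF_MEM | <size> | BPF_LDX: load 'size' bytes (assumed zero-extended). */
  static uint64_t mem_load(uint8_t *mem, uint64_t src, int16_t offset, int size)
  {
      uint8_t b;
      uint16_t h;
      uint32_t w;
      uint64_t dw;

      switch (size) {
      case 1:  memcpy(&b, ea(mem, src, offset), 1); return b;
      case 2:  memcpy(&h, ea(mem, src, offset), 2); return h;
      case 4:  memcpy(&w, ea(mem, src, offset), 4); return w;
      default: memcpy(&dw, ea(mem, src, offset), 8); return dw;
      }
  }

Using ``memcpy()`` instead of a pointer cast keeps the sketch free of
alignment and aliasing problems in portable C.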

Atomic operations
-----------------

Atomic operations are operations that operate on memory and cannot be
interrupted or corrupted by other access to the same memory region
by other eBPF programs or means outside of this specification.

All atomic operations supported by eBPF are encoded as store operations
that use the ``BPF_ATOMIC`` mode modifier as follows:

* ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations
* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
* 8-bit and 16-bit wide atomic operations are not supported.

The 'imm' field is used to encode the actual atomic operation.
Simple atomic operations use a subset of the values defined to encode
arithmetic operations in the 'imm' field to encode the atomic operation:

========  =====  ===========
imm       value  description
========  =====  ===========
BPF_ADD   0x00   atomic add
BPF_OR    0x40   atomic or
BPF_AND   0x50   atomic and
BPF_XOR   0xa0   atomic xor
========  =====  ===========


``BPF_ATOMIC | BPF_W  | BPF_STX`` with 'imm' = BPF_ADD means::

  *(u32 *)(dst + offset) += src

``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF_ADD means::

  *(u64 *)(dst + offset) += src

In addition to the simple atomic operations, there is also a modifier and
two complex atomic operations:

===========  ================  ===========================
imm          value             description
===========  ================  ===========================
BPF_FETCH    0x01              modifier: return old value
BPF_XCHG     0xe0 | BPF_FETCH  atomic exchange
BPF_CMPXCHG  0xf0 | BPF_FETCH  atomic compare and exchange
===========  ================  ===========================

The ``BPF_FETCH`` modifier is optional for simple atomic operations, and
always set for the complex atomic operations.  If the ``BPF_FETCH`` flag
is set, then the operation also overwrites ``src`` with the value that
was in memory before it was modified.

The ``BPF_XCHG`` operation atomically exchanges ``src`` with the value
addressed by ``dst + offset``.

The ``BPF_CMPXCHG`` operation atomically compares the value addressed by
``dst + offset`` with ``R0``. If they match, the value addressed by
``dst + offset`` is replaced with ``src``. In either case, the
value that was at ``dst + offset`` before the operation is zero-extended
and loaded back to ``R0``.
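
The 64-bit forms of these operations map fairly directly onto C11
``<stdatomic.h>``.  The sketch below is one possible illustration, not a
statement of how any particular runtime implements them; the function names
are hypothetical, and the default sequentially consistent memory ordering of
the C11 functions is an assumption, since memory ordering is not discussed
above.

.. code-block:: c

  #include <stdatomic.h>
  #include <stdint.h>

  /* imm = BPF_ADD | BPF_FETCH: add src; the return value is the new src. */
  static uint64_t atomic_fetch_add64(_Atomic uint64_t *addr, uint64_t src)
  {
      return atomic_fetch_add(addr, src);
  }

  /* imm = BPF_XCHG: store src; the return value is the new src. */
  static uint64_t atomic_xchg64(_Atomic uint64_t *addr, uint64_t src)
  {
      return atomic_exchange(addr, src);
  }

  /* imm = BPF_CMPXCHG: r0 is the expected value; returns the new R0. */
  static uint64_t atomic_cmpxchg64(_Atomic uint64_t *addr, uint64_t r0,
                                   uint64_t src)
  {
      uint64_t expected = r0;

      /* On a mismatch, 'expected' is updated to the value found in memory. */
      atomic_compare_exchange_strong(addr, &expected, src);
      return expected;
  }

In every case the returned value is what was in memory before the operation,
which is the value that ``BPF_FETCH`` writes to ``src`` and that
``BPF_CMPXCHG`` writes to ``R0``.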

64-bit immediate instructions
-----------------------------

Instructions with the ``BPF_IMM`` 'mode' modifier use the wide instruction
encoding defined in `Instruction encoding`_, and use the 'src_reg' field of the
basic instruction to hold an opcode subtype.

The following table defines a set of ``BPF_IMM | BPF_DW | BPF_LD`` instructions
with opcode subtypes in the 'src_reg' field (column 'src'), using new terms
such as "map" defined further below:

=========================  ======  ===  =========================================  ===========  ==============
opcode construction        opcode  src  pseudocode                                 imm type     dst type
=========================  ======  ===  =========================================  ===========  ==============
BPF_IMM | BPF_DW | BPF_LD  0x18    0x0  dst = imm64                                integer      integer
BPF_IMM | BPF_DW | BPF_LD  0x18    0x1  dst = map_by_fd(imm)                       map fd       map
BPF_IMM | BPF_DW | BPF_LD  0x18    0x2  dst = map_val(map_by_fd(imm)) + next_imm   map fd       data pointer
BPF_IMM | BPF_DW | BPF_LD  0x18    0x3  dst = var_addr(imm)                        variable id  data pointer
BPF_IMM | BPF_DW | BPF_LD  0x18    0x4  dst = code_addr(imm)                       integer      code pointer
BPF_IMM | BPF_DW | BPF_LD  0x18    0x5  dst = map_by_idx(imm)                      map index    map
BPF_IMM | BPF_DW | BPF_LD  0x18    0x6  dst = map_val(map_by_idx(imm)) + next_imm  map index    data pointer
=========================  ======  ===  =========================================  ===========  ==============

where

* map_by_fd(imm) means to convert a 32-bit file descriptor into an address of a map (see `Maps`_)
* map_by_idx(imm) means to convert a 32-bit index into an address of a map
* map_val(map) gets the address of the first value in a given map
* var_addr(imm) gets the address of a platform variable (see `Platform Variables`_) with a given id
* code_addr(imm) gets the address of the instruction at a specified relative offset in number of (64-bit) instructions
* the 'imm type' can be used by disassemblers for display
* the 'dst type' can be used for verification and JIT compilation purposes

Maps
~~~~

Maps are shared memory regions accessible by eBPF programs on some platforms.
A map can have various semantics as defined in a separate document, and may or
may not have a single contiguous memory region, but the 'map_val(map)' is
currently only defined for maps that do have a single contiguous memory region.

Each map can have a file descriptor (fd) if supported by the platform, where
'map_by_fd(imm)' means to get the map with the specified file descriptor. Each
BPF program can also be defined to use a set of maps associated with the
program at load time, and 'map_by_idx(imm)' means to get the map with the given
index in the set associated with the BPF program containing the instruction.

Platform Variables
~~~~~~~~~~~~~~~~~~

Platform variables are memory regions, identified by integer ids, exposed by
the runtime and accessible by BPF programs on some platforms.  The
'var_addr(imm)' operation means to get the address of the memory region
identified by the given id.

Legacy BPF Packet access instructions
-------------------------------------

eBPF previously introduced special instructions for access to packet data that were
carried over from classic BPF. However, these instructions are
deprecated and should no longer be used.