.. contents::
.. sectnum::

========================================
eBPF Instruction Set Specification, v1.0
========================================

This document specifies version 1.0 of the eBPF instruction set.

Documentation conventions
=========================

For brevity, this document uses the type notation "u64", "u32", etc.
to mean an unsigned integer whose width is the specified number of bits,
and "s32", etc. to mean a signed integer of the specified number of bits.

Registers and calling convention
================================

eBPF has 10 general purpose registers and a read-only frame pointer register,
all of which are 64 bits wide.

The eBPF calling convention is defined as:

* R0: return value from function calls, and exit value for eBPF programs
* R1 - R5: arguments for function calls
* R6 - R9: callee saved registers that function calls will preserve
* R10: read-only frame pointer to access stack

R0 - R5 are scratch registers and eBPF programs need to spill/fill them if
necessary across calls.

Instruction encoding
====================

eBPF has two instruction encodings:

* the basic instruction encoding, which uses 64 bits to encode an instruction
* the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
  constant) value after the basic instruction for a total of 128 bits.

The fields of an encoded basic instruction are stored in the
following order::

  opcode:8 src_reg:4 dst_reg:4 offset:16 imm:32 // In little-endian BPF.
  opcode:8 dst_reg:4 src_reg:4 offset:16 imm:32 // In big-endian BPF.


**imm**
  signed integer immediate value

**offset**
  signed integer offset used with pointer arithmetic

**src_reg**
  the source register number (0-10), except where otherwise specified
  (`64-bit immediate instructions`_ reuse this field for other purposes)

**dst_reg**
  destination register number (0-10)

**opcode**
  operation to perform

Note that the contents of multi-byte fields ('imm' and 'offset') are
stored using big-endian byte ordering in big-endian BPF and
little-endian byte ordering in little-endian BPF.

For example::

  opcode                  offset imm          assembly
         src_reg dst_reg
  07     0       1        00 00  44 33 22 11  r1 += 0x11223344 // little
         dst_reg src_reg
  07     1       0        00 00  11 22 33 44  r1 += 0x11223344 // big

Note that most instructions do not use all of the fields.
Unused fields shall be cleared to zero.

As discussed below in `64-bit immediate instructions`_, a 64-bit immediate
instruction uses a 64-bit immediate value that is constructed as follows.
The 64 bits following the basic instruction contain a pseudo instruction
using the same format but with opcode, dst_reg, src_reg, and offset all set to zero,
and imm containing the high 32 bits of the immediate value.

This is depicted in the following figure::

        basic_instruction
  .-----------------------------.
  |                             |
  code:8 regs:8 offset:16 imm:32 unused:32 imm:32
                                 |              |
                                 '--------------'
                                pseudo instruction

Thus the 64-bit immediate value is constructed as follows:

  imm64 = (next_imm << 32) | imm

where 'next_imm' refers to the imm value of the pseudo instruction
following the basic instruction.  The unused bytes in the pseudo
instruction are reserved and shall be cleared to zero.

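
As a non-normative illustration, the following C sketch decodes the
little-endian example bytes shown above and builds a 64-bit immediate from
the two 'imm' fields. The names ``bpf_insn_fields``, ``decode_le`` and
``make_imm64`` are hypothetical and not part of this specification, and the
placement of dst_reg in the low nibble of the register byte is inferred from
the example bytes rather than stated normatively.

.. code-block:: c

   #include <stdint.h>

   /* Decoded fields of one 64-bit basic instruction (illustrative names). */
   struct bpf_insn_fields {
       uint8_t opcode;
       uint8_t dst_reg;
       uint8_t src_reg;
       int16_t offset;
       int32_t imm;
   };

   /* Decode 8 bytes stored in the little-endian BPF layout. */
   static struct bpf_insn_fields decode_le(const uint8_t b[8])
   {
       struct bpf_insn_fields f;

       f.opcode  = b[0];
       f.dst_reg = b[1] & 0x0f;   /* assumption: low nibble holds dst_reg */
       f.src_reg = b[1] >> 4;     /* assumption: high nibble holds src_reg */
       /* Multi-byte fields are little-endian: least significant byte first. */
       f.offset  = (int16_t)(uint16_t)(b[2] | (b[3] << 8));
       f.imm     = (int32_t)((uint32_t)b[4] | (uint32_t)b[5] << 8 |
                             (uint32_t)b[6] << 16 | (uint32_t)b[7] << 24);
       return f;
   }

   /* imm64 = (next_imm << 32) | imm, without sign-extending the low half. */
   static uint64_t make_imm64(int32_t imm, int32_t next_imm)
   {
       return ((uint64_t)(uint32_t)next_imm << 32) | (uint32_t)imm;
   }

Casting 'imm' to an unsigned 32-bit value before the OR matters: a negative
'imm' would otherwise be sign extended and overwrite the high half supplied
by 'next_imm'.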

Instruction classes
-------------------

The three least significant bits of the 'opcode' field store the instruction class:

=========  =====  ===============================  ===================================
class      value  description                      reference
=========  =====  ===============================  ===================================
BPF_LD     0x00   non-standard load operations     `Load and store instructions`_
BPF_LDX    0x01   load into register operations    `Load and store instructions`_
BPF_ST     0x02   store from immediate operations  `Load and store instructions`_
BPF_STX    0x03   store from register operations   `Load and store instructions`_
BPF_ALU    0x04   32-bit arithmetic operations     `Arithmetic and jump instructions`_
BPF_JMP    0x05   64-bit jump operations           `Arithmetic and jump instructions`_
BPF_JMP32  0x06   32-bit jump operations           `Arithmetic and jump instructions`_
BPF_ALU64  0x07   64-bit arithmetic operations     `Arithmetic and jump instructions`_
=========  =====  ===============================  ===================================

Arithmetic and jump instructions
================================

For arithmetic and jump instructions (``BPF_ALU``, ``BPF_ALU64``, ``BPF_JMP`` and
``BPF_JMP32``), the 8-bit 'opcode' field is divided into three parts:

==============  ======  =================
4 bits (MSB)    1 bit   3 bits (LSB)
==============  ======  =================
code            source  instruction class
==============  ======  =================

**code**
  the operation code, whose meaning varies by instruction class

**source**
  the source operand location, which unless otherwise specified is one of:

  ======  =====  ==============================================
  source  value  description
  ======  =====  ==============================================
  BPF_K   0x00   use 32-bit 'imm' value as source operand
  BPF_X   0x08   use 'src_reg' register value as source operand
  ======  =====  ==============================================

**instruction class**
  the instruction class (see `Instruction classes`_)

Arithmetic instructions
-----------------------

``BPF_ALU`` uses 32-bit wide operands while ``BPF_ALU64`` uses 64-bit wide operands for
otherwise identical operations.
The 'code' field encodes the operation as below, where 'src' and 'dst' refer
to the values of the source and destination registers, respectively.

=========  =====  =======  ==========================================================
code       value  offset   description
=========  =====  =======  ==========================================================
BPF_ADD    0x00   0        dst += src
BPF_SUB    0x10   0        dst -= src
BPF_MUL    0x20   0        dst \*= src
BPF_DIV    0x30   0        dst = (src != 0) ? (dst / src) : 0
BPF_SDIV   0x30   1        dst = (src != 0) ? (dst s/ src) : 0
BPF_OR     0x40   0        dst \|= src
BPF_AND    0x50   0        dst &= src
BPF_LSH    0x60   0        dst <<= (src & mask)
BPF_RSH    0x70   0        dst >>= (src & mask)
BPF_NEG    0x80   0        dst = -dst
BPF_MOD    0x90   0        dst = (src != 0) ? (dst % src) : dst
BPF_SMOD   0x90   1        dst = (src != 0) ? (dst s% src) : dst
BPF_XOR    0xa0   0        dst ^= src
BPF_MOV    0xb0   0        dst = src
BPF_MOVSX  0xb0   8/16/32  dst = (s8,s16,s32)src
BPF_ARSH   0xc0   0        sign extending dst >>= (src & mask)
BPF_END    0xd0   0        byte swap operations (see `Byte swap instructions`_ below)
=========  =====  =======  ==========================================================

Underflow and overflow are allowed during arithmetic operations, meaning
the 64-bit or 32-bit value will wrap. If eBPF program execution would
result in division by zero, the destination register is instead set to zero.
If execution would result in modulo by zero, for ``BPF_ALU64`` the value of
the destination register is unchanged whereas for ``BPF_ALU`` the upper
32 bits of the destination register are zeroed.

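
The division and modulo rules above can be sketched in C as follows. This is
a non-normative illustration that covers only the unsigned ``BPF_X`` forms;
the function names are hypothetical and each function returns the new value
of the destination register.

.. code-block:: c

   #include <stdint.h>

   /* BPF_DIV | BPF_X | BPF_ALU64: division by zero yields zero. */
   static uint64_t alu64_div(uint64_t dst, uint64_t src)
   {
       return src != 0 ? dst / src : 0;
   }

   /* BPF_MOD | BPF_X | BPF_ALU64: modulo by zero leaves dst unchanged. */
   static uint64_t alu64_mod(uint64_t dst, uint64_t src)
   {
       return src != 0 ? dst % src : dst;
   }

   /* BPF_DIV | BPF_X | BPF_ALU: 32-bit operands, result zero-extended. */
   static uint64_t alu32_div(uint64_t dst, uint64_t src)
   {
       uint32_t d = (uint32_t)dst, s = (uint32_t)src;

       return s != 0 ? d / s : 0;
   }

   /* BPF_MOD | BPF_X | BPF_ALU: on modulo by zero the low 32 bits of dst
    * are kept and the upper 32 bits are zeroed. */
   static uint64_t alu32_mod(uint64_t dst, uint64_t src)
   {
       uint32_t d = (uint32_t)dst, s = (uint32_t)src;

       return s != 0 ? d % s : d;
   }

Note that the 32-bit variants only look at the low 32 bits of the source, so
a source register holding 0x100000000 counts as zero for ``BPF_ALU``.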

``BPF_ADD | BPF_X | BPF_ALU`` means::

  dst = (u32) ((u32) dst + (u32) src)

where '(u32)' indicates that the upper 32 bits are zeroed.

``BPF_ADD | BPF_X | BPF_ALU64`` means::

  dst = dst + src

``BPF_XOR | BPF_K | BPF_ALU`` means::

  dst = (u32) dst ^ (u32) imm32

``BPF_XOR | BPF_K | BPF_ALU64`` means::

  dst = dst ^ imm32

Note that most instructions have an instruction offset of 0. Only three instructions
(``BPF_SDIV``, ``BPF_SMOD``, ``BPF_MOVSX``) have a non-zero offset.

The division and modulo operations support both unsigned and signed flavors.

For unsigned operations (``BPF_DIV`` and ``BPF_MOD``), for ``BPF_ALU``,
'imm' is interpreted as a 32-bit unsigned value. For ``BPF_ALU64``,
'imm' is first sign extended from 32 to 64 bits, and then interpreted as
a 64-bit unsigned value.

For signed operations (``BPF_SDIV`` and ``BPF_SMOD``), for ``BPF_ALU``,
'imm' is interpreted as a 32-bit signed value. For ``BPF_ALU64``, 'imm'
is first sign extended from 32 to 64 bits, and then interpreted as a
64-bit signed value.

The ``BPF_MOVSX`` instruction does a move operation with sign extension.
``BPF_ALU | BPF_MOVSX`` sign extends 8-bit and 16-bit operands into 32-bit
operands, and zeroes the remaining upper 32 bits.
``BPF_ALU64 | BPF_MOVSX`` sign extends 8-bit, 16-bit, and 32-bit
operands into 64-bit operands.

Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31)
for 32-bit operations.

Byte swap instructions
----------------------

The byte swap instructions use instruction classes of ``BPF_ALU`` and ``BPF_ALU64``
and a 4-bit 'code' field of ``BPF_END``.

The byte swap instructions operate on the destination register
only and do not use a separate source register or immediate value.


For ``BPF_ALU``, the 1-bit source operand field in the opcode is used to
select what byte order the operation converts from or to. For
``BPF_ALU64``, the 1-bit source operand field in the opcode is reserved
and must be set to 0.

=========  =========  =====  =================================================
class      source     value  description
=========  =========  =====  =================================================
BPF_ALU    BPF_TO_LE  0x00   convert between host byte order and little endian
BPF_ALU    BPF_TO_BE  0x08   convert between host byte order and big endian
BPF_ALU64  Reserved   0x00   do byte swap unconditionally
=========  =========  =====  =================================================

The 'imm' field encodes the width of the swap operations.  The following widths
are supported: 16, 32 and 64.

Examples:

``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16 means::

  dst = htole16(dst)

``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 64 means::

  dst = htobe64(dst)

``BPF_ALU64 | BPF_TO_LE | BPF_END`` with imm = 16/32/64 means::

  dst = bswap16 dst
  dst = bswap32 dst
  dst = bswap64 dst

Jump instructions
-----------------

``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for
otherwise identical operations.
The 'code' field encodes the operation as below:

========  =====  ===  ===========================================  =========================================
code      value  src  description                                  notes
========  =====  ===  ===========================================  =========================================
BPF_JA    0x0    0x0  PC += offset                                 BPF_JMP class
BPF_JA    0x0    0x0  PC += imm                                    BPF_JMP32 class
BPF_JEQ   0x1    any  PC += offset if dst == src
BPF_JGT   0x2    any  PC += offset if dst > src                    unsigned
BPF_JGE   0x3    any  PC += offset if dst >= src                   unsigned
BPF_JSET  0x4    any  PC += offset if dst & src
BPF_JNE   0x5    any  PC += offset if dst != src
BPF_JSGT  0x6    any  PC += offset if dst > src                    signed
BPF_JSGE  0x7    any  PC += offset if dst >= src                   signed
BPF_CALL  0x8    0x0  call helper function by address              see `Helper functions`_
BPF_CALL  0x8    0x1  call PC += offset                            see `Program-local functions`_
BPF_CALL  0x8    0x2  call helper function by BTF ID               see `Helper functions`_
BPF_EXIT  0x9    0x0  return                                       BPF_JMP only
BPF_JLT   0xa    any  PC += offset if dst < src                    unsigned
BPF_JLE   0xb    any  PC += offset if dst <= src                   unsigned
BPF_JSLT  0xc    any  PC += offset if dst < src                    signed
BPF_JSLE  0xd    any  PC += offset if dst <= src                   signed
========  =====  ===  ===========================================  =========================================

The eBPF program needs to store the return value into register R0 before doing a
``BPF_EXIT``.

Example:

``BPF_JSGE | BPF_X | BPF_JMP32`` (0x7e) means::

  if (s32)dst s>= (s32)src goto +offset

where 's>=' indicates a signed '>=' comparison.

``BPF_JA | BPF_K | BPF_JMP32`` (0x06) means::

  gotol +imm

where 'imm' means the branch offset comes from the 'imm' field of the instruction.

Note that there are two flavors of ``BPF_JA`` instructions. The
``BPF_JMP`` class permits a 16-bit jump offset specified by the 'offset'
field, whereas the ``BPF_JMP32`` class permits a 32-bit jump offset
specified by the 'imm' field. A conditional jump whose offset does not fit
in 16 bits may be converted to a conditional jump whose offset does fit,
plus a 32-bit unconditional jump.

Helper functions
~~~~~~~~~~~~~~~~

Helper functions are a concept whereby BPF programs can call into a
set of function calls exposed by the underlying platform.

Historically, each helper function was identified by an address
encoded in the imm field.  The available helper functions may differ
for each program type, but address values are unique across all program types.

Platforms that support the BPF Type Format (BTF) support identifying
a helper function by a BTF ID encoded in the imm field, where the BTF ID
identifies the helper name and type.

Program-local functions
~~~~~~~~~~~~~~~~~~~~~~~

Program-local functions are functions exposed by the same BPF program as the
caller, and are referenced by offset from the call instruction, similar to
``BPF_JA``.  A ``BPF_EXIT`` within the program-local function will return to
the caller.

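
As a non-normative illustration of how the opcode values quoted in the jump
examples earlier in this section (0x7e and 0x06) follow from the tables, the
C sketch below composes and splits an 8-bit opcode. The helper names are
hypothetical. The sketch assumes that the 4-bit jump 'code' values listed in
the jump table occupy the upper four bits of the opcode, which is what the
quoted opcode values imply.

.. code-block:: c

   #include <stdint.h>

   enum {
       BPF_JMP   = 0x05,  /* instruction classes (3 LSB bits) */
       BPF_JMP32 = 0x06,
       BPF_K     = 0x00,  /* source bit */
       BPF_X     = 0x08,
       BPF_JA    = 0x0,   /* 4-bit jump codes, as listed in the jump table */
       BPF_JSGE  = 0x7,
   };

   /* Build an opcode from a 4-bit code, the source bit and the class. */
   static uint8_t make_opcode(uint8_t code, uint8_t source, uint8_t cls)
   {
       return (uint8_t)((code << 4) | source | cls);
   }

   /* Split an opcode back into its three parts. */
   static uint8_t opcode_code(uint8_t opcode)   { return opcode >> 4; }
   static uint8_t opcode_source(uint8_t opcode) { return opcode & 0x08; }
   static uint8_t opcode_class(uint8_t opcode)  { return opcode & 0x07; }

The arithmetic table lists its 'code' values already shifted into the upper
four bits (0x10, 0x20, ...), so those values would be OR-ed in directly
rather than passed through the shift in ``make_opcode``.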

Load and store instructions
===========================

For load and store instructions (``BPF_LD``, ``BPF_LDX``, ``BPF_ST``, and ``BPF_STX``), the
8-bit 'opcode' field is divided as:

============  ======  =================
3 bits (MSB)  2 bits  3 bits (LSB)
============  ======  =================
mode          size    instruction class
============  ======  =================

The mode modifier is one of:

  =============  =====  ====================================  =============
  mode modifier  value  description                           reference
  =============  =====  ====================================  =============
  BPF_IMM        0x00   64-bit immediate instructions         `64-bit immediate instructions`_
  BPF_ABS        0x20   legacy BPF packet access (absolute)   `Legacy BPF Packet access instructions`_
  BPF_IND        0x40   legacy BPF packet access (indirect)   `Legacy BPF Packet access instructions`_
  BPF_MEM        0x60   regular load and store operations     `Regular load and store operations`_
  BPF_MEMSX      0x80   sign-extension load operations        `Sign-extension load operations`_
  BPF_ATOMIC     0xc0   atomic operations                     `Atomic operations`_
  =============  =====  ====================================  =============

The size modifier is one of:

  =============  =====  =====================
  size modifier  value  description
  =============  =====  =====================
  BPF_W          0x00   word        (4 bytes)
  BPF_H          0x08   half word   (2 bytes)
  BPF_B          0x10   byte
  BPF_DW         0x18   double word (8 bytes)
  =============  =====  =====================

Regular load and store operations
---------------------------------

The ``BPF_MEM`` mode modifier is used to encode regular load and store
instructions that transfer data between a register and memory.


``BPF_MEM | <size> | BPF_STX`` means::

  *(size *) (dst + offset) = src

``BPF_MEM | <size> | BPF_ST`` means::

  *(size *) (dst + offset) = imm32

``BPF_MEM | <size> | BPF_LDX`` means::

  dst = *(unsigned size *) (src + offset)

Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW`` and
'unsigned size' is one of u8, u16, u32 or u64.

Sign-extension load operations
------------------------------

The ``BPF_MEMSX`` mode modifier is used to encode sign-extension load
instructions that transfer data between a register and memory.

``BPF_MEMSX | <size> | BPF_LDX`` means::

  dst = *(signed size *) (src + offset)

Where size is one of: ``BPF_B``, ``BPF_H`` or ``BPF_W``, and
'signed size' is one of s8, s16 or s32.

Atomic operations
-----------------

Atomic operations are operations that operate on memory and cannot be
interrupted or corrupted by other access to the same memory region
by other eBPF programs or means outside of this specification.

All atomic operations supported by eBPF are encoded as store operations
that use the ``BPF_ATOMIC`` mode modifier as follows:

* ``BPF_ATOMIC | BPF_W  | BPF_STX`` for 32-bit operations
* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
* 8-bit and 16-bit wide atomic operations are not supported.

The 'imm' field is used to encode the actual atomic operation.
Simple atomic operations use a subset of the values defined to encode
arithmetic operations in the 'imm' field to encode the atomic operation:

========  =====  ===========
imm       value  description
========  =====  ===========
BPF_ADD   0x00   atomic add
BPF_OR    0x40   atomic or
BPF_AND   0x50   atomic and
BPF_XOR   0xa0   atomic xor
========  =====  ===========


``BPF_ATOMIC | BPF_W  | BPF_STX`` with 'imm' = BPF_ADD means::

  *(u32 *)(dst + offset) += src

``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF_ADD means::

  *(u64 *)(dst + offset) += src

In addition to the simple atomic operations, there is also a modifier and
two complex atomic operations:

===========  ================  ===========================
imm          value             description
===========  ================  ===========================
BPF_FETCH    0x01              modifier: return old value
BPF_XCHG     0xe0 | BPF_FETCH  atomic exchange
BPF_CMPXCHG  0xf0 | BPF_FETCH  atomic compare and exchange
===========  ================  ===========================

The ``BPF_FETCH`` modifier is optional for simple atomic operations, and
always set for the complex atomic operations.  If the ``BPF_FETCH`` flag
is set, then the operation also overwrites ``src`` with the value that
was in memory before it was modified.

The ``BPF_XCHG`` operation atomically exchanges ``src`` with the value
addressed by ``dst + offset``.

The ``BPF_CMPXCHG`` operation atomically compares the value addressed by
``dst + offset`` with ``R0``. If they match, the value addressed by
``dst + offset`` is replaced with ``src``.  In either case, the
value that was at ``dst + offset`` before the operation is zero-extended
and loaded back to ``R0``.

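
The behavior of ``BPF_FETCH``, ``BPF_XCHG`` and ``BPF_CMPXCHG`` described
above can be modelled with C11 ``<stdatomic.h>``. The sketch below is
illustrative only: the function names are hypothetical, a 64-bit atomic
object stands in for the memory addressed by ``dst + offset``, and the
sequentially consistent ordering that C11 uses by default is an assumption
of the sketch, not a requirement stated in this document.

.. code-block:: c

   #include <stdatomic.h>
   #include <stdint.h>

   /* BPF_ATOMIC | BPF_DW | BPF_STX with imm = BPF_CMPXCHG.
    * Returns the new value of R0, i.e. the value that was in memory
    * before the operation, whether or not the exchange took place. */
   static uint64_t bpf_cmpxchg64(_Atomic uint64_t *mem, uint64_t r0,
                                 uint64_t src)
   {
       /* On a match *mem becomes src and r0 already equals the old value;
        * on a mismatch r0 is overwritten with the current value of *mem. */
       atomic_compare_exchange_strong(mem, &r0, src);
       return r0;
   }

   /* imm = BPF_XCHG: store src, return the old value (the new src). */
   static uint64_t bpf_xchg64(_Atomic uint64_t *mem, uint64_t src)
   {
       return atomic_exchange(mem, src);
   }

   /* imm = BPF_ADD | BPF_FETCH: add src, return the old value (new src). */
   static uint64_t bpf_fetch_add64(_Atomic uint64_t *mem, uint64_t src)
   {
       return atomic_fetch_add(mem, src);
   }

For the 32-bit (``BPF_W``) forms the same pattern would apply to a 32-bit
atomic object, with the returned value zero-extended into the 64-bit
register.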

64-bit immediate instructions
-----------------------------

Instructions with the ``BPF_IMM`` 'mode' modifier use the wide instruction
encoding defined in `Instruction encoding`_, and use the 'src_reg' field of the
basic instruction to hold an opcode subtype.

The following table defines a set of ``BPF_IMM | BPF_DW | BPF_LD`` instructions
with opcode subtypes in the 'src_reg' field (column 'src'), using new terms
such as "map" defined further below:

=========================  ======  ===  =========================================  ===========  ==============
opcode construction        opcode  src  pseudocode                                 imm type     dst type
=========================  ======  ===  =========================================  ===========  ==============
BPF_IMM | BPF_DW | BPF_LD  0x18    0x0  dst = imm64                                integer      integer
BPF_IMM | BPF_DW | BPF_LD  0x18    0x1  dst = map_by_fd(imm)                       map fd       map
BPF_IMM | BPF_DW | BPF_LD  0x18    0x2  dst = map_val(map_by_fd(imm)) + next_imm   map fd       data pointer
BPF_IMM | BPF_DW | BPF_LD  0x18    0x3  dst = var_addr(imm)                        variable id  data pointer
BPF_IMM | BPF_DW | BPF_LD  0x18    0x4  dst = code_addr(imm)                       integer      code pointer
BPF_IMM | BPF_DW | BPF_LD  0x18    0x5  dst = map_by_idx(imm)                      map index    map
BPF_IMM | BPF_DW | BPF_LD  0x18    0x6  dst = map_val(map_by_idx(imm)) + next_imm  map index    data pointer
=========================  ======  ===  =========================================  ===========  ==============

where

* map_by_fd(imm) means to convert a 32-bit file descriptor into an address of a map (see `Maps`_)
* map_by_idx(imm) means to convert a 32-bit index into an address of a map
* map_val(map) gets the address of the first value in a given map
* var_addr(imm) gets the address of a platform variable (see `Platform Variables`_) with a given id
* code_addr(imm) gets the address of the instruction at a specified relative offset in number of (64-bit) instructions
* the 'imm type' can be used by disassemblers for display
* the 'dst type' can be used for verification and JIT compilation purposes

Maps
~~~~

Maps are shared memory regions accessible by eBPF programs on some platforms.
A map can have various semantics as defined in a separate document, and may or
may not have a single contiguous memory region, but the 'map_val(map)' is
currently only defined for maps that do have a single contiguous memory region.

Each map can have a file descriptor (fd) if supported by the platform, where
'map_by_fd(imm)' means to get the map with the specified file descriptor. Each
BPF program can also be defined to use a set of maps associated with the
program at load time, and 'map_by_idx(imm)' means to get the map with the given
index in the set associated with the BPF program containing the instruction.

Platform Variables
~~~~~~~~~~~~~~~~~~

Platform variables are memory regions, identified by integer ids, exposed by
the runtime and accessible by BPF programs on some platforms.  The
'var_addr(imm)' operation means to get the address of the memory region
identified by the given id.

Legacy BPF Packet access instructions
-------------------------------------

eBPF previously introduced special instructions for access to packet data that were
carried over from classic BPF. However, these instructions are
deprecated and should no longer be used.