#
fc448497 |
| 28-Mar-2018 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: rename map_lookup_stack() to map_call_stack_common()
We will reuse most of map call code gen for other map calls. Rename the lookup gen function and use meta->func_id instead of hard-codin
nfp: bpf: rename map_lookup_stack() to map_call_stack_common()
We will reuse most of map call code gen for other map calls. Rename the lookup gen function and use meta->func_id instead of hard-coding lookup.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
show more ...
|
#
87b10ecd |
| 28-Mar-2018 |
Jiong Wang <jiong.wang@netronome.com> |
nfp: bpf: detect packet reads could be cached, enable the optimisation
This patch is the front end of this optimisation, it detects and marks those packet reads that could be cached. Then the optimi
nfp: bpf: detect packet reads could be cached, enable the optimisation
This patch is the front end of this optimisation, it detects and marks those packet reads that could be cached. Then the optimisation "backend" will be activated automatically.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
show more ...
|
#
91ff69e8 |
| 28-Mar-2018 |
Jiong Wang <jiong.wang@netronome.com> |
nfp: bpf: support unaligned read offset
This patch add the support for unaligned read offset, i.e. the read offset to the start of packet cache area is not aligned to REG_WIDTH. In this case, the re
nfp: bpf: support unaligned read offset
This patch add the support for unaligned read offset, i.e. the read offset to the start of packet cache area is not aligned to REG_WIDTH. In this case, the read area might across maximum three transfer-in registers.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
show more ...
|
#
be759237 |
| 28-Mar-2018 |
Jiong Wang <jiong.wang@netronome.com> |
nfp: bpf: read from packet data cache for PTR_TO_PACKET
This patch assumes there is a packet data cache, and would try to read packet data from the cache instead of from memory.
This patch only imp
nfp: bpf: read from packet data cache for PTR_TO_PACKET
This patch assumes there is a packet data cache, and would try to read packet data from the cache instead of from memory.
This patch only implements the optimisation "backend", it doesn't build the packet data cache, so this optimisation is not enabled.
This patch has only enabled aligned packet data read, i.e. when the read offset to the start of cache is REG_WIDTH aligned.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
show more ...
|
#
e8a4796e |
| 23-Mar-2018 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: fix check of program max insn count
NFP program allocation length is in bytes and NFP program length is in instructions, fix the comparison of the two.
Fixes: 9314c442d7dd ("nfp: bpf: mov
nfp: bpf: fix check of program max insn count
NFP program allocation length is in bytes and NFP program length is in instructions, fix the comparison of the two.
Fixes: 9314c442d7dd ("nfp: bpf: move translation prepare to offload.c") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
show more ...
|
Revision tags: v4.15 |
|
#
74801e50 |
| 16-Jan-2018 |
Quentin Monnet <quentin.monnet@netronome.com> |
nfp: bpf: reject program on instructions unknown to the JIT compiler
If an eBPF instruction is unknown to the driver JIT compiler, we can reject the program at verification time.
Signed-off-by: Que
nfp: bpf: reject program on instructions unknown to the JIT compiler
If an eBPF instruction is unknown to the driver JIT compiler, we can reject the program at verification time.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
3dd43c33 |
| 11-Jan-2018 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: add support for reading map memory
Map memory needs to use 40 bit addressing. Add handling of such accesses. Since 40 bit addresses are formed by using both 32 bit operands we need to pr
nfp: bpf: add support for reading map memory
Map memory needs to use 40 bit addressing. Add handling of such accesses. Since 40 bit addresses are formed by using both 32 bit operands we need to pre-calculate the actual address instead of adding in the offset inside the instruction, like we did in 32 bit mode.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
77a3d311 |
| 11-Jan-2018 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: add verification and codegen for map lookups
Verify our current constraints on the location of the key are met and generate the code for calling map lookup on the datapath.
New relocation
nfp: bpf: add verification and codegen for map lookups
Verify our current constraints on the location of the key are met and generate the code for calling map lookup on the datapath.
New relocation types have to be added - for helpers and return addresses.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
c087aa8b |
| 10-Jan-2018 |
Nic Viljoen <nick.viljoen@netronome.com> |
nfp: bpf: add signed jump insns
This patch adds signed jump instructions (jsgt, jsge, jslt, jsle) to the nfp jit. As well as adding the additional required raw assembler branch mask to nfp_asm.h
Si
nfp: bpf: add signed jump insns
This patch adds signed jump instructions (jsgt, jsge, jslt, jsle) to the nfp jit. As well as adding the additional required raw assembler branch mask to nfp_asm.h
Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
e84797fe |
| 10-Jan-2018 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: use a large constant in unresolved branches
To make absolute relocated branches (branches which will be completely rewritten with br_set_offset()) distinguishable in user space dumps from
nfp: bpf: use a large constant in unresolved branches
To make absolute relocated branches (branches which will be completely rewritten with br_set_offset()) distinguishable in user space dumps from normal jumps add a large offset to them.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
44a12ecc |
| 10-Jan-2018 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: don't depend on high order allocations for program image
The translator pre-allocates a buffer of maximal program size. Due to HW/FW limitations the program buffer can't currently be longe
nfp: bpf: don't depend on high order allocations for program image
The translator pre-allocates a buffer of maximal program size. Due to HW/FW limitations the program buffer can't currently be longer than 128Kb, so we used to kmalloc() it, and then map for DMA directly.
Now that the late branch resolution is copying the program image anyway, we can just kvmalloc() the buffer. While at it, after translation reallocate the buffer to save space.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
2314fe9e |
| 10-Jan-2018 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: relocate jump targets just before the load
Don't translate the program assuming it will be loaded at a given address. This will be required for sharing programs between ports of the same
nfp: bpf: relocate jump targets just before the load
Don't translate the program assuming it will be loaded at a given address. This will be required for sharing programs between ports of the same NIC, tail calls and subprograms. It will also make the jump targets easier to understand when dumping the program to user space.
Translate the program as if it was going to be loaded at address zero. When load happens add the load offset in and set addresses of special branches.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
488feeaf |
| 10-Jan-2018 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: add helpers for modifying branch addresses
In preparation for better handling of relocations move existing helper for setting branch offset to nfp_asm.c and add two more.
Signed-off-by: J
nfp: bpf: add helpers for modifying branch addresses
In preparation for better handling of relocations move existing helper for setting branch offset to nfp_asm.c and add two more.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
1549921d |
| 10-Jan-2018 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: move jump resolution to jit.c
Jump target resolution should be in jit.c not offload.c. No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jio
nfp: bpf: move jump resolution to jit.c
Jump target resolution should be in jit.c not offload.c. No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
8231f844 |
| 14-Dec-2017 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: optimize the adjust_head calls in trivial cases
If the program is simple and has only one adjust head call with constant parameters, we can check that the call will always succeed at trans
nfp: bpf: optimize the adjust_head calls in trivial cases
If the program is simple and has only one adjust head call with constant parameters, we can check that the call will always succeed at translation time. We need to track the location of the call and make sure parameters are always the same. We also have to check the parameters against datapath constraints and ETH_HLEN.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
0d49eaf4 |
| 14-Dec-2017 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: add basic support for adjust head call
Support bpf_xdp_adjust_head(). We need to check whether the packet offset after adjustment is within datapath's limits. We also check if the frame i
nfp: bpf: add basic support for adjust head call
Support bpf_xdp_adjust_head(). We need to check whether the packet offset after adjustment is within datapath's limits. We also check if the frame is at least ETH_HLEN long (similar to the kernel implementation).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
2cb230bd |
| 14-Dec-2017 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: prepare for call support
Add skeleton of verifier checks and translation handler for call instructions. Make sure jump target resolution will not treat them as jumps.
Signed-off-by: Jaku
nfp: bpf: prepare for call support
Add skeleton of verifier checks and translation handler for call instructions. Make sure jump target resolution will not treat them as jumps.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
6bc7103c |
| 30-Nov-2017 |
Jiong Wang <jiong.wang@netronome.com> |
nfp: bpf: detect load/store sequences lowered from memory copy
This patch add the optimization frontend, but adding a new eBPF IR scan pass "nfp_bpf_opt_ldst_gather".
The pass will traverse the IR
nfp: bpf: detect load/store sequences lowered from memory copy
This patch add the optimization frontend, but adding a new eBPF IR scan pass "nfp_bpf_opt_ldst_gather".
The pass will traverse the IR to recognize the load/store pairs sequences that come from lowering of memory copy builtins.
The gathered memory copy information will be kept in the meta info structure of the first load instruction in the sequence and will be consumed by the optimization backend added in the previous patches.
NOTE: a sequence with cross memory access doesn't qualify this optimization, i.e. if one load in the sequence will load from place that has been written by previous store. This is because when we turn the sequence into single CPP operation, we are reading all contents at once into NFP transfer registers, then write them out as a whole. This is not identical with what the original load/store sequence is doing.
Detecting cross memory access for two random pointers will be difficult, fortunately under XDP/eBPF's restrictied runtime environment, the copy normally happen among map, packet data and stack, they do not overlap with each other.
And for cases supported by NFP, cross memory access will only happen on PTR_TO_PACKET. Fortunately for this, there is ID information that we could do accurate memory alias check.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
8c900538 |
| 30-Nov-2017 |
Jiong Wang <jiong.wang@netronome.com> |
nfp: bpf: implement memory bulk copy for length bigger than 32-bytes
When the gathered copy length is bigger than 32-bytes and within 128-bytes (the maximum length a single CPP Pull/Push request can
nfp: bpf: implement memory bulk copy for length bigger than 32-bytes
When the gathered copy length is bigger than 32-bytes and within 128-bytes (the maximum length a single CPP Pull/Push request can finish), the strategy of read/write are changeed into:
* Read. - use direct reference mode when length is within 32-bytes. - use indirect mode when length is bigger than 32-bytes.
* Write. - length <= 8-bytes use write8 (direct_ref). - length <= 32-byte and 4-bytes aligned use write32 (direct_ref). - length <= 32-bytes but not 4-bytes aligned use write8 (indirect_ref). - length > 32-bytes and 4-bytes aligned use write32 (indirect_ref). - length > 32-bytes and not 4-bytes aligned and <= 40-bytes use write32 (direct_ref) to finish the first 32-bytes. use write8 (direct_ref) to finish all remaining hanging part. - length > 32-bytes and not 4-bytes aligned use write32 (indirect_ref) to finish those 4-byte aligned parts. use write8 (direct_ref) to finish all remaining hanging part.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
9879a381 |
| 30-Nov-2017 |
Jiong Wang <jiong.wang@netronome.com> |
nfp: bpf: implement memory bulk copy for length within 32-bytes
For NFP, we want to re-group a sequence of load/store pairs lowered from memcpy/memmove into single memory bulk operation which then c
nfp: bpf: implement memory bulk copy for length within 32-bytes
For NFP, we want to re-group a sequence of load/store pairs lowered from memcpy/memmove into single memory bulk operation which then could be accelerated using NFP CPP bus.
This patch extends the existing load/store auxiliary information by adding two new fields:
struct bpf_insn *paired_st; s16 ldst_gather_len;
Both fields are supposed to be carried by the the load instruction at the head of the sequence. "paired_st" is the corresponding store instruction at the head and "ldst_gather_len" is the gathered length.
If "ldst_gather_len" is negative, then the sequence is doing memory load/store in descending order, otherwise it is in ascending order. We need this information to detect overlapped memory access.
This patch then optimize memory bulk copy when the copy length is within 32-bytes.
The strategy of read/write used is:
* Read. Use read32 (direct_ref), always.
* Write. - length <= 8-bytes write8 (direct_ref). - length <= 32-bytes and is 4-byte aligned write32 (direct_ref). - length <= 32-bytes but is not 4-byte aligned write8 (indirect_ref).
NOTE: the optimization should not change program semantics. The destination register of the last load instruction should contain the same value before and after this optimization.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
5468a8b9 |
| 30-Nov-2017 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
nfp: bpf: encode indirect commands
Add support for emitting commands with field overwrites.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netron
nfp: bpf: encode indirect commands
Add support for emitting commands with field overwrites.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
3239e7bb |
| 30-Nov-2017 |
Jiong Wang <jiong.wang@netronome.com> |
nfp: bpf: correct the encoding for No-Dest immed
When immed is used with No-Dest, the emitter should use reg.dst instead of reg.areg for the destination, using the latter will actually encode regist
nfp: bpf: correct the encoding for No-Dest immed
When immed is used with No-Dest, the emitter should use reg.dst instead of reg.areg for the destination, using the latter will actually encode register zero.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
29fe46ef |
| 30-Nov-2017 |
Jiong Wang <jiong.wang@netronome.com> |
nfp: bpf: don't do ld/shifts combination if shifts are jump destination
If any of the shift insns in the ld/shift sequence is jump destination, don't do combination.
Signed-off-by: Jiong Wang <jion
nfp: bpf: don't do ld/shifts combination if shifts are jump destination
If any of the shift insns in the ld/shift sequence is jump destination, don't do combination.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
1266f5d6 |
| 30-Nov-2017 |
Jiong Wang <jiong.wang@netronome.com> |
nfp: bpf: don't do ld/mask combination if mask is jump destination
If the mask insn in the ld/mask pair is jump destination, then don't do combination.
Signed-off-by: Jiong Wang <jiong.wang@netrono
nfp: bpf: don't do ld/mask combination if mask is jump destination
If the mask insn in the ld/mask pair is jump destination, then don't do combination.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|
#
5b674140 |
| 30-Nov-2017 |
Jiong Wang <jiong.wang@netronome.com> |
nfp: bpf: record jump destination to simplify jump fixup
eBPF insns are internally organized as dual-list inside NFP offload JIT. Random access to an insn needs to be done by either forward or backw
nfp: bpf: record jump destination to simplify jump fixup
eBPF insns are internally organized as dual-list inside NFP offload JIT. Random access to an insn needs to be done by either forward or backward traversal along the list.
One place we need to do such traversal is at nfp_fixup_branches where one traversal is needed for each jump insn to find the destination. Such traversals could be avoided if jump destinations are collected through a single travesal in a pre-scan pass, and such information could also be useful in other places where jump destination info are needed.
This patch adds such jump destination collection in nfp_prog_prepare.
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
show more ...
|