History log of /openbmc/linux/drivers/net/ethernet/netronome/nfp/bpf/jit.c (Results 176 – 200 of 228)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 2314fe9e 10-Jan-2018 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: relocate jump targets just before the load

Don't translate the program assuming it will be loaded at a given
address. This will be required for sharing programs between ports

nfp: bpf: relocate jump targets just before the load

Don't translate the program assuming it will be loaded at a given
address. This will be required for sharing programs between ports
of the same NIC, tail calls and subprograms. It will also make the
jump targets easier to understand when dumping the program to user
space.

Translate the program as if it was going to be loaded at address
zero. When load happens add the load offset in and set addresses
of special branches.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 488feeaf 10-Jan-2018 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: add helpers for modifying branch addresses

In preparation for better handling of relocations move existing
helper for setting branch offset to nfp_asm.c and add two more.

nfp: bpf: add helpers for modifying branch addresses

In preparation for better handling of relocations move existing
helper for setting branch offset to nfp_asm.c and add two more.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 1549921d 10-Jan-2018 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: move jump resolution to jit.c

Jump target resolution should be in jit.c not offload.c.
No functional changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.c

nfp: bpf: move jump resolution to jit.c

Jump target resolution should be in jit.c not offload.c.
No functional changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 8231f844 14-Dec-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: optimize the adjust_head calls in trivial cases

If the program is simple and has only one adjust head call
with constant parameters, we can check that the call will
always

nfp: bpf: optimize the adjust_head calls in trivial cases

If the program is simple and has only one adjust head call
with constant parameters, we can check that the call will
always succeed at translation time. We need to track the
location of the call and make sure parameters are always
the same. We also have to check the parameters against
datapath constraints and ETH_HLEN.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 0d49eaf4 14-Dec-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: add basic support for adjust head call

Support bpf_xdp_adjust_head(). We need to check whether the
packet offset after adjustment is within datapath's limits.
We also chec

nfp: bpf: add basic support for adjust head call

Support bpf_xdp_adjust_head(). We need to check whether the
packet offset after adjustment is within datapath's limits.
We also check if the frame is at least ETH_HLEN long (similar
to the kernel implementation).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 2cb230bd 14-Dec-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: prepare for call support

Add skeleton of verifier checks and translation handler
for call instructions. Make sure jump target resolution
will not treat them as jumps.

nfp: bpf: prepare for call support

Add skeleton of verifier checks and translation handler
for call instructions. Make sure jump target resolution
will not treat them as jumps.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 6bc7103c 30-Nov-2017 Jiong Wang <jiong.wang@netronome.com>

nfp: bpf: detect load/store sequences lowered from memory copy

This patch add the optimization frontend, but adding a new eBPF IR scan
pass "nfp_bpf_opt_ldst_gather".

The pass w

nfp: bpf: detect load/store sequences lowered from memory copy

This patch add the optimization frontend, but adding a new eBPF IR scan
pass "nfp_bpf_opt_ldst_gather".

The pass will traverse the IR to recognize the load/store pairs sequences
that come from lowering of memory copy builtins.

The gathered memory copy information will be kept in the meta info
structure of the first load instruction in the sequence and will be
consumed by the optimization backend added in the previous patches.

NOTE: a sequence with cross memory access doesn't qualify this
optimization, i.e. if one load in the sequence will load from place that
has been written by previous store. This is because when we turn the
sequence into single CPP operation, we are reading all contents at once
into NFP transfer registers, then write them out as a whole. This is not
identical with what the original load/store sequence is doing.

Detecting cross memory access for two random pointers will be difficult,
fortunately under XDP/eBPF's restrictied runtime environment, the copy
normally happen among map, packet data and stack, they do not overlap with
each other.

And for cases supported by NFP, cross memory access will only happen on
PTR_TO_PACKET. Fortunately for this, there is ID information that we could
do accurate memory alias check.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 8c900538 30-Nov-2017 Jiong Wang <jiong.wang@netronome.com>

nfp: bpf: implement memory bulk copy for length bigger than 32-bytes

When the gathered copy length is bigger than 32-bytes and within 128-bytes
(the maximum length a single CPP Pull/Push

nfp: bpf: implement memory bulk copy for length bigger than 32-bytes

When the gathered copy length is bigger than 32-bytes and within 128-bytes
(the maximum length a single CPP Pull/Push request can finish), the
strategy of read/write are changeed into:

* Read.
- use direct reference mode when length is within 32-bytes.
- use indirect mode when length is bigger than 32-bytes.

* Write.
- length <= 8-bytes
use write8 (direct_ref).
- length <= 32-byte and 4-bytes aligned
use write32 (direct_ref).
- length <= 32-bytes but not 4-bytes aligned
use write8 (indirect_ref).
- length > 32-bytes and 4-bytes aligned
use write32 (indirect_ref).
- length > 32-bytes and not 4-bytes aligned and <= 40-bytes
use write32 (direct_ref) to finish the first 32-bytes.
use write8 (direct_ref) to finish all remaining hanging part.
- length > 32-bytes and not 4-bytes aligned
use write32 (indirect_ref) to finish those 4-byte aligned parts.
use write8 (direct_ref) to finish all remaining hanging part.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 9879a381 30-Nov-2017 Jiong Wang <jiong.wang@netronome.com>

nfp: bpf: implement memory bulk copy for length within 32-bytes

For NFP, we want to re-group a sequence of load/store pairs lowered from
memcpy/memmove into single memory bulk operation

nfp: bpf: implement memory bulk copy for length within 32-bytes

For NFP, we want to re-group a sequence of load/store pairs lowered from
memcpy/memmove into single memory bulk operation which then could be
accelerated using NFP CPP bus.

This patch extends the existing load/store auxiliary information by adding
two new fields:

struct bpf_insn *paired_st;
s16 ldst_gather_len;

Both fields are supposed to be carried by the the load instruction at the
head of the sequence. "paired_st" is the corresponding store instruction at
the head and "ldst_gather_len" is the gathered length.

If "ldst_gather_len" is negative, then the sequence is doing memory
load/store in descending order, otherwise it is in ascending order. We need
this information to detect overlapped memory access.

This patch then optimize memory bulk copy when the copy length is within
32-bytes.

The strategy of read/write used is:

* Read.
Use read32 (direct_ref), always.

* Write.
- length <= 8-bytes
write8 (direct_ref).
- length <= 32-bytes and is 4-byte aligned
write32 (direct_ref).
- length <= 32-bytes but is not 4-byte aligned
write8 (indirect_ref).

NOTE: the optimization should not change program semantics. The destination
register of the last load instruction should contain the same value before
and after this optimization.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 5468a8b9 30-Nov-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: encode indirect commands

Add support for emitting commands with field overwrites.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wan

nfp: bpf: encode indirect commands

Add support for emitting commands with field overwrites.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 3239e7bb 30-Nov-2017 Jiong Wang <jiong.wang@netronome.com>

nfp: bpf: correct the encoding for No-Dest immed

When immed is used with No-Dest, the emitter should use reg.dst instead of
reg.areg for the destination, using the latter will actually e

nfp: bpf: correct the encoding for No-Dest immed

When immed is used with No-Dest, the emitter should use reg.dst instead of
reg.areg for the destination, using the latter will actually encode
register zero.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 29fe46ef 30-Nov-2017 Jiong Wang <jiong.wang@netronome.com>

nfp: bpf: don't do ld/shifts combination if shifts are jump destination

If any of the shift insns in the ld/shift sequence is jump destination,
don't do combination.

Signed-off-

nfp: bpf: don't do ld/shifts combination if shifts are jump destination

If any of the shift insns in the ld/shift sequence is jump destination,
don't do combination.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 1266f5d6 30-Nov-2017 Jiong Wang <jiong.wang@netronome.com>

nfp: bpf: don't do ld/mask combination if mask is jump destination

If the mask insn in the ld/mask pair is jump destination, then don't do
combination.

Signed-off-by: Jiong Wang

nfp: bpf: don't do ld/mask combination if mask is jump destination

If the mask insn in the ld/mask pair is jump destination, then don't do
combination.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 5b674140 30-Nov-2017 Jiong Wang <jiong.wang@netronome.com>

nfp: bpf: record jump destination to simplify jump fixup

eBPF insns are internally organized as dual-list inside NFP offload JIT.
Random access to an insn needs to be done by either forw

nfp: bpf: record jump destination to simplify jump fixup

eBPF insns are internally organized as dual-list inside NFP offload JIT.
Random access to an insn needs to be done by either forward or backward
traversal along the list.

One place we need to do such traversal is at nfp_fixup_branches where one
traversal is needed for each jump insn to find the destination. Such
traversals could be avoided if jump destinations are collected through a
single travesal in a pre-scan pass, and such information could also be
useful in other places where jump destination info are needed.

This patch adds such jump destination collection in nfp_prog_prepare.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 854dc87d 30-Nov-2017 Jiong Wang <jiong.wang@netronome.com>

nfp: bpf: support backward jump

This patch adds support for backward jump on NFP.

- restrictions on backward jump in various functions have been removed.
- nfp_fixup_branche

nfp: bpf: support backward jump

This patch adds support for backward jump on NFP.

- restrictions on backward jump in various functions have been removed.
- nfp_fixup_branches now supports backward jump.

There is one thing to note, currently an input eBPF JMP insn may generate
several NFP insns, for example,

NFP imm move insn A \
NFP compare insn B --> 3 NFP insn jited from eBPF JMP insn M
NFP branch insn C /
---
NFP insn X --> 1 NFP insn jited from eBPF insn N
---
...

therefore, we are doing sanity check to make sure the last jited insn from
an eBPF JMP is a NFP branch instruction.

Once backward jump is allowed, it is possible an eBPF JMP insn is at the
end of the program. This is however causing trouble for the sanity check.
Because the sanity check requires the end index of the NFP insns jited from
one eBPF insn while only the start index is recorded before this patch that
we can only get the end index by:

start_index_of_the_next_eBPF_insn - 1

or for the above example:

start_index_of_eBPF_insn_N (which is the index of NFP insn X) - 1

nfp_fixup_branches was using nfp_for_each_insn_walk2 to expose *next* insn
to each iteration during the traversal so the last index could be
calculated from which. Now, it needs some extra code to handle the last
insn. Meanwhile, the use of walk2 is actually unnecessary, we could simply
use generic single instruction walk to do this, the next insn could be
easily calculated using list_next_entry.

So, this patch migrates the jump fixup traversal method to
*list_for_each_entry*, this simplifies the code logic a little bit.

The other thing to note is a new state variable "last_bpf_off" is
introduced to track the index of the last jited NFP insn. This is necessary
because NFP is generating special purposes epilogue sequences, so the index
of the last jited NFP insn is *not* always nfp_prog->prog_len - 1.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


Revision tags: v4.13.16, v4.14
# c6c580d7 03-Nov-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: move to new BPF program offload infrastructure

Following steps are taken in the driver to offload an XDP program:

XDP_SETUP_PROG:
* prepare:
- allocate program

nfp: bpf: move to new BPF program offload infrastructure

Following steps are taken in the driver to offload an XDP program:

XDP_SETUP_PROG:
* prepare:
- allocate program state;
- run verifier (bpf_analyzer());
- run translation;
* load:
- stop old program if needed;
- load program;
- enable BPF if not enabled;
* clean up:
- free program image.

With new infrastructure the flow will look like this:

BPF_OFFLOAD_VERIFIER_PREP:
- allocate program state;
BPF_OFFLOAD_TRANSLATE:
- run translation;
XDP_SETUP_PROG:
- stop old program if needed;
- load program;
- enable BPF if not enabled;
BPF_OFFLOAD_DESTROY:
- free program image.

Take advantage of the new infrastructure. Allocation of driver
metadata has to be moved from jit.c to offload.c since it's now
done at a different stage. Since there is no separate driver
private data for verification step, move temporary nfp_meta
pointer into nfp_prog. We will now use user space context
offsets.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 9314c442 03-Nov-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: move translation prepare to offload.c

struct nfp_prog is currently only used internally by the translator.
This means there is a lot of parameter passing going on, between

nfp: bpf: move translation prepare to offload.c

struct nfp_prog is currently only used internally by the translator.
This means there is a lot of parameter passing going on, between
the translator and different stages of offload. Simplify things
by allocating nfp_prog in offload.c already.

We will now use kmalloc() to allocate the program area and only
DMA map it for the time of loading (instead of allocating DMA
coherent memory upfront).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# c1c88eae 03-Nov-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: move program prepare and free into offload.c

Most of offload/translation prepare logic will be moved to
offload.c. To help git generate more reasonable diffs
move nfp_prog

nfp: bpf: move program prepare and free into offload.c

Most of offload/translation prepare logic will be moved to
offload.c. To help git generate more reasonable diffs
move nfp_prog_prepare() and nfp_prog_free() functions
there as a first step.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 94508438 03-Nov-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: remove the register renumbering leftovers

The register renumbering was removed and will not be coming back
in its old, naive form, given that it would be fundamentally
inco

nfp: bpf: remove the register renumbering leftovers

The register renumbering was removed and will not be coming back
in its old, naive form, given that it would be fundamentally
incompatible with calling functions. Remove the leftovers.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 012bb8a8 03-Nov-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: drop support for cls_bpf with legacy actions

Only support BPF_PROG_TYPE_SCHED_CLS programs in direct
action mode. This simplifies preparing the offload since
there will no

nfp: bpf: drop support for cls_bpf with legacy actions

Only support BPF_PROG_TYPE_SCHED_CLS programs in direct
action mode. This simplifies preparing the offload since
there will now be only one mode of operation for that type
of program. We need to know the attachment mode type of
cls_bpf programs, because exit codes are interpreted
differently for legacy vs DA mode.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 254ef4d7 01-Nov-2017 Jiong Wang <jiong.wang@netronome.com>

nfp: bpf: support [BPF_ALU | BPF_ALU64] | BPF_NEG

This patch supports BPF_NEG under both BPF_ALU64 and BPF_ALU. LLVM recently
starts to generate it.

NOTE: BPF_NEG takes single o

nfp: bpf: support [BPF_ALU | BPF_ALU64] | BPF_NEG

This patch supports BPF_NEG under both BPF_ALU64 and BPF_ALU. LLVM recently
starts to generate it.

NOTE: BPF_NEG takes single operand which is an register and serve as both
input and output.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 5d42ced1 01-Nov-2017 Jiong Wang <jiong.wang@netronome.com>

nfp: bpf: rename ALU_OP_NEG to ALU_OP_NOT

The current ALU_OP_NEG is Op encoding 0x4 for NPF ALU instruction. It is
actually performing "~B" operation which is bitwise NOT.

The u

nfp: bpf: rename ALU_OP_NEG to ALU_OP_NOT

The current ALU_OP_NEG is Op encoding 0x4 for NPF ALU instruction. It is
actually performing "~B" operation which is bitwise NOT.

The using naming ALU_OP_NEG is misleading as NEG is -B which is not the
same as ~B.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 9f16c8ab 23-Oct-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: optimize mov64 a little

Loading 64bit constants require up to 4 load immediates, since
we can only load 16 bits at a time. If the 32bit halves of
the 64bit constant are th

nfp: bpf: optimize mov64 a little

Loading 64bit constants require up to 4 load immediates, since
we can only load 16 bits at a time. If the 32bit halves of
the 64bit constant are the same, however, we can save a cycle
by doing a register move instead of two loads of 16 bits.

Note that we don't optimize the normal ALU64 load because even
though it's a 64 bit load the upper half of the register is
a coming from sign extension so we can load it in one cycle
anyway.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# b14157ee 23-Oct-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: support stack accesses via non-constant pointers

If stack pointer has a different value on different paths
but the alignment to words (4B) remains the same, we can
set a ne

nfp: bpf: support stack accesses via non-constant pointers

If stack pointer has a different value on different paths
but the alignment to words (4B) remains the same, we can
set a new LMEM access pointer to the calculated value and
access whichever word it's pointing to.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 2df03a50 23-Oct-2017 Jakub Kicinski <jakub.kicinski@netronome.com>

nfp: bpf: support accessing the stack beyond 64 bytes

To access beyond 64th byte of the stack we need to set a new
stack pointer register (LMEM is accessed indirectly through
those p

nfp: bpf: support accessing the stack beyond 64 bytes

To access beyond 64th byte of the stack we need to set a new
stack pointer register (LMEM is accessed indirectly through
those pointers). Add a function for encoding local CSR access
instruction. Use stack pointer number 3.

Note that stack pointer registers allow us to index into 32
bytes of LMEM (with shift operations i.e. when operands are
restricted). This means if access is crossing 32 byte boundary
we must not use offsetting, we have to set the pointer to the
exact address and move it with post-increments.

We depend on the datapath placing the stack base address in
GPR A22 for our use.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


12345678910