#
cf9b1199 |
| 25-Jul-2016 |
Sargun Dhillon <sargun@sargun.me> |
samples/bpf: Add test/example of using bpf_probe_write_user bpf helper
This example shows using a kprobe to act as a dnat mechanism to divert traffic for arbitrary endpoints. It rewrite the argument
samples/bpf: Add test/example of using bpf_probe_write_user bpf helper
This example shows using a kprobe to act as a dnat mechanism to divert traffic for arbitrary endpoints. It rewrite the arguments to a syscall while they're still in userspace, and before the syscall has a chance to copy the argument into kernel space.
Although this is an example, it also acts as a test because the mapped address is 255.255.255.255:555 -> real address, and that's not a legal address to connect to. If the helper is broken, the example will fail on the intermediate steps, as well as the final step to verify the rewrite of userspace memory succeeded.
Signed-off-by: Sargun Dhillon <sargun@sargun.me> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
764cbcce |
| 19-Jul-2016 |
Brenden Blanco <bblanco@plumgrid.com> |
bpf: add sample for xdp forwarding and rewrite
Add a sample that rewrites and forwards packets out on the same interface. Observed single core forwarding performance of ~10Mpps.
Since the mlx4 driv
bpf: add sample for xdp forwarding and rewrite
Add a sample that rewrites and forwards packets out on the same interface. Observed single core forwarding performance of ~10Mpps.
Since the mlx4 driver under test recycles every single packet page, the perf output shows almost exclusively just the ring management and bpf program work. Slowdowns are likely occurring due to cache misses.
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
86af8b41 |
| 19-Jul-2016 |
Brenden Blanco <bblanco@plumgrid.com> |
Add sample for adding simple drop program to link
Add a sample program that only drops packets at the BPF_PROG_TYPE_XDP_RX hook of a link. With the drop-only program, observed single core rate is ~2
Add sample for adding simple drop program to link
Add a sample program that only drops packets at the BPF_PROG_TYPE_XDP_RX hook of a link. With the drop-only program, observed single core rate is ~20Mpps.
Other tests were run, for instance without the dropcnt increment or without reading from the packet header, the packet rate was mostly unchanged.
$ perf record -a samples/bpf/xdp1 $(</sys/class/net/eth0/ifindex) proto 17: 20403027 drops/s
./pktgen_sample03_burst_single_flow.sh -i $DEV -d $IP -m $MAC -t 4 Running... ctrl^C to stop Device: eth4@0 Result: OK: 11791017(c11788327+d2689) usec, 59622913 (60byte,0frags) 5056638pps 2427Mb/sec (2427186240bps) errors: 0 Device: eth4@1 Result: OK: 11791012(c11787906+d3106) usec, 60526944 (60byte,0frags) 5133311pps 2463Mb/sec (2463989280bps) errors: 0 Device: eth4@2 Result: OK: 11791019(c11788249+d2769) usec, 59868091 (60byte,0frags) 5077431pps 2437Mb/sec (2437166880bps) errors: 0 Device: eth4@3 Result: OK: 11795039(c11792403+d2636) usec, 59483181 (60byte,0frags) 5043067pps 2420Mb/sec (2420672160bps) errors: 0
perf report --no-children: 26.05% ksoftirqd/0 [mlx4_en] [k] mlx4_en_process_rx_cq 17.84% ksoftirqd/0 [mlx4_en] [k] mlx4_en_alloc_frags 5.52% ksoftirqd/0 [mlx4_en] [k] mlx4_en_free_frag 4.90% swapper [kernel.vmlinux] [k] poll_idle 4.14% ksoftirqd/0 [kernel.vmlinux] [k] get_page_from_freelist 2.78% ksoftirqd/0 [kernel.vmlinux] [k] __free_pages_ok 2.57% ksoftirqd/0 [kernel.vmlinux] [k] bpf_map_lookup_elem 2.51% swapper [mlx4_en] [k] mlx4_en_process_rx_cq 1.94% ksoftirqd/0 [kernel.vmlinux] [k] percpu_array_map_lookup_elem 1.45% swapper [mlx4_en] [k] mlx4_en_alloc_frags 1.35% ksoftirqd/0 [kernel.vmlinux] [k] free_one_page 1.33% swapper [kernel.vmlinux] [k] intel_idle 1.04% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c5c5 0.96% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c58d 0.93% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c6ee 0.92% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c6b9 0.89% ksoftirqd/0 [kernel.vmlinux] [k] __alloc_pages_nodemask 0.83% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c686 0.83% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c5d5 0.78% ksoftirqd/0 [mlx4_en] [k] mlx4_alloc_pages.isra.23 0.77% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c5b4 0.77% ksoftirqd/0 [kernel.vmlinux] [k] net_rx_action
machine specs: receiver - Intel E5-1630 v3 @ 3.70GHz sender - Intel E5645 @ 2.40GHz Mellanox ConnectX-3 @40G
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
a3f74617 |
| 30-Jun-2016 |
Martin KaFai Lau <kafai@fb.com> |
cgroup: bpf: Add an example to do cgroup checking in BPF
test_cgrp2_array_pin.c: A userland program that creates a bpf_map (BPF_MAP_TYPE_GROUP_ARRAY), pouplates/updates it with a cgroup2's backed fd
cgroup: bpf: Add an example to do cgroup checking in BPF
test_cgrp2_array_pin.c: A userland program that creates a bpf_map (BPF_MAP_TYPE_GROUP_ARRAY), pouplates/updates it with a cgroup2's backed fd and pins it to a bpf-fs's file. The pinned file can be loaded by tc and then used by the bpf prog later. This program can also update an existing pinned array and it could be useful for debugging/testing purpose.
test_cgrp2_tc_kern.c: A bpf prog which should be loaded by tc. It is to demonstrate the usage of bpf_skb_in_cgroup.
test_cgrp2_tc.sh: A script that glues the test_cgrp2_array_pin.c and test_cgrp2_tc_kern.c together. The idea is like: 1. Load the test_cgrp2_tc_kern.o by tc 2. Use test_cgrp2_array_pin.c to populate a BPF_MAP_TYPE_CGROUP_ARRAY with a cgroup fd 3. Do a 'ping -6 ff02::1%ve' to ensure the packet has been dropped because of a match on the cgroup
Most of the lines in test_cgrp2_tc.sh is the boilerplate to setup the cgroup/bpf-fs/net-devices/netns...etc. It is not bulletproof on errors but should work well enough and give enough debug info if things did not go well.
Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Tejun Heo <tj@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
65d472fb |
| 05-May-2016 |
Alexei Starovoitov <ast@fb.com> |
samples/bpf: add 'pointer to packet' tests
parse_simple.c - packet parser exapmle with single length check that filters out udp packets for port 9
parse_varlen.c - variable length parser that under
samples/bpf: add 'pointer to packet' tests
parse_simple.c - packet parser exapmle with single length check that filters out udp packets for port 9
parse_varlen.c - variable length parser that understand multiple vlan headers, ipip, ipip6 and ip options to filter out udp or tcp packets on port 9. The packet is parsed layer by layer with multitple length checks.
parse_ldabs.c - classic style of packet parsing using LD_ABS instruction. Same functionality as parse_simple.
simple = 24.1Mpps per core varlen = 22.7Mpps ldabs = 21.4Mpps
Parser with LD_ABS instructions is slower than full direct access parser which does more packet accesses and checks.
These examples demonstrate the choice bpf program authors can make between flexibility of the parser vs speed.
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
bdefbbf2 |
| 28-Apr-2016 |
Jesper Dangaard Brouer <brouer@redhat.com> |
samples/bpf: like LLC also verify and allow redefining CLANG command
Users are likely to manually compile both LLVM 'llc' and 'clang' tools. Thus, also allow redefining CLANG and verify command exi
samples/bpf: like LLC also verify and allow redefining CLANG command
Users are likely to manually compile both LLVM 'llc' and 'clang' tools. Thus, also allow redefining CLANG and verify command exist.
Makefile implementation wise, the target that verify the command have been generalized.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
b62a796c |
| 28-Apr-2016 |
Jesper Dangaard Brouer <brouer@redhat.com> |
samples/bpf: allow make to be run from samples/bpf/ directory
It is not intuitive that 'make' must be run from the top level directory with argument "samples/bpf/" to compile these eBPF samples.
In
samples/bpf: allow make to be run from samples/bpf/ directory
It is not intuitive that 'make' must be run from the top level directory with argument "samples/bpf/" to compile these eBPF samples.
Introduce a kbuild make file trick that allow make to be run from the "samples/bpf/" directory itself. It basically change to the top level directory and call "make samples/bpf/" with the "/" slash after the directory name.
Also add a clean target that only cleans this directory, by taking advantage of the kbuild external module setting M=$PWD.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
7b01dd57 |
| 28-Apr-2016 |
Jesper Dangaard Brouer <brouer@redhat.com> |
samples/bpf: Makefile verify LLVM compiler avail and bpf target is supported
Make compiling samples/bpf more user friendly, by detecting if LLVM compiler tool 'llc' is available, and also detect if
samples/bpf: Makefile verify LLVM compiler avail and bpf target is supported
Make compiling samples/bpf more user friendly, by detecting if LLVM compiler tool 'llc' is available, and also detect if the 'bpf' target is available in this version of LLVM.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
6ccfba75 |
| 28-Apr-2016 |
Jesper Dangaard Brouer <brouer@redhat.com> |
samples/bpf: add back functionality to redefine LLC command
It is practical to be-able-to redefine the location of the LLVM command 'llc', because not all distros have a LLVM version with bpf target
samples/bpf: add back functionality to redefine LLC command
It is practical to be-able-to redefine the location of the LLVM command 'llc', because not all distros have a LLVM version with bpf target support. Thus, it is sometimes required to compile LLVM from source, and sometimes it is not desired to overwrite the distros default LLVM version.
This feature was removed with 128d1514be35 ("samples/bpf: Use llc in PATH, rather than a hardcoded value").
Add this features back. Note that it is possible to redefine the LLC on the make command like:
make samples/bpf/ LLC=~/git/llvm/build/bin/llc
Fixes: 128d1514be35 ("samples/bpf: Use llc in PATH, rather than a hardcoded value") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
e3edfdec |
| 06-Apr-2016 |
Alexei Starovoitov <ast@fb.com> |
samples/bpf: add tracepoint vs kprobe performance tests
the first microbenchmark does fd=open("/proc/self/comm"); for() { write(fd, "test"); } and on 4 cpus in parallel:
samples/bpf: add tracepoint vs kprobe performance tests
the first microbenchmark does fd=open("/proc/self/comm"); for() { write(fd, "test"); } and on 4 cpus in parallel: writes per sec base (no tracepoints, no kprobes) 930k with kprobe at __set_task_comm() 420k with tracepoint at task:task_rename 730k
For kprobe + full bpf program manully fetches oldcomm, newcomm via bpf_probe_read. For tracepint bpf program does nothing, since arguments are copied by tracepoint.
2nd microbenchmark does: fd=open("/dev/urandom"); for() { read(fd, buf); } and on 4 cpus in parallel: reads per sec base (no tracepoints, no kprobes) 300k with kprobe at urandom_read() 279k with tracepoint at random:urandom_read 290k
bpf progs attached to kprobe and tracepoint are noop.
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
128d1514 |
| 04-Apr-2016 |
Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> |
samples/bpf: Use llc in PATH, rather than a hardcoded value
While at it, remove the generation of .s files and fix some typos in the related comment.
Cc: Alexei Starovoitov <ast@fb.com> Cc: David S
samples/bpf: Use llc in PATH, rather than a hardcoded value
While at it, remove the generation of .s files and fix some typos in the related comment.
Cc: Alexei Starovoitov <ast@fb.com> Cc: David S. Miller <davem@davemloft.net> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
26e90931 |
| 08-Mar-2016 |
Alexei Starovoitov <ast@fb.com> |
samples/bpf: add map performance test
performance tests for hash map and per-cpu hash map with and without pre-allocation
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S.
samples/bpf: add map performance test
performance tests for hash map and per-cpu hash map with and without pre-allocation
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
9d8b612d |
| 08-Mar-2016 |
Alexei Starovoitov <ast@fb.com> |
samples/bpf: add bpf map stress test
this test calls bpf programs from different contexts: from inside of slub, from rcu, from pretty much everywhere, since it kprobes all spin_lock functions. It st
samples/bpf: add bpf map stress test
this test calls bpf programs from different contexts: from inside of slub, from rcu, from pretty much everywhere, since it kprobes all spin_lock functions. It stresses the bpf hash and percpu map pre-allocation, deallocation logic and call_rcu mechanisms. User space part adding more stress by walking and deleting map elements.
Note that due to nature bpf_load.c the earlier kprobe+bpf programs are already active while loader loads new programs, creates new kprobes and attaches them.
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
a6ffe7b9 |
| 17-Feb-2016 |
Alexei Starovoitov <ast@fb.com> |
samples/bpf: offwaketime example
This is simplified version of Brendan Gregg's offwaketime: This program shows kernel stack traces and task names that were blocked and "off-CPU", along with the stac
samples/bpf: offwaketime example
This is simplified version of Brendan Gregg's offwaketime: This program shows kernel stack traces and task names that were blocked and "off-CPU", along with the stack traces and task names for the threads that woke them, and the total elapsed time from when they blocked to when they were woken up. The combined stacks, task names, and total time is summarized in kernel context for efficiency.
Example: $ sudo ./offwaketime | flamegraph.pl > demo.svg Open demo.svg in the browser as FlameGraph visualization.
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
30b50aa6 |
| 12-Nov-2015 |
Yang Shi <yang.shi@linaro.org> |
bpf: samples: exclude asm/sysreg.h for arm64
commit 338d4f49d6f7114a017d294ccf7374df4f998edc ("arm64: kernel: Add support for Privileged Access Never") includes sysreg.h into futex.h and uaccess.h.
bpf: samples: exclude asm/sysreg.h for arm64
commit 338d4f49d6f7114a017d294ccf7374df4f998edc ("arm64: kernel: Add support for Privileged Access Never") includes sysreg.h into futex.h and uaccess.h. But, the inline assembly used by asm/sysreg.h is incompatible with llvm so it will cause BPF samples build failure for ARM64. Since sysreg.h is useless for BPF samples, just exclude it from Makefile via defining __ASM_SYSREG_H.
Signed-off-by: Yang Shi <yang.shi@linaro.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
42984d7c |
| 29-Oct-2015 |
Daniel Borkmann <daniel@iogearbox.net> |
bpf: add sample usages for persistent maps/progs
This patch adds a couple of stand-alone examples on how BPF_OBJ_PIN and BPF_OBJ_GET commands can be used.
Example with maps:
# ./fds_example -F /
bpf: add sample usages for persistent maps/progs
This patch adds a couple of stand-alone examples on how BPF_OBJ_PIN and BPF_OBJ_GET commands can be used.
Example with maps:
# ./fds_example -F /sys/fs/bpf/m -P -m -k 1 -v 42 bpf: map fd:3 (Success) bpf: pin ret:(0,Success) bpf: fd:3 u->(1:42) ret:(0,Success) # ./fds_example -F /sys/fs/bpf/m -G -m -k 1 bpf: get fd:3 (Success) bpf: fd:3 l->(1):42 ret:(0,Success) # ./fds_example -F /sys/fs/bpf/m -G -m -k 1 -v 24 bpf: get fd:3 (Success) bpf: fd:3 u->(1:24) ret:(0,Success) # ./fds_example -F /sys/fs/bpf/m -G -m -k 1 bpf: get fd:3 (Success) bpf: fd:3 l->(1):24 ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m2 -P -m bpf: map fd:3 (Success) bpf: pin ret:(0,Success) # ./fds_example -F /sys/fs/bpf/m2 -G -m -k 1 bpf: get fd:3 (Success) bpf: fd:3 l->(1):0 ret:(0,Success) # ./fds_example -F /sys/fs/bpf/m2 -G -m bpf: get fd:3 (Success)
Example with progs:
# ./fds_example -F /sys/fs/bpf/p -P -p bpf: prog fd:3 (Success) bpf: pin ret:(0,Success) bpf sock:4 <- fd:3 attached ret:(0,Success) # ./fds_example -F /sys/fs/bpf/p -G -p bpf: get fd:3 (Success) bpf: sock:4 <- fd:3 attached ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/p2 -P -p -o ./sockex1_kern.o bpf: prog fd:5 (Success) bpf: pin ret:(0,Success) bpf: sock:3 <- fd:5 attached ret:(0,Success) # ./fds_example -F /sys/fs/bpf/p2 -G -p bpf: get fd:3 (Success) bpf: sock:4 <- fd:3 attached ret:(0,Success)
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
39111695 |
| 20-Oct-2015 |
Alexei Starovoitov <ast@plumgrid.com> |
samples: bpf: add bpf_perf_event_output example
Performance test and example of bpf_perf_event_output(). kprobe is attached to sys_write() and trivial bpf program streams pid+cookie into userspace v
samples: bpf: add bpf_perf_event_output example
Performance test and example of bpf_perf_event_output(). kprobe is attached to sys_write() and trivial bpf program streams pid+cookie into userspace via PERF_COUNT_SW_BPF_OUTPUT event.
Usage: $ sudo ./bld_x64/samples/bpf/trace_output recv 2968913 events per sec
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
47efb302 |
| 06-Aug-2015 |
Kaixu Xia <xiakaixu@huawei.com> |
samples/bpf: example of get selected PMU counter value
This is a simple example and shows how to use the new ability to get the selected Hardware PMU counter value.
Signed-off-by: Kaixu Xia <xiakai
samples/bpf: example of get selected PMU counter value
This is a simple example and shows how to use the new ability to get the selected Hardware PMU counter value.
Signed-off-by: Kaixu Xia <xiakaixu@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
0fb1170e |
| 19-Jun-2015 |
Daniel Wagner <daniel.wagner@bmw-carit.de> |
bpf: BPF based latency tracing
BPF offers another way to generate latency histograms. We attach kprobes at trace_preempt_off and trace_preempt_on and calculate the time it takes to from seeing the o
bpf: BPF based latency tracing
BPF offers another way to generate latency histograms. We attach kprobes at trace_preempt_off and trace_preempt_on and calculate the time it takes to from seeing the off/on transition.
The first array is used to store the start time stamp. The key is the CPU id. The second array stores the log2(time diff). We need to use static allocation here (array and not hash tables). The kprobes hooking into trace_preempt_on|off should not calling any dynamic memory allocation or free path. We need to avoid recursivly getting called. Besides that, it reduces jitter in the measurement.
CPU 0 latency : count distribution 1 -> 1 : 0 | | 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 0 | | 32 -> 63 : 0 | | 64 -> 127 : 0 | | 128 -> 255 : 0 | | 256 -> 511 : 0 | | 512 -> 1023 : 0 | | 1024 -> 2047 : 0 | | 2048 -> 4095 : 166723 |*************************************** | 4096 -> 8191 : 19870 |*** | 8192 -> 16383 : 6324 | | 16384 -> 32767 : 1098 | | 32768 -> 65535 : 190 | | 65536 -> 131071 : 179 | | 131072 -> 262143 : 18 | | 262144 -> 524287 : 4 | | 524288 -> 1048575 : 1363 | | CPU 1 latency : count distribution 1 -> 1 : 0 | | 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 0 | | 32 -> 63 : 0 | | 64 -> 127 : 0 | | 128 -> 255 : 0 | | 256 -> 511 : 0 | | 512 -> 1023 : 0 | | 1024 -> 2047 : 0 | | 2048 -> 4095 : 114042 |*************************************** | 4096 -> 8191 : 9587 |** | 8192 -> 16383 : 4140 | | 16384 -> 32767 : 673 | | 32768 -> 65535 : 179 | | 65536 -> 131071 : 29 | | 131072 -> 262143 : 4 | | 262144 -> 524287 : 1 | | 524288 -> 1048575 : 364 | | CPU 2 latency : count distribution 1 -> 1 : 0 | | 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 0 | | 32 -> 63 : 0 | | 64 -> 127 : 0 | | 128 -> 255 : 0 | | 256 -> 511 : 0 | | 512 -> 1023 : 0 | | 1024 -> 2047 : 0 | | 2048 -> 4095 : 40147 |*************************************** | 4096 -> 8191 : 2300 |* | 8192 -> 16383 : 828 | | 16384 -> 32767 : 178 | | 32768 -> 65535 : 59 | | 65536 -> 131071 : 2 | | 131072 -> 262143 : 0 | | 262144 -> 524287 : 1 | | 524288 -> 1048575 : 174 | | CPU 3 latency : count distribution 1 -> 1 : 0 | | 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 0 | | 32 -> 63 : 0 | | 64 -> 127 : 0 | | 128 -> 255 : 0 | | 256 -> 511 : 0 | | 512 -> 1023 : 0 | | 1024 -> 2047 : 0 | | 2048 -> 4095 : 29626 |*************************************** | 4096 -> 8191 : 2704 |** | 8192 -> 16383 : 1090 | | 16384 -> 32767 : 160 | | 32768 -> 65535 : 72 | | 65536 -> 131071 : 32 | | 131072 -> 262143 : 26 | | 262144 -> 524287 : 12 | | 524288 -> 1048575 : 298 | |
All this is based on the trace3 examples written by Alexei Starovoitov <ast@plumgrid.com>.
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
530b2c86 |
| 19-May-2015 |
Alexei Starovoitov <ast@plumgrid.com> |
samples/bpf: bpf_tail_call example for networking
Usage: $ sudo ./sockex3 IP src.port -> dst.port bytes packets 127.0.0.1.42010 -> 127.0.0.1.12865 1568 8 12
samples/bpf: bpf_tail_call example for networking
Usage: $ sudo ./sockex3 IP src.port -> dst.port bytes packets 127.0.0.1.42010 -> 127.0.0.1.12865 1568 8 127.0.0.1.59526 -> 127.0.0.1.33778 11422636 173070 127.0.0.1.33778 -> 127.0.0.1.59526 11260224828 341974 127.0.0.1.12865 -> 127.0.0.1.42010 1832 12 IP src.port -> dst.port bytes packets 127.0.0.1.42010 -> 127.0.0.1.12865 1568 8 127.0.0.1.59526 -> 127.0.0.1.33778 23198092 351486 127.0.0.1.33778 -> 127.0.0.1.59526 22972698518 698616 127.0.0.1.12865 -> 127.0.0.1.42010 1832 12
this example is similar to sockex2 in a way that it accumulates per-flow statistics, but it does packet parsing differently. sockex2 inlines full packet parser routine into single bpf program. This sockex3 example have 4 independent programs that parse vlan, mpls, ip, ipv6 and one main program that starts the process. bpf_tail_call() mechanism allows each program to be small and be called on demand potentially multiple times, so that many vlan, mpls, ip in ip, gre encapsulations can be parsed. These and other protocol parsers can be added or removed at runtime. TLVs can be parsed in similar manner. Note, tail_call_cnt dynamic check limits the number of tail calls to 32.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
5bacd780 |
| 19-May-2015 |
Alexei Starovoitov <ast@plumgrid.com> |
samples/bpf: bpf_tail_call example for tracing
kprobe example that demonstrates how future seccomp programs may look like. It attaches to seccomp_phase1() function and tail-calls other BPF programs
samples/bpf: bpf_tail_call example for tracing
kprobe example that demonstrates how future seccomp programs may look like. It attaches to seccomp_phase1() function and tail-calls other BPF programs depending on syscall number.
Existing optimized classic BPF seccomp programs generated by Chrome look like: if (sd.nr < 121) { if (sd.nr < 57) { if (sd.nr < 22) { if (sd.nr < 7) { if (sd.nr < 4) { if (sd.nr < 1) { check sys_read } else { if (sd.nr < 3) { check sys_write and sys_open } else { check sys_close } } } else { } else { } else { } else { } else { }
the future seccomp using native eBPF may look like: bpf_tail_call(&sd, &syscall_jmp_table, sd.nr); which is simpler, faster and leaves more room for per-syscall checks.
Usage: $ sudo ./tracex5 <...>-366 [001] d... 4.870033: : read(fd=1, buf=00007f6d5bebf000, size=771) <...>-369 [003] d... 4.870066: : mmap <...>-369 [003] d... 4.870077: : syscall=110 (one of get/set uid/pid/gid) <...>-369 [003] d... 4.870089: : syscall=107 (one of get/set uid/pid/gid) sh-369 [000] d... 4.891740: : read(fd=0, buf=00000000023d1000, size=512) sh-369 [000] d... 4.891747: : write(fd=1, buf=00000000023d3000, size=512) sh-369 [000] d... 4.891747: : read(fd=1, buf=00000000023d3000, size=512)
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
b88c06e3 |
| 11-May-2015 |
Brenden Blanco <bblanco@plumgrid.com> |
samples/bpf: fix in-source build of samples with clang
in-source build of 'make samples/bpf/' was incorrectly using default compiler instead of invoking clang/llvm. out-of-source build was ok.
Fixe
samples/bpf: fix in-source build of samples with clang
in-source build of 'make samples/bpf/' was incorrectly using default compiler instead of invoking clang/llvm. out-of-source build was ok.
Fixes: a80857822b0c ("samples: bpf: trivial eBPF program in C") Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
91bc4822 |
| 01-Apr-2015 |
Alexei Starovoitov <ast@plumgrid.com> |
tc: bpf: add checksum helpers
Commit 608cd71a9c7c ("tc: bpf: generalize pedit action") has added the possibility to mangle packet data to BPF programs in the tc pipeline. This patch adds two helpers
tc: bpf: add checksum helpers
Commit 608cd71a9c7c ("tc: bpf: generalize pedit action") has added the possibility to mangle packet data to BPF programs in the tc pipeline. This patch adds two helpers bpf_l3_csum_replace() and bpf_l4_csum_replace() for fixing up the protocol checksums after the packet mangling.
It also adds 'flags' argument to bpf_skb_store_bytes() helper to avoid unnecessary checksum recomputations when BPF programs adjusting l3/l4 checksums and documents all three helpers in uapi header.
Moreover, a sample program is added to show how BPF programs can make use of the mangle and csum helpers.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
9811e353 |
| 25-Mar-2015 |
Alexei Starovoitov <ast@plumgrid.com> |
samples/bpf: Add kmem_alloc()/free() tracker tool
One BPF program attaches to kmem_cache_alloc_node() and remembers all allocated objects in the map. Another program attaches to kmem_cache_free() an
samples/bpf: Add kmem_alloc()/free() tracker tool
One BPF program attaches to kmem_cache_alloc_node() and remembers all allocated objects in the map. Another program attaches to kmem_cache_free() and deletes corresponding object from the map.
User space walks the map every second and prints any objects which are older than 1 second.
Usage:
$ sudo tracex4
Then start few long living processes. The 'tracex4' will print something like this:
obj 0xffff880465928000 is 13sec old was allocated at ip ffffffff8105dc32 obj 0xffff88043181c280 is 13sec old was allocated at ip ffffffff8105dc32 obj 0xffff880465848000 is 8sec old was allocated at ip ffffffff8105dc32 obj 0xffff8804338bc280 is 15sec old was allocated at ip ffffffff8105dc32
$ addr2line -fispe vmlinux ffffffff8105dc32 do_fork at fork.c:1665
As soon as processes exit the memory is reclaimed and 'tracex4' prints nothing.
Similar experiment can be done with the __kmalloc()/kfree() pair.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1427312966-8434-10-git-send-email-ast@plumgrid.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
5c7fc2d2 |
| 25-Mar-2015 |
Alexei Starovoitov <ast@plumgrid.com> |
samples/bpf: Add IO latency analysis (iosnoop/heatmap) tool
BPF C program attaches to blk_mq_start_request()/blk_update_request() kprobe events to calculate IO latency.
For every completed block IO
samples/bpf: Add IO latency analysis (iosnoop/heatmap) tool
BPF C program attaches to blk_mq_start_request()/blk_update_request() kprobe events to calculate IO latency.
For every completed block IO event it computes the time delta in nsec and records in a histogram map:
map[log10(delta)*10]++
User space reads this histogram map every 2 seconds and prints it as a 'heatmap' using gray shades of text terminal. Black spaces have many events and white spaces have very few events. Left most space is the smallest latency, right most space is the largest latency in the range.
Usage:
$ sudo ./tracex3 and do 'sudo dd if=/dev/sda of=/dev/null' in other terminal.
Observe IO latencies and how different activity (like 'make kernel') affects it.
Similar experiments can be done for network transmit latencies, syscalls, etc.
'-t' flag prints the heatmap using normal ascii characters:
$ sudo ./tracex3 -t heatmap of IO latency # - many events with this latency - few events |1us |10us |100us |1ms |10ms |100ms |1s |10s *ooo. *O.#. # 221 . *# . # 125 .. .o#*.. # 55 . . . . .#O # 37 .# # 175 .#*. # 37 # # 199 . . *#*. # 55 *#..* # 42 # # 266 ...***Oo#*OO**o#* . # 629 # # 271 . .#o* o.*o* # 221 . . o* *#O.. # 50
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1427312966-8434-9-git-send-email-ast@plumgrid.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|