History log of /openbmc/linux/kernel/bpf/Makefile (Results 551 – 575 of 575)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 89453379 31-Oct-2014 Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"A bit has accumulated, but it's been a week or so since my last batch
of post-merge-window

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"A bit has accumulated, but it's been a week or so since my last batch
of post-merge-window fixes, so...

1) Missing module license in netfilter reject module, from Pablo.
Lots of people ran into this.

2) Off by one in mac80211 baserate calculation, from Karl Beldan.

3) Fix incorrect return value from ax88179_178a driver's set_mac_addr
op, which broke use of it with bonding. From Ian Morgan.

4) Checking of skb_gso_segment()'s return value was not all
encompassing, it can return an SKB pointer, a pointer error, or
NULL. Fix from Florian Westphal.

This is crummy, and longer term will be fixed to just return error
pointers or a real SKB.

6) Encapsulation offloads not being handled by
skb_gso_transport_seglen(). From Florian Westphal.

7) Fix deadlock in TIPC stack, from Ying Xue.

8) Fix performance regression from using rhashtable for netlink
sockets. The problem was the synchronize_net() invoked for every
socket destroy. From Thomas Graf.

9) Fix bug in eBPF verifier, and remove the strong dependency of BPF
on NET. From Alexei Starovoitov.

10) In qdisc_create(), use the correct interface to allocate
->cpu_bstats, otherwise the u64_stats_sync member isn't
initialized properly. From Sabrina Dubroca.

11) Off by one in ip_set_nfnl_get_byindex(), from Dan Carpenter.

12) nf_tables_newchain() was erroneously expecting error pointers from
netdev_alloc_pcpu_stats(). It only returna a valid pointer or
NULL. From Sabrina Dubroca.

13) Fix use-after-free in _decode_session6(), from Li RongQing.

14) When we set the TX flow hash on a socket, we mistakenly do so
before we've nailed down the final source port. Move the setting
deeper to fix this. From Sathya Perla.

15) NAPI budget accounting in amd-xgbe driver was counting descriptors
instead of full packets, fix from Thomas Lendacky.

16) Fix total_data_buflen calculation in hyperv driver, from Haiyang
Zhang.

17) Fix bcma driver build with OF_ADDRESS disabled, from Hauke
Mehrtens.

18) Fix mis-use of per-cpu memory in TCP md5 code. The problem is
that something that ends up being vmalloc memory can't be passed
to the crypto hash routines via scatter-gather lists. From Eric
Dumazet.

19) Fix regression in promiscuous mode enabling in cdc-ether, from
Olivier Blin.

20) Bucket eviction and frag entry killing can race with eachother,
causing an unlink of the object from the wrong list. Fix from
Nikolay Aleksandrov.

21) Missing initialization of spinlock in cxgb4 driver, from Anish
Bhatt.

22) Do not cache ipv4 routing failures, otherwise if the sysctl for
forwarding is subsequently enabled this won't be seen. From
Nicolas Cavallari"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (131 commits)
drivers: net: cpsw: Support ALLMULTI and fix IFF_PROMISC in switch mode
drivers: net: cpsw: Fix broken loop condition in switch mode
net: ethtool: Return -EOPNOTSUPP if user space tries to read EEPROM with lengh 0
stmmac: pci: set default of the filter bins
net: smc91x: Fix gpios for device tree based booting
mpls: Allow mpls_gso to be built as module
mpls: Fix mpls_gso handler.
r8152: stop submitting intr for -EPROTO
netfilter: nft_reject_bridge: restrict reject to prerouting and input
netfilter: nft_reject_bridge: don't use IP stack to reject traffic
netfilter: nf_reject_ipv6: split nf_send_reset6() in smaller functions
netfilter: nf_reject_ipv4: split nf_send_reset() in smaller functions
netfilter: nf_tables_bridge: update hook_mask to allow {pre,post}routing
drivers/net: macvtap and tun depend on INET
drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets
drivers/net: Disable UFO through virtio
net: skb_fclone_busy() needs to detect orphaned skb
gre: Use inner mac length when computing tunnel length
mlx4: Avoid leaking steering rules on flow creation error flow
net/mlx4_en: Don't attempt to TX offload the outer UDP checksum for VXLAN
...

show more ...


Revision tags: v3.18-rc2
# f89b7755 23-Oct-2014 Alexei Starovoitov <ast@plumgrid.com>

bpf: split eBPF out of NET

introduce two configs:
- hidden CONFIG_BPF to select eBPF interpreter that classic socket filters
depend on
- visible CONFIG_BPF_SYSCALL (default off) that tracing and s

bpf: split eBPF out of NET

introduce two configs:
- hidden CONFIG_BPF to select eBPF interpreter that classic socket filters
depend on
- visible CONFIG_BPF_SYSCALL (default off) that tracing and sockets can use

that solves several problems:
- tracing and others that wish to use eBPF don't need to depend on NET.
They can use BPF_SYSCALL to allow loading from userspace or select BPF
to use it directly from kernel in NET-less configs.
- in 3.18 programs cannot be attached to events yet, so don't force it on
- when the rest of eBPF infra is there in 3.19+, it's still useful to
switch it off to minimize kernel size

bloat-o-meter on x64 shows:
add/remove: 0/60 grow/shrink: 0/2 up/down: 0/-15601 (-15601)

tested with many different config combinations. Hopefully didn't miss anything.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# d5432503 27-Oct-2014 Takashi Iwai <tiwai@suse.de>

Merge tag 'asoc-v3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v3.18

A few small driver fixes for v3.18 plus the removal of the s6000 suppo

Merge tag 'asoc-v3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v3.18

A few small driver fixes for v3.18 plus the removal of the s6000 support
since the relevant chip is no longer supported in mainline.

show more ...


# 93035286 22-Oct-2014 Takashi Iwai <tiwai@suse.de>

Merge branch 'topic/enum-info-cleanup' into for-next

this is a series of patches to just convert the plain info callback
for enum ctl elements to snd_ctl_elem_info(). Also, it includes the
extensio

Merge branch 'topic/enum-info-cleanup' into for-next

this is a series of patches to just convert the plain info callback
for enum ctl elements to snd_ctl_elem_info(). Also, it includes the
extension of snd_ctl_elem_info(), for catching the unexpected string
cut-off and handling the zero items.

show more ...


# 1b62f19c 21-Oct-2014 Mauro Carvalho Chehab <mchehab@osg.samsung.com>

Merge tag 'v3.18-rc1' into v4l_for_linus

Linux 3.18-rc1

* tag 'v3.18-rc1': (9167 commits)
Linux 3.18-rc1
MAINTAINERS: corrected bcm2835 search
Net: DSA: Fix checking for get_phy_flags functio

Merge tag 'v3.18-rc1' into v4l_for_linus

Linux 3.18-rc1

* tag 'v3.18-rc1': (9167 commits)
Linux 3.18-rc1
MAINTAINERS: corrected bcm2835 search
Net: DSA: Fix checking for get_phy_flags function
sparc64: Do not define thread fpregs save area as zero-length array.
sparc64: Fix corrupted thread fault code.
MAINTAINERS: Become the docs maintainer
x86,kvm,vmx: Preserve CR4 across VM entry
ipv6: fix a potential use after free in sit.c
ipv6: fix a potential use after free in ip6_offload.c
ipv4: fix a potential use after free in gre_offload.c
tcp: fix build error if IPv6 is not enabled
futex: Ensure get_futex_key_refs() always implies a barrier
bna: fix skb->truesize underestimation
net: dsa: add includes for ethtool and phy_fixed definitions
openvswitch: Set flow-key members.
netrom: use linux/uaccess.h
dsa: Fix conversion from host device to mii bus
tipc: fix bug in bundled buffer reception
ipv6: introduce tcp_v6_iif()
sfc: add support for skb->xmit_more
...

show more ...


# 1ef24960 21-Oct-2014 Mauro Carvalho Chehab <mchehab@osg.samsung.com>

Merge tag 'v3.18-rc1' into patchwork

Linux 3.18-rc1

* tag 'v3.18-rc1': (9526 commits)
Linux 3.18-rc1
MAINTAINERS: corrected bcm2835 search
Net: DSA: Fix checking for get_phy_flags function

Merge tag 'v3.18-rc1' into patchwork

Linux 3.18-rc1

* tag 'v3.18-rc1': (9526 commits)
Linux 3.18-rc1
MAINTAINERS: corrected bcm2835 search
Net: DSA: Fix checking for get_phy_flags function
sparc64: Do not define thread fpregs save area as zero-length array.
sparc64: Fix corrupted thread fault code.
MAINTAINERS: Become the docs maintainer
x86,kvm,vmx: Preserve CR4 across VM entry
ipv6: fix a potential use after free in sit.c
ipv6: fix a potential use after free in ip6_offload.c
ipv4: fix a potential use after free in gre_offload.c
tcp: fix build error if IPv6 is not enabled
futex: Ensure get_futex_key_refs() always implies a barrier
bna: fix skb->truesize underestimation
net: dsa: add includes for ethtool and phy_fixed definitions
openvswitch: Set flow-key members.
netrom: use linux/uaccess.h
dsa: Fix conversion from host device to mii bus
tipc: fix bug in bundled buffer reception
ipv6: introduce tcp_v6_iif()
sfc: add support for skb->xmit_more
...

show more ...


# a13926db 20-Oct-2014 Chris Zankel <chris@zankel.net>

Merge tag 'v3.18-rc1' into for_next

Linux 3.18-rc1


Revision tags: v3.18-rc1
# 35a9ad8a 08-Oct-2014 Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:
"Most notable changes in here:

1) By far the biggest accomplishment, thanks to a la

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:
"Most notable changes in here:

1) By far the biggest accomplishment, thanks to a large range of
contributors, is the addition of multi-send for transmit. This is
the result of discussions back in Chicago, and the hard work of
several individuals.

Now, when the ->ndo_start_xmit() method of a driver sees
skb->xmit_more as true, it can choose to defer the doorbell
telling the driver to start processing the new TX queue entires.

skb->xmit_more means that the generic networking is guaranteed to
call the driver immediately with another SKB to send.

There is logic added to the qdisc layer to dequeue multiple
packets at a time, and the handling mis-predicted offloads in
software is now done with no locks held.

Finally, pktgen is extended to have a "burst" parameter that can
be used to test a multi-send implementation.

Several drivers have xmit_more support: i40e, igb, ixgbe, mlx4,
virtio_net

Adding support is almost trivial, so export more drivers to
support this optimization soon.

I want to thank, in no particular or implied order, Jesper
Dangaard Brouer, Eric Dumazet, Alexander Duyck, Tom Herbert, Jamal
Hadi Salim, John Fastabend, Florian Westphal, Daniel Borkmann,
David Tat, Hannes Frederic Sowa, and Rusty Russell.

2) PTP and timestamping support in bnx2x, from Michal Kalderon.

3) Allow adjusting the rx_copybreak threshold for a driver via
ethtool, and add rx_copybreak support to enic driver. From
Govindarajulu Varadarajan.

4) Significant enhancements to the generic PHY layer and the bcm7xxx
driver in particular (EEE support, auto power down, etc.) from
Florian Fainelli.

5) Allow raw buffers to be used for flow dissection, allowing drivers
to determine the optimal "linear pull" size for devices that DMA
into pools of pages. The objective is to get exactly the
necessary amount of headers into the linear SKB area pre-pulled,
but no more. The new interface drivers use is eth_get_headlen().
From WANG Cong, with driver conversions (several had their own
by-hand duplicated implementations) by Alexander Duyck and Eric
Dumazet.

6) Support checksumming more smoothly and efficiently for
encapsulations, and add "foo over UDP" facility. From Tom
Herbert.

7) Add Broadcom SF2 switch driver to DSA layer, from Florian
Fainelli.

8) eBPF now can load programs via a system call and has an extensive
testsuite. Alexei Starovoitov and Daniel Borkmann.

9) Major overhaul of the packet scheduler to use RCU in several major
areas such as the classifiers and rate estimators. From John
Fastabend.

10) Add driver for Intel FM10000 Ethernet Switch, from Alexander
Duyck.

11) Rearrange TCP_SKB_CB() to reduce cache line misses, from Eric
Dumazet.

12) Add Datacenter TCP congestion control algorithm support, From
Florian Westphal.

13) Reorganize sk_buff so that __copy_skb_header() is significantly
faster. From Eric Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1558 commits)
netlabel: directly return netlbl_unlabel_genl_init()
net: add netdev_txq_bql_{enqueue, complete}_prefetchw() helpers
net: description of dma_cookie cause make xmldocs warning
cxgb4: clean up a type issue
cxgb4: potential shift wrapping bug
i40e: skb->xmit_more support
net: fs_enet: Add NAPI TX
net: fs_enet: Remove non NAPI RX
r8169:add support for RTL8168EP
net_sched: copy exts->type in tcf_exts_change()
wimax: convert printk to pr_foo()
af_unix: remove 0 assignment on static
ipv6: Do not warn for informational ICMP messages, regardless of type.
Update Intel Ethernet Driver maintainers list
bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING
tipc: fix bug in multicast congestion handling
net: better IFF_XMIT_DST_RELEASE support
net/mlx4_en: remove NETDEV_TX_BUSY
3c59x: fix bad split of cpu_to_le32(pci_map_single())
net: bcmgenet: fix Tx ring priority programming
...

show more ...


Revision tags: v3.17
# 447a8b85 03-Oct-2014 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare first round of input updates for 3.18.


Revision tags: v3.17-rc7
# b4fc1a46 26-Sep-2014 David S. Miller <davem@davemloft.net>

Merge branch 'bpf-next'

Alexei Starovoitov says:

====================
eBPF syscall, verifier, testsuite

v14 -> v15:
- got rid of macros with hidden control flow (suggested by David)
replaced mac

Merge branch 'bpf-next'

Alexei Starovoitov says:

====================
eBPF syscall, verifier, testsuite

v14 -> v15:
- got rid of macros with hidden control flow (suggested by David)
replaced macro with explicit goto or return and simplified
where possible (affected patches #9 and #10)
- rebased, retested

v13 -> v14:
- small change to 1st patch to ease 'new userspace with old kernel'
problem (done similar to perf_copy_attr()) (suggested by Daniel)
- the rest unchanged

v12 -> v13:
- replaced 'foo __user *' pointers with __aligned_u64 (suggested by David)
- added __attribute__((aligned(8)) to 'union bpf_attr' to keep
constant alignment between patches
- updated manpage and syscall wrappers due to __aligned_u64
- rebased, retested on x64 with 32-bit and 64-bit userspace and on i386,
build tested on arm32,sparc64

v11 -> v12:
- dropped patch 11 and copied few macros to libbpf.h (suggested by Daniel)
- replaced 'enum bpf_prog_type' with u32 to be safe in compat (.. Andy)
- implemented and tested compat support (not part of this set) (.. Daniel)
- changed 'void *log_buf' to 'char *' (.. Daniel)
- combined struct bpf_work_struct and bpf_prog_info (.. Daniel)
- added better return value explanation to manpage (.. Andy)
- added log_buf/log_size explanation to manpage (.. Andy & Daniel)
- added a lot more info about prog_type and map_type to manpage (.. Andy)
- rebased, tweaked test_stubs

Patches 1-4 establish BPF syscall shell for maps and programs.
Patches 5-10 add verifier step by step
Patch 11 adds test stubs for 'unspec' program type and verifier testsuite
from user space

Note that patches 1,3,4,7 add commands and attributes to the syscall
while being backwards compatible from each other, which should demonstrate
how other commands can be added in the future.

After this set the programs can be loaded for testing only. They cannot
be attached to any events. Though manpage talks about tracing and sockets,
it will be a subject of future patches.

Please take a look at manpage:

BPF(2) Linux Programmer's Manual BPF(2)

NAME
bpf - perform a command on eBPF map or program

SYNOPSIS
#include <linux/bpf.h>

int bpf(int cmd, union bpf_attr *attr, unsigned int size);

DESCRIPTION
bpf() syscall is a multiplexor for a range of different operations on
eBPF which can be characterized as "universal in-kernel virtual
machine". eBPF is similar to original Berkeley Packet Filter (or
"classic BPF") used to filter network packets. Both statically analyze
the programs before loading them into the kernel to ensure that
programs cannot harm the running system.

eBPF extends classic BPF in multiple ways including ability to call in-
kernel helper functions and access shared data structures like eBPF
maps. The programs can be written in a restricted C that is compiled
into eBPF bytecode and executed on the eBPF virtual machine or JITed
into native instruction set.

eBPF Design/Architecture
eBPF maps is a generic storage of different types. User process can
create multiple maps (with key/value being opaque bytes of data) and
access them via file descriptor. In parallel eBPF programs can access
maps from inside the kernel. It's up to user process and eBPF program
to decide what they store inside maps.

eBPF programs are similar to kernel modules. They are loaded by the
user process and automatically unloaded when process exits. Each eBPF
program is a safe run-to-completion set of instructions. eBPF verifier
statically determines that the program terminates and is safe to
execute. During verification the program takes a hold of maps that it
intends to use, so selected maps cannot be removed until the program is
unloaded. The program can be attached to different events. These events
can be packets, tracepoint events and other types in the future. A new
event triggers execution of the program which may store information
about the event in the maps. Beyond storing data the programs may call
into in-kernel helper functions which may, for example, dump stack, do
trace_printk or other forms of live kernel debugging. The same program
can be attached to multiple events. Different programs can access the
same map:
tracepoint tracepoint tracepoint sk_buff sk_buff
event A event B event C on eth0 on eth1
| | | | |
| | | | |
--> tracing <-- tracing socket socket
prog_1 prog_2 prog_3 prog_4
| | | |
|--- -----| |-------| map_3
map_1 map_2

Syscall Arguments
bpf() syscall operation is determined by cmd which can be one of the
following:

BPF_MAP_CREATE
Create a map with given type and attributes and return map FD

BPF_MAP_LOOKUP_ELEM
Lookup element by key in a given map and return its value

BPF_MAP_UPDATE_ELEM
Create or update element (key/value pair) in a given map

BPF_MAP_DELETE_ELEM
Lookup and delete element by key in a given map

BPF_MAP_GET_NEXT_KEY
Lookup element by key in a given map and return key of next
element

BPF_PROG_LOAD
Verify and load eBPF program

attr is a pointer to a union of type bpf_attr as defined below.

size is the size of the union.

union bpf_attr {
struct { /* anonymous struct used by BPF_MAP_CREATE command */
__u32 map_type;
__u32 key_size; /* size of key in bytes */
__u32 value_size; /* size of value in bytes */
__u32 max_entries; /* max number of entries in a map */
};

struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
__u32 map_fd;
__aligned_u64 key;
union {
__aligned_u64 value;
__aligned_u64 next_key;
};
};

struct { /* anonymous struct used by BPF_PROG_LOAD command */
__u32 prog_type;
__u32 insn_cnt;
__aligned_u64 insns; /* 'const struct bpf_insn *' */
__aligned_u64 license; /* 'const char *' */
__u32 log_level; /* verbosity level of eBPF verifier */
__u32 log_size; /* size of user buffer */
__aligned_u64 log_buf; /* user supplied 'char *' buffer */
};
} __attribute__((aligned(8)));

eBPF maps
maps is a generic storage of different types for sharing data between
kernel and userspace.

Any map type has the following attributes:
. type
. max number of elements
. key size in bytes
. value size in bytes

The following wrapper functions demonstrate how this syscall can be
used to access the maps. The functions use the cmd argument to invoke
different operations.

BPF_MAP_CREATE
int bpf_create_map(enum bpf_map_type map_type, int key_size,
int value_size, int max_entries)
{
union bpf_attr attr = {
.map_type = map_type,
.key_size = key_size,
.value_size = value_size,
.max_entries = max_entries
};

return bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
}
bpf() syscall creates a map of map_type type and given
attributes key_size, value_size, max_entries. On success it
returns process-local file descriptor. On error, -1 is returned
and errno is set to EINVAL or EPERM or ENOMEM.

The attributes key_size and value_size will be used by verifier
during program loading to check that program is calling
bpf_map_*_elem() helper functions with correctly initialized key
and that program doesn't access map element value beyond
specified value_size. For example, when map is created with
key_size = 8 and program does:
bpf_map_lookup_elem(map_fd, fp - 4)
such program will be rejected, since in-kernel helper function
bpf_map_lookup_elem(map_fd, void *key) expects to read 8 bytes
from 'key' pointer, but 'fp - 4' starting address will cause out
of bounds stack access.

Similarly, when map is created with value_size = 1 and program
does:
value = bpf_map_lookup_elem(...);
*(u32 *)value = 1;
such program will be rejected, since it accesses value pointer
beyond specified 1 byte value_size limit.

Currently only hash table map_type is supported:
enum bpf_map_type {
BPF_MAP_TYPE_UNSPEC,
BPF_MAP_TYPE_HASH,
};
map_type selects one of the available map implementations in
kernel. For all map_types eBPF programs access maps with the
same bpf_map_lookup_elem()/bpf_map_update_elem() helper
functions.

BPF_MAP_LOOKUP_ELEM
int bpf_lookup_elem(int fd, void *key, void *value)
{
union bpf_attr attr = {
.map_fd = fd,
.key = ptr_to_u64(key),
.value = ptr_to_u64(value),
};

return bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
}
bpf() syscall looks up an element with given key in a map fd.
If element is found it returns zero and stores element's value
into value. If element is not found it returns -1 and sets
errno to ENOENT.

BPF_MAP_UPDATE_ELEM
int bpf_update_elem(int fd, void *key, void *value)
{
union bpf_attr attr = {
.map_fd = fd,
.key = ptr_to_u64(key),
.value = ptr_to_u64(value),
};

return bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
}
The call creates or updates element with given key/value in a
map fd. On success it returns zero. On error, -1 is returned
and errno is set to EINVAL or EPERM or ENOMEM or E2BIG. E2BIG
indicates that number of elements in the map reached max_entries
limit specified at map creation time.

BPF_MAP_DELETE_ELEM
int bpf_delete_elem(int fd, void *key)
{
union bpf_attr attr = {
.map_fd = fd,
.key = ptr_to_u64(key),
};

return bpf(BPF_MAP_DELETE_ELEM, &attr, sizeof(attr));
}
The call deletes an element in a map fd with given key. Returns
zero on success. If element is not found it returns -1 and sets
errno to ENOENT.

BPF_MAP_GET_NEXT_KEY
int bpf_get_next_key(int fd, void *key, void *next_key)
{
union bpf_attr attr = {
.map_fd = fd,
.key = ptr_to_u64(key),
.next_key = ptr_to_u64(next_key),
};

return bpf(BPF_MAP_GET_NEXT_KEY, &attr, sizeof(attr));
}
The call looks up an element by key in a given map fd and
returns key of the next element into next_key pointer. If key is
not found, it return zero and returns key of the first element
into next_key. If key is the last element, it returns -1 and
sets errno to ENOENT. Other possible errno values are ENOMEM,
EFAULT, EPERM, EINVAL. This method can be used to iterate over
all elements of the map.

close(map_fd)
will delete the map map_fd. Exiting process will delete all
maps automatically.

eBPF programs
BPF_PROG_LOAD
This cmd is used to load eBPF program into the kernel.

char bpf_log_buf[LOG_BUF_SIZE];

int bpf_prog_load(enum bpf_prog_type prog_type,
const struct bpf_insn *insns, int insn_cnt,
const char *license)
{
union bpf_attr attr = {
.prog_type = prog_type,
.insns = ptr_to_u64(insns),
.insn_cnt = insn_cnt,
.license = ptr_to_u64(license),
.log_buf = ptr_to_u64(bpf_log_buf),
.log_size = LOG_BUF_SIZE,
.log_level = 1,
};

return bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
}
prog_type is one of the available program types:
enum bpf_prog_type {
BPF_PROG_TYPE_UNSPEC,
BPF_PROG_TYPE_SOCKET,
BPF_PROG_TYPE_TRACING,
};
By picking prog_type program author selects a set of helper
functions callable from eBPF program and corresponding format of
struct bpf_context (which is the data blob passed into the
program as the first argument). For example, the programs
loaded with prog_type = TYPE_TRACING may call bpf_printk()
helper, whereas TYPE_SOCKET programs may not. The set of
functions available to the programs under given type may
increase in the future.

Currently the set of functions for TYPE_TRACING is:
bpf_map_lookup_elem(map_fd, void *key) // lookup key in a map_fd
bpf_map_update_elem(map_fd, void *key, void *value) // update key/value
bpf_map_delete_elem(map_fd, void *key) // delete key in a map_fd
bpf_ktime_get_ns(void) // returns current ktime
bpf_printk(char *fmt, int fmt_size, ...) // prints into trace buffer
bpf_memcmp(void *ptr1, void *ptr2, int size) // non-faulting memcmp
bpf_fetch_ptr(void *ptr) // non-faulting load pointer from any address
bpf_fetch_u8(void *ptr) // non-faulting 1 byte load
bpf_fetch_u16(void *ptr) // other non-faulting loads
bpf_fetch_u32(void *ptr)
bpf_fetch_u64(void *ptr)

and bpf_context is defined as:
struct bpf_context {
/* argN fields match one to one to arguments passed to trace events */
u64 arg1, arg2, arg3, arg4, arg5, arg6;
/* return value from kretprobe event or from syscall_exit event */
u64 ret;
};

The set of helper functions for TYPE_SOCKET is TBD.

More program types may be added in the future. Like
BPF_PROG_TYPE_USER_TRACING for unprivileged programs.

BPF_PROG_TYPE_UNSPEC is used for testing only. Such programs
cannot be attached to events.

insns array of "struct bpf_insn" instructions

insn_cnt number of instructions in the program

license license string, which must be GPL compatible to call
helper functions marked gpl_only

log_buf user supplied buffer that in-kernel verifier is using to
store verification log. Log is a multi-line string that should
be used by program author to understand how verifier came to
conclusion that program is unsafe. The format of the output can
change at any time as verifier evolves.

log_size size of user buffer. If size of the buffer is not large
enough to store all verifier messages, -1 is returned and errno
is set to ENOSPC.

log_level verbosity level of eBPF verifier, where zero means no
logs provided

close(prog_fd)
will unload eBPF program

The maps are accesible from programs and generally tie the two
together. Programs process various events (like tracepoint, kprobe,
packets) and store the data into maps. User space fetches data from
maps. Either the same or a different map may be used by user space as
configuration space to alter program behavior on the fly.

Events
Once an eBPF program is loaded, it can be attached to an event. Various
kernel subsystems have different ways to do so. For example:

setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd));
will attach the program prog_fd to socket sock which was received by
prior call to socket().

ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);
will attach the program prog_fd to perf event event_fd which was
received by prior call to perf_event_open().

Another way to attach the program to a tracing event is:
event_fd = open("/sys/kernel/debug/tracing/events/skb/kfree_skb/filter");
write(event_fd, "bpf-123"); /* where 123 is eBPF program FD */
/* here program is attached and will be triggered by events */
close(event_fd); /* to detach from event */

EXAMPLES
/* eBPF+sockets example:
* 1. create map with maximum of 2 elements
* 2. set map[6] = 0 and map[17] = 0
* 3. load eBPF program that counts number of TCP and UDP packets received
* via map[skb->ip->proto]++
* 4. attach prog_fd to raw socket via setsockopt()
* 5. print number of received TCP/UDP packets every second
*/
int main(int ac, char **av)
{
int sock, map_fd, prog_fd, key;
long long value = 0, tcp_cnt, udp_cnt;

map_fd = bpf_create_map(BPF_MAP_TYPE_HASH, sizeof(key), sizeof(value), 2);
if (map_fd < 0) {
printf("failed to create map '%s'\n", strerror(errno));
/* likely not run as root */
return 1;
}

key = 6; /* ip->proto == tcp */
assert(bpf_update_elem(map_fd, &key, &value) == 0);

key = 17; /* ip->proto == udp */
assert(bpf_update_elem(map_fd, &key, &value) == 0);

struct bpf_insn prog[] = {
BPF_MOV64_REG(BPF_REG_6, BPF_REG_1), /* r6 = r1 */
BPF_LD_ABS(BPF_B, 14 + 9), /* r0 = ip->proto */
BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -4),/* *(u32 *)(fp - 4) = r0 */
BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), /* r2 = fp */
BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4), /* r2 = r2 - 4 */
BPF_LD_MAP_FD(BPF_REG_1, map_fd), /* r1 = map_fd */
BPF_CALL_FUNC(BPF_FUNC_map_lookup_elem), /* r0 = map_lookup(r1, r2) */
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2), /* if (r0 == 0) goto pc+2 */
BPF_MOV64_IMM(BPF_REG_1, 1), /* r1 = 1 */
BPF_XADD(BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* lock *(u64 *)r0 += r1 */
BPF_MOV64_IMM(BPF_REG_0, 0), /* r0 = 0 */
BPF_EXIT_INSN(), /* return r0 */
};
prog_fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET, prog, sizeof(prog), "GPL");
assert(prog_fd >= 0);

sock = open_raw_sock("lo");

assert(setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd,
sizeof(prog_fd)) == 0);

for (;;) {
key = 6;
assert(bpf_lookup_elem(map_fd, &key, &tcp_cnt) == 0);
key = 17;
assert(bpf_lookup_elem(map_fd, &key, &udp_cnt) == 0);
printf("TCP %lld UDP %lld packets0, tcp_cnt, udp_cnt);
sleep(1);
}

return 0;
}

RETURN VALUE
For a successful call, the return value depends on the operation:

BPF_MAP_CREATE
The new file descriptor associated with eBPF map.

BPF_PROG_LOAD
The new file descriptor associated with eBPF program.

All other commands
Zero.

On error, -1 is returned, and errno is set appropriately.

ERRORS
EPERM bpf() syscall was made without sufficient privilege (without the
CAP_SYS_ADMIN capability).

ENOMEM Cannot allocate sufficient memory.

EBADF fd is not an open file descriptor

EFAULT One of the pointers ( key or value or log_buf or insns ) is
outside accessible address space.

EINVAL The value specified in cmd is not recognized by this kernel.

EINVAL For BPF_MAP_CREATE, either map_type or attributes are invalid.

EINVAL For BPF_MAP_*_ELEM commands, some of the fields of "union
bpf_attr" unused by this command are not set to zero.

EINVAL For BPF_PROG_LOAD, attempt to load invalid program (unrecognized
instruction or uses reserved fields or jumps out of range or
loop detected or calls unknown function).

EACCES For BPF_PROG_LOAD, though program has valid instructions, it was
rejected, since it was deemed unsafe (may access disallowed
memory region or uninitialized stack/register or function
constraints don't match actual types or misaligned access). In
such case it is recommended to call bpf() again with log_level =
1 and examine log_buf for specific reason provided by verifier.

ENOENT For BPF_MAP_LOOKUP_ELEM or BPF_MAP_DELETE_ELEM, indicates that
element with given key was not found.

E2BIG program is too large or a map reached max_entries limit (max
number of elements).

NOTES
These commands may be used only by a privileged process (one having the
CAP_SYS_ADMIN capability).

SEE ALSO
eBPF architecture and instruction set is explained in
Documentation/networking/filter.txt

Linux 2014-09-16 BPF(2)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 3c731eba 26-Sep-2014 Alexei Starovoitov <ast@plumgrid.com>

bpf: mini eBPF library, test stubs and verifier testsuite

1.
the library includes a trivial set of BPF syscall wrappers:
int bpf_create_map(int key_size, int value_size, int max_entries);
int bpf_up

bpf: mini eBPF library, test stubs and verifier testsuite

1.
the library includes a trivial set of BPF syscall wrappers:
int bpf_create_map(int key_size, int value_size, int max_entries);
int bpf_update_elem(int fd, void *key, void *value);
int bpf_lookup_elem(int fd, void *key, void *value);
int bpf_delete_elem(int fd, void *key);
int bpf_get_next_key(int fd, void *key, void *next_key);
int bpf_prog_load(enum bpf_prog_type prog_type,
const struct sock_filter_int *insns, int insn_len,
const char *license);
bpf_prog_load() stores verifier log into global bpf_log_buf[] array

and BPF_*() macros to build instructions

2.
test stubs configure eBPF infra with 'unspec' map and program types.
These are fake types used by user space testsuite only.

3.
verifier tests valid and invalid programs and expects predefined
error log messages from kernel.
40 tests so far.

$ sudo ./test_verifier
#0 add+sub+mul OK
#1 unreachable OK
#2 unreachable2 OK
#3 out of range jump OK
#4 out of range jump2 OK
#5 test1 ld_imm64 OK
...

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 51580e79 26-Sep-2014 Alexei Starovoitov <ast@plumgrid.com>

bpf: verifier (add docs)

this patch adds all of eBPF verfier documentation and empty bpf_check()

The end goal for the verifier is to statically check safety of the program.

Verifier will catch:
-

bpf: verifier (add docs)

this patch adds all of eBPF verfier documentation and empty bpf_check()

The end goal for the verifier is to statically check safety of the program.

Verifier will catch:
- loops
- out of range jumps
- unreachable instructions
- invalid instructions
- uninitialized register access
- uninitialized stack access
- misaligned stack access
- out of range stack access
- invalid calling convention

More details in Documentation/networking/filter.txt

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 99c55f7d 26-Sep-2014 Alexei Starovoitov <ast@plumgrid.com>

bpf: introduce BPF syscall and maps

BPF syscall is a multiplexor for a range of different operations on eBPF.
This patch introduces syscall with single command to create a map.
Next patch adds comma

bpf: introduce BPF syscall and maps

BPF syscall is a multiplexor for a range of different operations on eBPF.
This patch introduces syscall with single command to create a map.
Next patch adds commands to access maps.

'maps' is a generic storage of different types for sharing data between kernel
and userspace.

Userspace example:
/* this syscall wrapper creates a map with given type and attributes
* and returns map_fd on success.
* use close(map_fd) to delete the map
*/
int bpf_create_map(enum bpf_map_type map_type, int key_size,
int value_size, int max_entries)
{
union bpf_attr attr = {
.map_type = map_type,
.key_size = key_size,
.value_size = value_size,
.max_entries = max_entries
};

return bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
}

'union bpf_attr' is backwards compatible with future extensions.

More details in Documentation/networking/filter.txt and in manpage

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v3.17-rc6, v3.17-rc5
# db985cbd 14-Sep-2014 Thomas Gleixner <tglx@linutronix.de>

Merge tag 'irqchip-core-3.18' of git://git.infradead.org/users/jcooper/linux into irq/core

irqchip core changes for v3.18

- renesas: suspend to RAM, runtime PM, cleanups and DT binding docs
- key

Merge tag 'irqchip-core-3.18' of git://git.infradead.org/users/jcooper/linux into irq/core

irqchip core changes for v3.18

- renesas: suspend to RAM, runtime PM, cleanups and DT binding docs
- keystone: add new driver
- hip04: add Hisilicon HiP04 driver (without touching irq-gic.c)
- gic: Use defines instead of magic number, preserve v2 bybass bits
- handle_domain_irq: common low level interrupt entry handler

show more ...


# 336879b1 11-Sep-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

Merge remote-tracking branch 'airlied/drm-next' into topic/vblank-rework

Dave asked me to do the backmerge before sending him the revised pull
request, so here we go. Nothing fancy in the conflicts,

Merge remote-tracking branch 'airlied/drm-next' into topic/vblank-rework

Dave asked me to do the backmerge before sending him the revised pull
request, so here we go. Nothing fancy in the conflicts, just a few
things changed right next to each another.

Conflicts:
drivers/gpu/drm/drm_irq.c

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

show more ...


# e2627dce 08-Sep-2014 Ingo Molnar <mingo@kernel.org>

Merge tag 'v3.17-rc4' into sched/core, to prevent conflicts with upcoming patches, and to refresh the tree

Linux 3.17-rc4


Revision tags: v3.17-rc4
# 516d5f8b 03-Sep-2014 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v3.17-rc3' into next

Sync with mainline to bring in Chrome EC changes.


# 05244d16 03-Sep-2014 Takashi Iwai <tiwai@suse.de>

Merge tag 'asoc-v3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v3.17

A few more driver specific fixes on top of the currently pending fixes

Merge tag 'asoc-v3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v3.17

A few more driver specific fixes on top of the currently pending fixes
(which are already in your tree but not Linus').

show more ...


Revision tags: v3.17-rc3, v3.17-rc2
# 83bc90e1 24-Aug-2014 Ingo Molnar <mingo@kernel.org>

Merge branch 'linus' into perf/core, to fix conflicts

Conflicts:
arch/x86/kernel/cpu/perf_event_intel_uncore*.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 80b304fd 22-Aug-2014 Ingo Molnar <mingo@kernel.org>

Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent

Pull EFI fixes from Matt Fleming:

* WARN_ON(!spin_is_locked()) always triggers on non-SMP machi

Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent

Pull EFI fixes from Matt Fleming:

* WARN_ON(!spin_is_locked()) always triggers on non-SMP machines.
Swap it for the more canonical lockdep_assert_held() which always
does the right thing - Guenter Roeck

* Assign the correct value to efi.runtime_version on arm64 so that all
the runtime services can be invoked - Semen Protsenko

Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


# 2871f352 21-Aug-2014 Mauro Carvalho Chehab <m.chehab@samsung.com>

Merge tag 'v3.17-rc1' into patchwork

Linux 3.17-rc1


# 5b49ab3e 19-Aug-2014 Brian Norris <computersforpeace@gmail.com>

Merge l2-mtd/next into l2-mtd/master


Revision tags: v3.17-rc1
# ae045e24 06-Aug-2014 Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:
"Highlights:

1) Steady transitioning of the BPF instructure to a generic spot so

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:
"Highlights:

1) Steady transitioning of the BPF instructure to a generic spot so
all kernel subsystems can make use of it, from Alexei Starovoitov.

2) SFC driver supports busy polling, from Alexandre Rames.

3) Take advantage of hash table in UDP multicast delivery, from David
Held.

4) Lighten locking, in particular by getting rid of the LRU lists, in
inet frag handling. From Florian Westphal.

5) Add support for various RFC6458 control messages in SCTP, from
Geir Ola Vaagland.

6) Allow to filter bridge forwarding database dumps by device, from
Jamal Hadi Salim.

7) virtio-net also now supports busy polling, from Jason Wang.

8) Some low level optimization tweaks in pktgen from Jesper Dangaard
Brouer.

9) Add support for ipv6 address generation modes, so that userland
can have some input into the process. From Jiri Pirko.

10) Consolidate common TCP connection request code in ipv4 and ipv6,
from Octavian Purdila.

11) New ARP packet logger in netfilter, from Pablo Neira Ayuso.

12) Generic resizable RCU hash table, with intial users in netlink and
nftables. From Thomas Graf.

13) Maintain a name assignment type so that userspace can see where a
network device name came from (enumerated by kernel, assigned
explicitly by userspace, etc.) From Tom Gundersen.

14) Automatic flow label generation on transmit in ipv6, from Tom
Herbert.

15) New packet timestamping facilities from Willem de Bruijn, meant to
assist in measuring latencies going into/out-of the packet
scheduler, latency from TCP data transmission to ACK, etc"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1536 commits)
cxgb4 : Disable recursive mailbox commands when enabling vi
net: reduce USB network driver config options.
tg3: Modify tg3_tso_bug() to handle multiple TX rings
amd-xgbe: Perform phy connect/disconnect at dev open/stop
amd-xgbe: Use dma_set_mask_and_coherent to set DMA mask
net: sun4i-emac: fix memory leak on bad packet
sctp: fix possible seqlock seadlock in sctp_packet_transmit()
Revert "net: phy: Set the driver when registering an MDIO bus device"
cxgb4vf: Turn off SGE RX/TX Callback Timers and interrupts in PCI shutdown routine
team: Simplify return path of team_newlink
bridge: Update outdated comment on promiscuous mode
net-timestamp: ACK timestamp for bytestreams
net-timestamp: TCP timestamping
net-timestamp: SCHED timestamp on entering packet scheduler
net-timestamp: add key to disambiguate concurrent datagrams
net-timestamp: move timestamp flags out of sk_flags
net-timestamp: extend SCM_TIMESTAMPING ancillary data struct
cxgb4i : Move stray CPL definitions to cxgb4 driver
tcp: reduce spurious retransmits due to transient SACK reneging
qlcnic: Initialize dcbnl_ops before register_netdev
...

show more ...


Revision tags: v3.16, v3.16-rc7
# f6e67532 23-Jul-2014 David S. Miller <davem@davemloft.net>

Merge branch 'filter-move'

Alexei Starovoitov says:

====================
I believe my recent set of RFC/patches [1] provided good visibility on where
I would like to take eBPF subsystem. These two

Merge branch 'filter-move'

Alexei Starovoitov says:

====================
I believe my recent set of RFC/patches [1] provided good visibility on where
I would like to take eBPF subsystem. These two trivial patches is a first step
in that direction:
patch 1 - mechanical split of eBPF interpreter out of filter.c
patch 2 - nominate myself as a maintainer for eBPF core pieces
In the foreseeable future eBPF patches will be going through net-next,
so put netdev as a primary mailing list

[1] git://git.kernel.org/pub/scm/linux/kernel/git/ast/bpf master
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# f5bffecd 23-Jul-2014 Alexei Starovoitov <ast@plumgrid.com>

net: filter: split filter.c into two files

BPF is used in several kernel components. This split creates logical boundary
between generic eBPF core and the rest

kernel/bpf/core.c: eBPF interpreter

net: filter: split filter.c into two files

BPF is used in several kernel components. This split creates logical boundary
between generic eBPF core and the rest

kernel/bpf/core.c: eBPF interpreter

net/core/filter.c: classic->eBPF converter, classic verifiers, socket filters

This patch only moves functions.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


1...<<212223