History log of /openbmc/linux/net/xdp/Makefile (Results 51 – 75 of 76)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v5.0.4
# 249acb5f 19-Mar-2019 Mark Brown <broonie@kernel.org>

Merge tag 'v5.1-rc1' into spi-5.2

Linux 5.1-rc1


# c9e48084 19-Mar-2019 Mark Brown <broonie@kernel.org>

Merge tag 'v5.1-rc1' into regulator-5.2

Linux 5.1-rc1


Revision tags: v5.0.3
# 12747059 18-Mar-2019 Mark Brown <broonie@kernel.org>

Merge tag 'v5.1-rc1' into asoc-5.2

Linux 5.1-rc1


# c3bdd5e6 22-Apr-2019 Saeed Mahameed <saeedm@mellanox.com>

Merge tag 'v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into mlx5-next

Linux 5.1-rc1

We forgot to reset the branch last merge window thus mlx5-next is outdated
and stil

Merge tag 'v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into mlx5-next

Linux 5.1-rc1

We forgot to reset the branch last merge window thus mlx5-next is outdated
and still based on 5.0-rc2. This merge commit is needed to sync mlx5-next
branch with 5.1-rc1.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

show more ...


# 9b0dcd0e 11-Apr-2019 Takashi Iwai <tiwai@suse.de>

Merge tag 'asoc-fix-v5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.1

A few core fixes along with the driver specific ones, mainly fixing

Merge tag 'asoc-fix-v5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.1

A few core fixes along with the driver specific ones, mainly fixing
small issues that only affect x86 platforms for various reasons (their
unusual machine enumeration mechanisms mainly, plus a fix for error
handling in topology).

There's some of the driver fixes that look larger than they are, like
the hdmi-codec changes which resulted in an indentation change, and most
of the other large changes are for new drivers like the STM32 changes.

show more ...


# 22d91ed3 18-Mar-2019 Mark Brown <broonie@kernel.org>

Merge tag 'v5.1-rc1' into asoc-5.1

Linux 5.1-rc1


Revision tags: v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27
# 64563003 05-Mar-2019 Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:
"Here we go, another merge window full of networking and #ebpf changes:

1) Snoop DH

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:
"Here we go, another merge window full of networking and #ebpf changes:

1) Snoop DHCPACKS in batman-adv to learn MAC/IP pairs in the DHCP
range without dealing with floods of ARP traffic, from Linus
Lüssing.

2) Throttle buffered multicast packet transmission in mt76, from
Felix Fietkau.

3) Support adaptive interrupt moderation in ice, from Brett Creeley.

4) A lot of struct_size conversions, from Gustavo A. R. Silva.

5) Add peek/push/pop commands to bpftool, as well as bash completion,
from Stanislav Fomichev.

6) Optimize sk_msg_clone(), from Vakul Garg.

7) Add SO_BINDTOIFINDEX, from David Herrmann.

8) Be more conservative with local resends due to local congestion,
from Yuchung Cheng.

9) Allow vetoing of unsupported VXLAN FDBs, from Petr Machata.

10) Add health buffer support to devlink, from Eran Ben Elisha.

11) Add TXQ scheduling API to mac80211, from Toke Høiland-Jørgensen.

12) Add statistics to basic packet scheduler filter, from Cong Wang.

13) Add GRE tunnel support for mlxsw Spectrum-2, from Nir Dotan.

14) Lots of new IP tunneling forwarding tests, also from Nir Dotan.

15) Add 3ad stats to bonding, from Nikolay Aleksandrov.

16) Lots of probing improvements for bpftool, from Quentin Monnet.

17) Various nfp drive #ebpf JIT improvements from Jakub Kicinski.

18) Allow #ebpf programs to access gso_segs from skb shared info, from
Eric Dumazet.

19) Add sock_diag support for AF_XDP sockets, from Björn Töpel.

20) Support 22260 iwlwifi devices, from Luca Coelho.

21) Use rbtree for ipv6 defragmentation, from Peter Oskolkov.

22) Add JMP32 instruction class support to #ebpf, from Jiong Wang.

23) Add spinlock support to #ebpf, from Alexei Starovoitov.

24) Support 256-bit keys and TLS 1.3 in ktls, from Dave Watson.

25) Add device infomation API to devlink, from Jakub Kicinski.

26) Add new timestamping socket options which are y2038 safe, from
Deepa Dinamani.

27) Add RX checksum offloading for various sh_eth chips, from Sergei
Shtylyov.

28) Flow offload infrastructure, from Pablo Neira Ayuso.

29) Numerous cleanups, improvements, and bug fixes to the PHY layer
and many drivers from Heiner Kallweit.

30) Lots of changes to try and make packet scheduler classifiers run
lockless as much as possible, from Vlad Buslov.

31) Support BCM957504 chip in bnxt_en driver, from Erik Burrows.

32) Add concurrency tests to tc-tests infrastructure, from Vlad
Buslov.

33) Add hwmon support to aquantia, from Heiner Kallweit.

34) Allow 64-bit values for SO_MAX_PACING_RATE, from Eric Dumazet.

And I would be remiss if I didn't thank the various major networking
subsystem maintainers for integrating much of this work before I even
saw it. Alexei Starovoitov, Daniel Borkmann, Pablo Neira Ayuso,
Johannes Berg, Kalle Valo, and many others. Thank you!"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2207 commits)
net/sched: avoid unused-label warning
net: ignore sysctl_devconf_inherit_init_net without SYSCTL
phy: mdio-mux: fix Kconfig dependencies
net: phy: use phy_modify_mmd_changed in genphy_c45_an_config_aneg
net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework
selftest/net: Remove duplicate header
sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
net/mlx5e: Update tx reporter status in case channels were successfully opened
devlink: Add support for direct reporter health state update
devlink: Update reporter state to error even if recover aborted
sctp: call iov_iter_revert() after sending ABORT
team: Free BPF filter when unregistering netdev
ip6mr: Do not call __IP6_INC_STATS() from preemptible context
isdn: mISDN: Fix potential NULL pointer dereference of kzalloc
net: dsa: mv88e6xxx: support in-band signalling on SGMII ports with external PHYs
cxgb4/chtls: Prefix adapter flags with CXGB4
net-sysfs: Switch to bitmap_zalloc()
mellanox: Switch to bitmap_zalloc()
bpf: add test cases for non-pointer sanitiation logic
mlxsw: i2c: Extend initialization by querying resources data
...

show more ...


Revision tags: v5.0, v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22, v4.19.21, v4.19.20
# 752cfee9 01-Feb-2019 Johannes Berg <johannes.berg@intel.com>

Merge remote-tracking branch 'net-next/master' into mac80211-next

Merge net-next so that we get the changes from net, which would
otherwise conflict with the NLA_POLICY_NESTED/_ARRAY changes.

Signe

Merge remote-tracking branch 'net-next/master' into mac80211-next

Merge net-next so that we get the changes from net, which would
otherwise conflict with the NLA_POLICY_NESTED/_ARRAY changes.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

show more ...


Revision tags: v4.19.19
# ec7146db 28-Jan-2019 David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2019-01-29

The following pull-request contains BPF updates for yo

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2019-01-29

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Teach verifier dead code removal, this also allows for optimizing /
removing conditional branches around dead code and to shrink the
resulting image. Code store constrained architectures like nfp would
have hard time doing this at JIT level, from Jakub.

2) Add JMP32 instructions to BPF ISA in order to allow for optimizing
code generation for 32-bit sub-registers. Evaluation shows that this
can result in code reduction of ~5-20% compared to 64 bit-only code
generation. Also add implementation for most JITs, from Jiong.

3) Add support for __int128 types in BTF which is also needed for
vmlinux's BTF conversion to work, from Yonghong.

4) Add a new command to bpftool in order to dump a list of BPF-related
parameters from the system or for a specific network device e.g. in
terms of available prog/map types or helper functions, from Quentin.

5) Add AF_XDP sock_diag interface for querying sockets from user
space which provides information about the RX/TX/fill/completion
rings, umem, memory usage etc, from Björn.

6) Add skb context access for skb_shared_info->gso_segs field, from Eric.

7) Add support for testing flow dissector BPF programs by extending
existing BPF_PROG_TEST_RUN infrastructure, from Stanislav.

8) Split BPF kselftest's test_verifier into various subgroups of tests
in order better deal with merge conflicts in this area, from Jakub.

9) Add support for queue/stack manipulations in bpftool, from Stanislav.

10) Document BTF, from Yonghong.

11) Dump supported ELF section names in libbpf on program load
failure, from Taeung.

12) Silence a false positive compiler warning in verifier's BTF
handling, from Peter.

13) Fix help string in bpftool's feature probing, from Prashant.

14) Remove duplicate includes in BPF kselftests, from Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v4.19.18
# dbbd79ae 24-Jan-2019 Daniel Borkmann <daniel@iogearbox.net>

Merge branch 'af-xdp-sock-diag'

Björn Töpel says:

====================
This series adds an AF_XDP sock_diag interface for querying sockets
from user-space. Tools like iproute2 ss(8) can use this in

Merge branch 'af-xdp-sock-diag'

Björn Töpel says:

====================
This series adds an AF_XDP sock_diag interface for querying sockets
from user-space. Tools like iproute2 ss(8) can use this interface to
list open AF_XDP sockets.

The diagnostic provides information about the Rx/Tx/fill/completetion
rings, umem, memory usage and such. For a complete list, please refer
to the xsk_diag.c file.

The AF_XDP sock_diag interface is optional, and can be built as a
module. A separate patch series, adding ss(8) iproute2 support, will
follow.

v1->v2:
* Removed extra newline
* Zero-out all user-space facing structures prior setting the
members
* Added explicit "pad" member in _msg struct
* Removed unused variable "req" in xsk_diag_handler_dump()

Thanks to Daniel for reviewing the series!
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# a36b38aa 24-Jan-2019 Björn Töpel <bjorn.topel@intel.com>

xsk: add sock_diag interface for AF_XDP

This patch adds the sock_diag interface for querying sockets from user
space. Tools like iproute2 ss(8) can use this interface to list open
AF_XDP sockets.

T

xsk: add sock_diag interface for AF_XDP

This patch adds the sock_diag interface for querying sockets from user
space. Tools like iproute2 ss(8) can use this interface to list open
AF_XDP sockets.

The user-space ABI is defined in linux/xdp_diag.h and includes netlink
request and response structs. The request can query sockets and the
response contains socket information about the rings, umems, inode and
more.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


Revision tags: v4.19.17, v4.19.16
# 3eb66e91 14-Jan-2019 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v4.20' into for-linus

Sync with mainline to get linux/overflow.h among other things.


# 4116941b 14-Jan-2019 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v4.20' into next

Merge with mainline to bring in the new APIs.


Revision tags: v4.19.15, v4.19.14, v4.19.13, v4.19.12, v4.19.11, v4.19.10, v4.19.9, v4.19.8, v4.19.7, v4.19.6, v4.19.5, v4.19.4, v4.18.20, v4.19.3, v4.18.19, v4.19.2, v4.18.18, v4.18.17, v4.19.1, v4.19, v4.18.16, v4.18.15, v4.18.14, v4.18.13, v4.18.12, v4.18.11, v4.18.10, v4.18.9, v4.18.7, v4.18.6, v4.18.5, v4.17.18, v4.18.4, v4.18.3, v4.17.17, v4.18.2, v4.17.16, v4.17.15, v4.18.1, v4.18, v4.17.14, v4.17.13, v4.17.12, v4.17.11, v4.17.10
# c74a7469 23-Jul-2018 Rodrigo Vivi <rodrigo.vivi@intel.com>

Merge drm/drm-next into drm-intel-next-queued

We need a backmerge to get DP_DPCD_REV_14 before we push other
i915 changes to dinq that could break compilation.

Signed-off-by: Rodrigo Vivi <rodrigo.

Merge drm/drm-next into drm-intel-next-queued

We need a backmerge to get DP_DPCD_REV_14 before we push other
i915 changes to dinq that could break compilation.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

show more ...


Revision tags: v4.17.9, v4.17.8, v4.17.7, v4.17.6, v4.17.5, v4.17.4
# 6b16f5d1 28-Jun-2018 Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>

Merge tag 'v4.18-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into fbdev-for-next

Linux 4.18-rc2


Revision tags: v4.17.3
# 57b54d74 25-Jun-2018 James Morris <james.morris@microsoft.com>

Merge tag 'v4.18-rc2' into next-general

Merge to Linux 4.18-rc2 for security subsystem developers.


# 7731b8bc 22-Jun-2018 Thomas Gleixner <tglx@linutronix.de>

Merge branch 'linus' into x86/urgent

Required to queue a dependent fix.


Revision tags: v4.17.2, v4.17.1
# 1c8c5a9d 06-Jun-2018 Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:

1) Add Maglev hashing scheduler to IPVS, from Inju Song.

2) Lots of new TC subsystem

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:

1) Add Maglev hashing scheduler to IPVS, from Inju Song.

2) Lots of new TC subsystem tests from Roman Mashak.

3) Add TCP zero copy receive and fix delayed acks and autotuning with
SO_RCVLOWAT, from Eric Dumazet.

4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard
Brouer.

5) Add ttl inherit support to vxlan, from Hangbin Liu.

6) Properly separate ipv6 routes into their logically independant
components. fib6_info for the routing table, and fib6_nh for sets of
nexthops, which thus can be shared. From David Ahern.

7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP
messages from XDP programs. From Nikita V. Shirokov.

8) Lots of long overdue cleanups to the r8169 driver, from Heiner
Kallweit.

9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.

10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.

11) Plumb extack down into fib_rules, from Roopa Prabhu.

12) Add Flower classifier offload support to igb, from Vinicius Costa
Gomes.

13) Add UDP GSO support, from Willem de Bruijn.

14) Add documentation for eBPF helpers, from Quentin Monnet.

15) Add TLS tx offload to mlx5, from Ilya Lesokhin.

16) Allow applications to be given the number of bytes available to read
on a socket via a control message returned from recvmsg(), from
Soheil Hassas Yeganeh.

17) Add x86_32 eBPF JIT compiler, from Wang YanQing.

18) Add AF_XDP sockets, with zerocopy support infrastructure as well.
From Björn Töpel.

19) Remove indirect load support from all of the BPF JITs and handle
these operations in the verifier by translating them into native BPF
instead. From Daniel Borkmann.

20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.

21) Allow XDP programs to do lookups in the main kernel routing tables
for forwarding. From David Ahern.

22) Allow drivers to store hardware state into an ELF section of kernel
dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.

23) Various RACK and loss detection improvements in TCP, from Yuchung
Cheng.

24) Add TCP SACK compression, from Eric Dumazet.

25) Add User Mode Helper support and basic bpfilter infrastructure, from
Alexei Starovoitov.

26) Support ports and protocol values in RTM_GETROUTE, from Roopa
Prabhu.

27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard
Brouer.

28) Add lots of forwarding selftests, from Petr Machata.

29) Add generic network device failover driver, from Sridhar Samudrala.

* ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
strparser: Add __strp_unpause and use it in ktls.
rxrpc: Fix terminal retransmission connection ID to include the channel
net: hns3: Optimize PF CMDQ interrupt switching process
net: hns3: Fix for VF mailbox receiving unknown message
net: hns3: Fix for VF mailbox cannot receiving PF response
bnx2x: use the right constant
Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
enic: fix UDP rss bits
netdev-FAQ: clarify DaveM's position for stable backports
rtnetlink: validate attributes in do_setlink()
mlxsw: Add extack messages for port_{un, }split failures
netdevsim: Add extack error message for devlink reload
devlink: Add extack to reload and port_{un, }split operations
net: metrics: add proper netlink validation
ipmr: fix error path when ipmr_new_table fails
ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
net: hns3: remove unused hclgevf_cfg_func_mta_filter
netfilter: provide udp*_lib_lookup for nf_tproxy
qed*: Utilize FW 8.37.2.0
...

show more ...


Revision tags: v4.17
# 90fed9c9 24-May-2018 David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-05-24

The following pull-request contains BPF updates for

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-05-24

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Björn Töpel cleans up AF_XDP (removes rebind, explicit cache alignment from uapi, etc).

2) David Ahern adds mtu checks to bpf_ipv{4,6}_fib_lookup() helpers.

3) Jesper Dangaard Brouer adds bulking support to ndo_xdp_xmit.

4) Jiong Wang adds support for indirect and arithmetic shifts to NFP

5) Martin KaFai Lau cleans up BTF uapi and makes the btf_header extensible.

6) Mathieu Xhonneux adds an End.BPF action to seg6local with BPF helpers allowing
to edit/grow/shrink a SRH and apply on a packet generic SRv6 actions.

7) Sandipan Das adds support for bpf2bpf function calls in ppc64 JIT.

8) Yonghong Song adds BPF_TASK_FD_QUERY command for introspection of tracing events.

9) other misc fixes from Gustavo A. R. Silva, Sirio Balmelli, John Fastabend, and Magnus Karlsson
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# dd8070bf 23-May-2018 Johannes Berg <johannes.berg@intel.com>

Merge remote-tracking branch 'net-next/master' into mac80211-next

Bring in net-next which had pulled in net, so I have the changes
from mac80211 and can apply a patch that would otherwise conflict.

Merge remote-tracking branch 'net-next/master' into mac80211-next

Bring in net-next which had pulled in net, so I have the changes
from mac80211 and can apply a patch that would otherwise conflict.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

show more ...


# 82f9e2d5 18-May-2018 Daniel Borkmann <daniel@iogearbox.net>

Merge branch 'bpf-af-xdp-cleanups'

Björn Töpel says:

====================
This series contain "cosmetics only" follow-up patches for AF_XDP.

Thanks to Daniel for suggesting them!
=================

Merge branch 'bpf-af-xdp-cleanups'

Björn Töpel says:

====================
This series contain "cosmetics only" follow-up patches for AF_XDP.

Thanks to Daniel for suggesting them!
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 54b85c27 18-May-2018 Björn Töpel <bjorn.topel@intel.com>

xsk: remove newline at end of file

Minor cleanup, remove newline at end of Makefile.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 01adc485 07-May-2018 David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Minor conflict, a CHECK was placed into an if() statement
in net-next, whilst a newline was added to that CHECK
call in 'net'. Thank

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Minor conflict, a CHECK was placed into an if() statement
in net-next, whilst a newline was added to that CHECK
call in 'net'. Thanks to Daniel for the merge resolution.

Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 08dbc7a6 03-May-2018 Alexei Starovoitov <ast@kernel.org>

Merge branch 'AF_XDP-initial-support'

Björn Töpel says:

====================
This patch set introduces a new address family called AF_XDP that is
optimized for high performance packet processing an

Merge branch 'AF_XDP-initial-support'

Björn Töpel says:

====================
This patch set introduces a new address family called AF_XDP that is
optimized for high performance packet processing and, in upcoming
patch sets, zero-copy semantics. In this patch set, we have removed
all zero-copy related code in order to make it smaller, simpler and
hopefully more review friendly. This patch set only supports copy-mode
for the generic XDP path (XDP_SKB) for both RX and TX and copy-mode
for RX using the XDP_DRV path. Zero-copy support requires XDP and
driver changes that Jesper Dangaard Brouer is working on. Some of his
work has already been accepted. We will publish our zero-copy support
for RX and TX on top of his patch sets at a later point in time.

An AF_XDP socket (XSK) is created with the normal socket()
syscall. Associated with each XSK are two queues: the RX queue and the
TX queue. A socket can receive packets on the RX queue and it can send
packets on the TX queue. These queues are registered and sized with
the setsockopts XDP_RX_RING and XDP_TX_RING, respectively. It is
mandatory to have at least one of these queues for each socket. In
contrast to AF_PACKET V2/V3 these descriptor queues are separated from
packet buffers. An RX or TX descriptor points to a data buffer in a
memory area called a UMEM. RX and TX can share the same UMEM so that a
packet does not have to be copied between RX and TX. Moreover, if a
packet needs to be kept for a while due to a possible retransmit, the
descriptor that points to that packet can be changed to point to
another and reused right away. This again avoids copying data.

This new dedicated packet buffer area is call a UMEM. It consists of a
number of equally size frames and each frame has a unique frame id. A
descriptor in one of the queues references a frame by referencing its
frame id. The user space allocates memory for this UMEM using whatever
means it feels is most appropriate (malloc, mmap, huge pages,
etc). This memory area is then registered with the kernel using the new
setsockopt XDP_UMEM_REG. The UMEM also has two queues: the FILL queue
and the COMPLETION queue. The fill queue is used by the application to
send down frame ids for the kernel to fill in with RX packet
data. References to these frames will then appear in the RX queue of
the XSK once they have been received. The completion queue, on the
other hand, contains frame ids that the kernel has transmitted
completely and can now be used again by user space, for either TX or
RX. Thus, the frame ids appearing in the completion queue are ids that
were previously transmitted using the TX queue. In summary, the RX and
FILL queues are used for the RX path and the TX and COMPLETION queues
are used for the TX path.

The socket is then finally bound with a bind() call to a device and a
specific queue id on that device, and it is not until bind is
completed that traffic starts to flow. Note that in this patch set,
all packet data is copied out to user-space.

A new feature in this patch set is that the UMEM can be shared between
processes, if desired. If a process wants to do this, it simply skips
the registration of the UMEM and its corresponding two queues, sets a
flag in the bind call and submits the XSK of the process it would like
to share UMEM with as well as its own newly created XSK socket. The
new process will then receive frame id references in its own RX queue
that point to this shared UMEM. Note that since the queue structures
are single-consumer / single-producer (for performance reasons), the
new process has to create its own socket with associated RX and TX
queues, since it cannot share this with the other process. This is
also the reason that there is only one set of FILL and COMPLETION
queues per UMEM. It is the responsibility of a single process to
handle the UMEM. If multiple-producer / multiple-consumer queues are
implemented in the future, this requirement could be relaxed.

How is then packets distributed between these two XSK? We have
introduced a new BPF map called XSKMAP (or BPF_MAP_TYPE_XSKMAP in
full). The user-space application can place an XSK at an arbitrary
place in this map. The XDP program can then redirect a packet to a
specific index in this map and at this point XDP validates that the
XSK in that map was indeed bound to that device and queue number. If
not, the packet is dropped. If the map is empty at that index, the
packet is also dropped. This also means that it is currently mandatory
to have an XDP program loaded (and one XSK in the XSKMAP) to be able
to get any traffic to user space through the XSK.

AF_XDP can operate in two different modes: XDP_SKB and XDP_DRV. If the
driver does not have support for XDP, or XDP_SKB is explicitly chosen
when loading the XDP program, XDP_SKB mode is employed that uses SKBs
together with the generic XDP support and copies out the data to user
space. A fallback mode that works for any network device. On the other
hand, if the driver has support for XDP, it will be used by the AF_XDP
code to provide better performance, but there is still a copy of the
data into user space.

There is a xdpsock benchmarking/test application included that
demonstrates how to use AF_XDP sockets with both private and shared
UMEMs. Say that you would like your UDP traffic from port 4242 to end
up in queue 16, that we will enable AF_XDP on. Here, we use ethtool
for this:

ethtool -N p3p2 rx-flow-hash udp4 fn
ethtool -N p3p2 flow-type udp4 src-port 4242 dst-port 4242 \
action 16

Running the rxdrop benchmark in XDP_DRV mode can then be done
using:

samples/bpf/xdpsock -i p3p2 -q 16 -r -N

For XDP_SKB mode, use the switch "-S" instead of "-N" and all options
can be displayed with "-h", as usual.

We have run some benchmarks on a dual socket system with two Broadwell
E5 2660 @ 2.0 GHz with hyperthreading turned off. Each socket has 14
cores which gives a total of 28, but only two cores are used in these
experiments. One for TR/RX and one for the user space application. The
memory is DDR4 @ 2133 MT/s (1067 MHz) and the size of each DIMM is
8192MB and with 8 of those DIMMs in the system we have 64 GB of total
memory. The compiler used is gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0. The
NIC is Intel I40E 40Gbit/s using the i40e driver.

Below are the results in Mpps of the I40E NIC benchmark runs for 64
and 1500 byte packets, generated by a commercial packet generator HW
outputing packets at full 40 Gbit/s line rate. The results are without
retpoline so that we can compare against previous numbers. With
retpoline, the AF_XDP numbers drop with between 10 - 15 percent.

AF_XDP performance 64 byte packets. Results from V2 in parenthesis.
Benchmark XDP_SKB XDP_DRV
rxdrop 2.9(3.0) 9.6(9.5)
txpush 2.6(2.5) NA*
l2fwd 1.9(1.9) 2.5(2.5) (TX using XDP_SKB in both cases)

AF_XDP performance 1500 byte packets:
Benchmark XDP_SKB XDP_DRV
rxdrop 2.1(2.2) 3.3(3.3)
l2fwd 1.4(1.4) 1.8(1.8) (TX using XDP_SKB in both cases)

* NA since we have no support for TX using the XDP_DRV infrastructure
in this patch set. This is for a future patch set since it involves
changes to the XDP NDOs. Some of this has been upstreamed by Jesper
Dangaard Brouer.

XDP performance on our system as a base line:

64 byte packets:
XDP stats CPU pps issue-pps
XDP-RX CPU 16 32.3(32.9)M 0

1500 byte packets:
XDP stats CPU pps issue-pps
XDP-RX CPU 16 3.3(3.3)M 0

Changes from V2:

* Fixed a race in XSKMAP map found by Will. The code has been
completely rearchitected and is now simpler, faster, and hopefully
also not racy. Please review and check if it holds.

If you would like to diff V2 against V3, you can find them here:
https://github.com/bjoto/linux/tree/af-xdp-v2-on-bpf-next
https://github.com/bjoto/linux/tree/af-xdp-v3-on-bpf-next

The structure of the patch set is as follows:

Patches 1-3: Basic socket and umem plumbing
Patches 4-9: RX support together with the new XSKMAP
Patches 10-13: TX support
Patch 14: Statistics support with getsockopt()
Patch 15: Sample application

We based this patch set on bpf-next commit a3fe1f6f2ada ("tools:
bpftool: change time format for program 'loaded at:' information")

To do for this patch set:

* Syzkaller torture session being worked on

Post-series plan:

* Optimize performance

* Kernel selftest

* Kernel load module support of AF_XDP would be nice. Unclear how to
achieve this though since our XDP code depends on net/core.

* Support for AF_XDP sockets without an XPD program loaded. In this
case all the traffic on a queue should go up to the user space socket.

* Daniel Borkmann's suggestion for a "copy to XDP socket, and return
XDP_PASS" for a tcpdump-like functionality.

* And of course getting to zero-copy support in small increments,
starting with TX then adding RX.

Thanks: Björn and Magnus
====================

Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 423f3832 02-May-2018 Magnus Karlsson <magnus.karlsson@intel.com>

xsk: add umem fill queue support and mmap

Here, we add another setsockopt for registered user memory (umem)
called XDP_UMEM_FILL_QUEUE. Using this socket option, the process can
ask the kernel to al

xsk: add umem fill queue support and mmap

Here, we add another setsockopt for registered user memory (umem)
called XDP_UMEM_FILL_QUEUE. Using this socket option, the process can
ask the kernel to allocate a queue (ring buffer) and also mmap it
(XDP_UMEM_PGOFF_FILL_QUEUE) into the process.

The queue is used to explicitly pass ownership of umem frames from the
user process to the kernel. These frames will in a later patch be
filled in with Rx packet data by the kernel.

v2: Fixed potential crash in xsk_mmap.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


1234