Revision tags: v5.0.4 |
|
#
249acb5f |
| 19-Mar-2019 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v5.1-rc1' into spi-5.2
Linux 5.1-rc1
|
#
c9e48084 |
| 19-Mar-2019 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v5.1-rc1' into regulator-5.2
Linux 5.1-rc1
|
Revision tags: v5.0.3 |
|
#
12747059 |
| 18-Mar-2019 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v5.1-rc1' into asoc-5.2
Linux 5.1-rc1
|
#
c3bdd5e6 |
| 22-Apr-2019 |
Saeed Mahameed <saeedm@mellanox.com> |
Merge tag 'v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into mlx5-next
Linux 5.1-rc1
We forgot to reset the branch last merge window thus mlx5-next is outdated and still based on 5.0-rc2. This merge commit is needed to sync mlx5-next branch with 5.1-rc1.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
9b0dcd0e |
| 11-Apr-2019 |
Takashi Iwai <tiwai@suse.de> |
Merge tag 'asoc-fix-v5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.1
A few core fixes along with the driver specific ones, mainly fixing small issues that only affect x86 platforms for various reasons (their unusual machine enumeration mechanisms mainly, plus a fix for error handling in topology).
Some of the driver fixes look larger than they are, like the hdmi-codec changes which resulted in an indentation change, and most of the other large changes are for new drivers like the STM32 changes.
|
#
22d91ed3 |
| 18-Mar-2019 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v5.1-rc1' into asoc-5.1
Linux 5.1-rc1
|
Revision tags: v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27 |
|
#
64563003 |
| 05-Mar-2019 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller: "Here we go, another merge window full of networking and #ebpf changes:
1) Snoop DHCPACKS in batman-adv to learn MAC/IP pairs in the DHCP range without dealing with floods of ARP traffic, from Linus Lüssing.
2) Throttle buffered multicast packet transmission in mt76, from Felix Fietkau.
3) Support adaptive interrupt moderation in ice, from Brett Creeley.
4) A lot of struct_size conversions, from Gustavo A. R. Silva.
5) Add peek/push/pop commands to bpftool, as well as bash completion, from Stanislav Fomichev.
6) Optimize sk_msg_clone(), from Vakul Garg.
7) Add SO_BINDTOIFINDEX, from David Herrmann.
8) Be more conservative with local resends due to local congestion, from Yuchung Cheng.
9) Allow vetoing of unsupported VXLAN FDBs, from Petr Machata.
10) Add health buffer support to devlink, from Eran Ben Elisha.
11) Add TXQ scheduling API to mac80211, from Toke Høiland-Jørgensen.
12) Add statistics to basic packet scheduler filter, from Cong Wang.
13) Add GRE tunnel support for mlxsw Spectrum-2, from Nir Dotan.
14) Lots of new IP tunneling forwarding tests, also from Nir Dotan.
15) Add 3ad stats to bonding, from Nikolay Aleksandrov.
16) Lots of probing improvements for bpftool, from Quentin Monnet.
17) Various nfp driver #ebpf JIT improvements from Jakub Kicinski.
18) Allow #ebpf programs to access gso_segs from skb shared info, from Eric Dumazet.
19) Add sock_diag support for AF_XDP sockets, from Björn Töpel.
20) Support 22260 iwlwifi devices, from Luca Coelho.
21) Use rbtree for ipv6 defragmentation, from Peter Oskolkov.
22) Add JMP32 instruction class support to #ebpf, from Jiong Wang.
23) Add spinlock support to #ebpf, from Alexei Starovoitov.
24) Support 256-bit keys and TLS 1.3 in ktls, from Dave Watson.
25) Add device information API to devlink, from Jakub Kicinski.
26) Add new timestamping socket options which are y2038 safe, from Deepa Dinamani.
27) Add RX checksum offloading for various sh_eth chips, from Sergei Shtylyov.
28) Flow offload infrastructure, from Pablo Neira Ayuso.
29) Numerous cleanups, improvements, and bug fixes to the PHY layer and many drivers from Heiner Kallweit.
30) Lots of changes to try and make packet scheduler classifiers run lockless as much as possible, from Vlad Buslov.
31) Support BCM957504 chip in bnxt_en driver, from Erik Burrows.
32) Add concurrency tests to tc-tests infrastructure, from Vlad Buslov.
33) Add hwmon support to aquantia, from Heiner Kallweit.
34) Allow 64-bit values for SO_MAX_PACING_RATE, from Eric Dumazet.
And I would be remiss if I didn't thank the various major networking subsystem maintainers for integrating much of this work before I even saw it. Alexei Starovoitov, Daniel Borkmann, Pablo Neira Ayuso, Johannes Berg, Kalle Valo, and many others. Thank you!"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2207 commits)
  net/sched: avoid unused-label warning
  net: ignore sysctl_devconf_inherit_init_net without SYSCTL
  phy: mdio-mux: fix Kconfig dependencies
  net: phy: use phy_modify_mmd_changed in genphy_c45_an_config_aneg
  net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework
  selftest/net: Remove duplicate header
  sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
  net/mlx5e: Update tx reporter status in case channels were successfully opened
  devlink: Add support for direct reporter health state update
  devlink: Update reporter state to error even if recover aborted
  sctp: call iov_iter_revert() after sending ABORT
  team: Free BPF filter when unregistering netdev
  ip6mr: Do not call __IP6_INC_STATS() from preemptible context
  isdn: mISDN: Fix potential NULL pointer dereference of kzalloc
  net: dsa: mv88e6xxx: support in-band signalling on SGMII ports with external PHYs
  cxgb4/chtls: Prefix adapter flags with CXGB4
  net-sysfs: Switch to bitmap_zalloc()
  mellanox: Switch to bitmap_zalloc()
  bpf: add test cases for non-pointer sanitiation logic
  mlxsw: i2c: Extend initialization by querying resources data
  ...
|
Revision tags: v5.0, v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22, v4.19.21, v4.19.20 |
|
#
752cfee9 |
| 01-Feb-2019 |
Johannes Berg <johannes.berg@intel.com> |
Merge remote-tracking branch 'net-next/master' into mac80211-next
Merge net-next so that we get the changes from net, which would otherwise conflict with the NLA_POLICY_NESTED/_ARRAY changes.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
Revision tags: v4.19.19 |
|
#
ec7146db |
| 28-Jan-2019 |
David S. Miller <davem@davemloft.net> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
==================== pull-request: bpf-next 2019-01-29
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Teach verifier dead code removal, this also allows for optimizing / removing conditional branches around dead code and to shrink the resulting image. Code store constrained architectures like nfp would have a hard time doing this at JIT level, from Jakub.
2) Add JMP32 instructions to BPF ISA in order to allow for optimizing code generation for 32-bit sub-registers. Evaluation shows that this can result in code reduction of ~5-20% compared to 64 bit-only code generation. Also add implementation for most JITs, from Jiong.
3) Add support for __int128 types in BTF which is also needed for vmlinux's BTF conversion to work, from Yonghong.
4) Add a new command to bpftool in order to dump a list of BPF-related parameters from the system or for a specific network device e.g. in terms of available prog/map types or helper functions, from Quentin.
5) Add AF_XDP sock_diag interface for querying sockets from user space which provides information about the RX/TX/fill/completion rings, umem, memory usage etc, from Björn.
6) Add skb context access for skb_shared_info->gso_segs field, from Eric.
7) Add support for testing flow dissector BPF programs by extending existing BPF_PROG_TEST_RUN infrastructure, from Stanislav.
8) Split BPF kselftest's test_verifier into various subgroups of tests in order to better deal with merge conflicts in this area, from Jakub.
9) Add support for queue/stack manipulations in bpftool, from Stanislav.
10) Document BTF, from Yonghong.
11) Dump supported ELF section names in libbpf on program load failure, from Taeung.
12) Silence a false positive compiler warning in verifier's BTF handling, from Peter.
13) Fix help string in bpftool's feature probing, from Prashant.
14) Remove duplicate includes in BPF kselftests, from Yue. ====================
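As a rough illustration of item 2 above, the difference between a 64-bit and a 32-bit conditional jump can be modelled as a comparison on different register widths. The sketch below is a plain Python model, not verifier or JIT code; the helper names (`jeq64`, `jeq32`, `jsgt32`, `as_s32`) are hypothetical and chosen only for this example.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
MASK32 = 0xFFFFFFFF


def as_s32(value):
    """Interpret the low 32 bits of `value` as a signed 32-bit integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def jeq64(dst, src):
    """Model of a 64-bit 'jump if equal': all 64 bits take part."""
    return (dst & MASK64) == (src & MASK64)


def jeq32(dst, src):
    """Model of a 32-bit 'jump if equal': only the low sub-registers take part."""
    return (dst & MASK32) == (src & MASK32)


def jsgt32(dst, src):
    """Model of a 32-bit signed 'jump if greater than'."""
    return as_s32(dst) > as_s32(src)


if __name__ == "__main__":
    reg = 0x0000000100000005      # upper half holds stale bits
    print(jeq64(reg, 5))          # upper bits make the 64-bit compare fail
    print(jeq32(reg, 5))          # the 32-bit compare ignores them
    print(jsgt32(0xFFFFFFFF, 0))  # low half is -1 when read as signed 32-bit
```

Without a 32-bit jump class, a code generator working on 32-bit values would first have to clear or sign-extend the upper half of the register before a 64-bit compare; that extra work is presumably where the quoted code-size reduction comes from.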
Signed-off-by: David S. Miller <davem@davemloft.net>
|
Revision tags: v4.19.18 |
|
#
dbbd79ae |
| 24-Jan-2019 |
Daniel Borkmann <daniel@iogearbox.net> |
Merge branch 'af-xdp-sock-diag'
Björn Töpel says:
==================== This series adds an AF_XDP sock_diag interface for querying sockets from user-space. Tools like iproute2 ss(8) can use this interface to list open AF_XDP sockets.
The diagnostic provides information about the Rx/Tx/fill/completion rings, umem, memory usage and such. For a complete list, please refer to the xsk_diag.c file.
The AF_XDP sock_diag interface is optional, and can be built as a module. A separate patch series, adding ss(8) iproute2 support, will follow.
v1->v2:
* Removed extra newline
* Zero-out all user-space facing structures prior to setting the members
* Added explicit "pad" member in _msg struct
* Removed unused variable "req" in xsk_diag_handler_dump()
Thanks to Daniel for reviewing the series! ====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
#
a36b38aa |
| 24-Jan-2019 |
Björn Töpel <bjorn.topel@intel.com> |
xsk: add sock_diag interface for AF_XDP
This patch adds the sock_diag interface for querying sockets from user space. Tools like iproute2 ss(8) can use this interface to list open AF_XDP sockets.
The user-space ABI is defined in linux/xdp_diag.h and includes netlink request and response structs. The request can query sockets and the response contains socket information about the rings, umems, inode and more.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
Revision tags: v4.19.17, v4.19.16 |
|
#
3eb66e91 |
| 14-Jan-2019 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v4.20' into for-linus
Sync with mainline to get linux/overflow.h among other things.
|
#
4116941b |
| 14-Jan-2019 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v4.20' into next
Merge with mainline to bring in the new APIs.
|
Revision tags: v4.19.15, v4.19.14, v4.19.13, v4.19.12, v4.19.11, v4.19.10, v4.19.9, v4.19.8, v4.19.7, v4.19.6, v4.19.5, v4.19.4, v4.18.20, v4.19.3, v4.18.19, v4.19.2, v4.18.18, v4.18.17, v4.19.1, v4.19, v4.18.16, v4.18.15, v4.18.14, v4.18.13, v4.18.12, v4.18.11, v4.18.10, v4.18.9, v4.18.7, v4.18.6, v4.18.5, v4.17.18, v4.18.4, v4.18.3, v4.17.17, v4.18.2, v4.17.16, v4.17.15, v4.18.1, v4.18, v4.17.14, v4.17.13, v4.17.12, v4.17.11, v4.17.10 |
|
#
c74a7469 |
| 23-Jul-2018 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next-queued
We need a backmerge to get DP_DPCD_REV_14 before we push other i915 changes to dinq that could break compilation.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
Revision tags: v4.17.9, v4.17.8, v4.17.7, v4.17.6, v4.17.5, v4.17.4 |
|
#
6b16f5d1 |
| 28-Jun-2018 |
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> |
Merge tag 'v4.18-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into fbdev-for-next
Linux 4.18-rc2
|
Revision tags: v4.17.3 |
|
#
57b54d74 |
| 25-Jun-2018 |
James Morris <james.morris@microsoft.com> |
Merge tag 'v4.18-rc2' into next-general
Merge to Linux 4.18-rc2 for security subsystem developers.
|
#
7731b8bc |
| 22-Jun-2018 |
Thomas Gleixner <tglx@linutronix.de> |
Merge branch 'linus' into x86/urgent
Required to queue a dependent fix.
|
Revision tags: v4.17.2, v4.17.1 |
|
#
1c8c5a9d |
| 06-Jun-2018 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
1) Add Maglev hashing scheduler to IPVS, from Inju Song.
2) Lots of new TC subsystem tests from Roman Mashak.
3) Add TCP zero copy receive and fix delayed acks and autotuning with SO_RCVLOWAT, from Eric Dumazet.
4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard Brouer.
5) Add ttl inherit support to vxlan, from Hangbin Liu.
6) Properly separate ipv6 routes into their logically independent components. fib6_info for the routing table, and fib6_nh for sets of nexthops, which thus can be shared. From David Ahern.
7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP messages from XDP programs. From Nikita V. Shirokov.
8) Lots of long overdue cleanups to the r8169 driver, from Heiner Kallweit.
9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.
10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.
11) Plumb extack down into fib_rules, from Roopa Prabhu.
12) Add Flower classifier offload support to igb, from Vinicius Costa Gomes.
13) Add UDP GSO support, from Willem de Bruijn.
14) Add documentation for eBPF helpers, from Quentin Monnet.
15) Add TLS tx offload to mlx5, from Ilya Lesokhin.
16) Allow applications to be given the number of bytes available to read on a socket via a control message returned from recvmsg(), from Soheil Hassas Yeganeh.
17) Add x86_32 eBPF JIT compiler, from Wang YanQing.
18) Add AF_XDP sockets, with zerocopy support infrastructure as well. From Björn Töpel.
19) Remove indirect load support from all of the BPF JITs and handle these operations in the verifier by translating them into native BPF instead. From Daniel Borkmann.
20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.
21) Allow XDP programs to do lookups in the main kernel routing tables for forwarding. From David Ahern.
22) Allow drivers to store hardware state into an ELF section of kernel dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.
23) Various RACK and loss detection improvements in TCP, from Yuchung Cheng.
24) Add TCP SACK compression, from Eric Dumazet.
25) Add User Mode Helper support and basic bpfilter infrastructure, from Alexei Starovoitov.
26) Support ports and protocol values in RTM_GETROUTE, from Roopa Prabhu.
27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard Brouer.
28) Add lots of forwarding selftests, from Petr Machata.
29) Add generic network device failover driver, from Sridhar Samudrala.
* ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
  strparser: Add __strp_unpause and use it in ktls.
  rxrpc: Fix terminal retransmission connection ID to include the channel
  net: hns3: Optimize PF CMDQ interrupt switching process
  net: hns3: Fix for VF mailbox receiving unknown message
  net: hns3: Fix for VF mailbox cannot receiving PF response
  bnx2x: use the right constant
  Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
  net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
  enic: fix UDP rss bits
  netdev-FAQ: clarify DaveM's position for stable backports
  rtnetlink: validate attributes in do_setlink()
  mlxsw: Add extack messages for port_{un, }split failures
  netdevsim: Add extack error message for devlink reload
  devlink: Add extack to reload and port_{un, }split operations
  net: metrics: add proper netlink validation
  ipmr: fix error path when ipmr_new_table fails
  ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
  net: hns3: remove unused hclgevf_cfg_func_mta_filter
  netfilter: provide udp*_lib_lookup for nf_tproxy
  qed*: Utilize FW 8.37.2.0
  ...
|
Revision tags: v4.17 |
|
#
90fed9c9 |
| 24-May-2018 |
David S. Miller <davem@davemloft.net> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:
==================== pull-request: bpf-next 2018-05-24
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Björn Töpel cleans up AF_XDP (removes rebind, explicit cache alignment from uapi, etc).
2) David Ahern adds mtu checks to bpf_ipv{4,6}_fib_lookup() helpers.
3) Jesper Dangaard Brouer adds bulking support to ndo_xdp_xmit.
4) Jiong Wang adds support for indirect and arithmetic shifts to NFP
5) Martin KaFai Lau cleans up BTF uapi and makes the btf_header extensible.
6) Mathieu Xhonneux adds an End.BPF action to seg6local with BPF helpers allowing to edit/grow/shrink a SRH and apply on a packet generic SRv6 actions.
7) Sandipan Das adds support for bpf2bpf function calls in ppc64 JIT.
8) Yonghong Song adds BPF_TASK_FD_QUERY command for introspection of tracing events.
9) other misc fixes from Gustavo A. R. Silva, Sirio Balmelli, John Fastabend, and Magnus Karlsson ====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
dd8070bf |
| 23-May-2018 |
Johannes Berg <johannes.berg@intel.com> |
Merge remote-tracking branch 'net-next/master' into mac80211-next
Bring in net-next which had pulled in net, so I have the changes from mac80211 and can apply a patch that would otherwise conflict.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
#
82f9e2d5 |
| 18-May-2018 |
Daniel Borkmann <daniel@iogearbox.net> |
Merge branch 'bpf-af-xdp-cleanups'
Björn Töpel says:
==================== This series contains "cosmetics only" follow-up patches for AF_XDP.
Thanks to Daniel for suggesting them! ====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
#
54b85c27 |
| 18-May-2018 |
Björn Töpel <bjorn.topel@intel.com> |
xsk: remove newline at end of file
Minor cleanup, remove newline at end of Makefile.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
#
01adc485 |
| 07-May-2018 |
David S. Miller <davem@davemloft.net> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Minor conflict, a CHECK was placed into an if() statement in net-next, whilst a newline was added to that CHECK call in 'net'. Thanks to Daniel for the merge resolution.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
08dbc7a6 |
| 03-May-2018 |
Alexei Starovoitov <ast@kernel.org> |
Merge branch 'AF_XDP-initial-support'
Björn Töpel says:
==================== This patch set introduces a new address family called AF_XDP that is optimized for high performance packet processing and, in upcoming patch sets, zero-copy semantics. In this patch set, we have removed all zero-copy related code in order to make it smaller, simpler and hopefully more review friendly. This patch set only supports copy-mode for the generic XDP path (XDP_SKB) for both RX and TX and copy-mode for RX using the XDP_DRV path. Zero-copy support requires XDP and driver changes that Jesper Dangaard Brouer is working on. Some of his work has already been accepted. We will publish our zero-copy support for RX and TX on top of his patch sets at a later point in time.
An AF_XDP socket (XSK) is created with the normal socket() syscall. Associated with each XSK are two queues: the RX queue and the TX queue. A socket can receive packets on the RX queue and it can send packets on the TX queue. These queues are registered and sized with the setsockopts XDP_RX_RING and XDP_TX_RING, respectively. It is mandatory to have at least one of these queues for each socket. In contrast to AF_PACKET V2/V3 these descriptor queues are separated from packet buffers. An RX or TX descriptor points to a data buffer in a memory area called a UMEM. RX and TX can share the same UMEM so that a packet does not have to be copied between RX and TX. Moreover, if a packet needs to be kept for a while due to a possible retransmit, the descriptor that points to that packet can be changed to point to another and reused right away. This again avoids copying data.
This new dedicated packet buffer area is called a UMEM. It consists of a number of equally sized frames and each frame has a unique frame id. A descriptor in one of the queues references a frame by referencing its frame id. The user space allocates memory for this UMEM using whatever means it feels is most appropriate (malloc, mmap, huge pages, etc). This memory area is then registered with the kernel using the new setsockopt XDP_UMEM_REG. The UMEM also has two queues: the FILL queue and the COMPLETION queue. The fill queue is used by the application to send down frame ids for the kernel to fill in with RX packet data. References to these frames will then appear in the RX queue of the XSK once they have been received. The completion queue, on the other hand, contains frame ids that the kernel has transmitted completely and can now be used again by user space, for either TX or RX. Thus, the frame ids appearing in the completion queue are ids that were previously transmitted using the TX queue. In summary, the RX and FILL queues are used for the RX path and the TX and COMPLETION queues are used for the TX path.
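The ownership flow described above (FILL to RX on receive, TX to COMPLETION on transmit) can be sketched as a toy model. This is only an illustration in Python using plain queues; the class and method names (`UmemModel`, `post_fill`, `kernel_receive`, and so on) are hypothetical and do not correspond to kernel or user-space API names.

```python
from collections import deque


class UmemModel:
    """Toy model of frame-id ownership; not the kernel data structures."""

    def __init__(self, num_frames):
        self.free = list(range(num_frames))  # frame ids owned by user space
        self.fill = deque()        # user -> kernel: frames the kernel may fill
        self.rx = deque()          # kernel -> user: frames holding received data
        self.tx = deque()          # user -> kernel: frames to transmit
        self.completion = deque()  # kernel -> user: frames fully transmitted

    def post_fill(self, count):
        """User space hands `count` free frames to the kernel for RX."""
        for _ in range(count):
            self.fill.append(self.free.pop())

    def kernel_receive(self):
        """Kernel takes a frame id from FILL and posts it on RX."""
        if not self.fill:
            return False  # no buffer posted, so nothing can be delivered
        self.rx.append(self.fill.popleft())
        return True

    def user_forward(self):
        """User space moves a received frame id to TX; no data is copied."""
        self.tx.append(self.rx.popleft())

    def kernel_transmit(self):
        """Kernel sends the frame and reports its id on COMPLETION."""
        self.completion.append(self.tx.popleft())

    def reclaim(self):
        """User space takes back every completed frame id for reuse."""
        while self.completion:
            self.free.append(self.completion.popleft())


if __name__ == "__main__":
    umem = UmemModel(4)
    umem.post_fill(2)
    umem.kernel_receive()
    umem.user_forward()
    umem.kernel_transmit()
    umem.reclaim()
    print(sorted(umem.free), list(umem.fill))
```

Only frame ids move between the queues in this model, which mirrors the point made in the text: a packet received on RX can be sent on TX without copying the data.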
The socket is then finally bound with a bind() call to a device and a specific queue id on that device, and it is not until bind is completed that traffic starts to flow. Note that in this patch set, all packet data is copied out to user-space.
A new feature in this patch set is that the UMEM can be shared between processes, if desired. If a process wants to do this, it simply skips the registration of the UMEM and its corresponding two queues, sets a flag in the bind call and submits the XSK of the process it would like to share UMEM with as well as its own newly created XSK socket. The new process will then receive frame id references in its own RX queue that point to this shared UMEM. Note that since the queue structures are single-consumer / single-producer (for performance reasons), the new process has to create its own socket with associated RX and TX queues, since it cannot share this with the other process. This is also the reason that there is only one set of FILL and COMPLETION queues per UMEM. It is the responsibility of a single process to handle the UMEM. If multiple-producer / multiple-consumer queues are implemented in the future, this requirement could be relaxed.
How are packets then distributed between these two XSKs? We have introduced a new BPF map called XSKMAP (or BPF_MAP_TYPE_XSKMAP in full). The user-space application can place an XSK at an arbitrary place in this map. The XDP program can then redirect a packet to a specific index in this map and at this point XDP validates that the XSK in that map was indeed bound to that device and queue number. If not, the packet is dropped. If the map is empty at that index, the packet is also dropped. This also means that it is currently mandatory to have an XDP program loaded (and one XSK in the XSKMAP) to be able to get any traffic to user space through the XSK.
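The redirect rules in the paragraph above amount to a lookup followed by a binding check. The following Python sketch models only that decision; `Xsk`, `XskMap` and `redirect` are hypothetical names for this example, and treating an out-of-range index as a drop is an assumption, not something the text states.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Xsk:
    """Stand-in for a bound AF_XDP socket: the device and queue it was bound to."""
    ifindex: int
    queue_id: int


class XskMap:
    """Toy model of the XSKMAP lookup done on redirect."""

    def __init__(self, max_entries: int) -> None:
        self.slots: List[Optional[Xsk]] = [None] * max_entries

    def update(self, index: int, xsk: Xsk) -> None:
        """User space places an XSK at an arbitrary index."""
        self.slots[index] = xsk

    def redirect(self, index: int, rx_ifindex: int, rx_queue_id: int) -> Optional[Xsk]:
        """Return the target XSK, or None when the packet would be dropped."""
        if not 0 <= index < len(self.slots):
            return None  # assumption: an out-of-range index is treated as a drop
        xsk = self.slots[index]
        if xsk is None:
            return None  # empty slot: packet dropped
        if (xsk.ifindex, xsk.queue_id) != (rx_ifindex, rx_queue_id):
            return None  # XSK bound to another device or queue: packet dropped
        return xsk


if __name__ == "__main__":
    xsks = XskMap(4)
    xsks.update(0, Xsk(ifindex=3, queue_id=16))
    print(xsks.redirect(0, rx_ifindex=3, rx_queue_id=16))  # delivered
    print(xsks.redirect(0, rx_ifindex=3, rx_queue_id=0))   # wrong queue
    print(xsks.redirect(1, rx_ifindex=3, rx_queue_id=16))  # empty slot
```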
AF_XDP can operate in two different modes: XDP_SKB and XDP_DRV. If the driver does not have support for XDP, or XDP_SKB is explicitly chosen when loading the XDP program, XDP_SKB mode is employed. It uses SKBs together with the generic XDP support and copies out the data to user space, and it serves as a fallback mode that works for any network device. On the other hand, if the driver has support for XDP, it will be used by the AF_XDP code to provide better performance, but there is still a copy of the data into user space.
There is a xdpsock benchmarking/test application included that demonstrates how to use AF_XDP sockets with both private and shared UMEMs. Say that you would like your UDP traffic from port 4242 to end up in queue 16, that we will enable AF_XDP on. Here, we use ethtool for this:
ethtool -N p3p2 rx-flow-hash udp4 fn
ethtool -N p3p2 flow-type udp4 src-port 4242 dst-port 4242 \
    action 16
Running the rxdrop benchmark in XDP_DRV mode can then be done using:
samples/bpf/xdpsock -i p3p2 -q 16 -r -N
For XDP_SKB mode, use the switch "-S" instead of "-N" and all options can be displayed with "-h", as usual.
We have run some benchmarks on a dual socket system with two Broadwell E5 2660 @ 2.0 GHz with hyperthreading turned off. Each socket has 14 cores which gives a total of 28, but only two cores are used in these experiments: one for TX/RX and one for the user space application. The memory is DDR4 @ 2133 MT/s (1067 MHz) and the size of each DIMM is 8192MB and with 8 of those DIMMs in the system we have 64 GB of total memory. The compiler used is gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0. The NIC is Intel I40E 40Gbit/s using the i40e driver.
Below are the results in Mpps of the I40E NIC benchmark runs for 64 and 1500 byte packets, generated by a commercial packet generator HW outputting packets at full 40 Gbit/s line rate. The results are without retpoline so that we can compare against previous numbers. With retpoline, the AF_XDP numbers drop by 10 to 15 percent.
AF_XDP performance 64 byte packets. Results from V2 in parentheses.

Benchmark   XDP_SKB    XDP_DRV
rxdrop      2.9(3.0)   9.6(9.5)
txpush      2.6(2.5)   NA*
l2fwd       1.9(1.9)   2.5(2.5) (TX using XDP_SKB in both cases)

AF_XDP performance 1500 byte packets:

Benchmark   XDP_SKB    XDP_DRV
rxdrop      2.1(2.2)   3.3(3.3)
l2fwd       1.4(1.4)   1.8(1.8) (TX using XDP_SKB in both cases)
* NA since we have no support for TX using the XDP_DRV infrastructure in this patch set. This is for a future patch set since it involves changes to the XDP NDOs. Some of this has been upstreamed by Jesper Dangaard Brouer.
XDP performance on our system as a base line:
64 byte packets:
XDP stats    CPU  pps          issue-pps
XDP-RX CPU   16   32.3(32.9)M  0

1500 byte packets:
XDP stats    CPU  pps          issue-pps
XDP-RX CPU   16   3.3(3.3)M    0
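To put the Mpps figures in context, the theoretical packet rate of a 40 Gbit/s link can be worked out from the frame size. The sketch below assumes that the quoted packet sizes are full Ethernet frame sizes and that each frame costs an extra 20 bytes on the wire (8 bytes of preamble and start-of-frame delimiter plus a 12 byte inter-frame gap); the message itself does not state either assumption.

```python
LINE_RATE_BPS = 40e9      # 40 Gbit/s, as stated for the I40E NIC
WIRE_OVERHEAD_BYTES = 20  # assumed: 8 bytes preamble/SFD + 12 bytes inter-frame gap


def max_pps(frame_bytes, line_rate_bps=LINE_RATE_BPS):
    """Upper bound on packets per second for a given frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return line_rate_bps / bits_on_wire


if __name__ == "__main__":
    for size in (64, 1500):
        print(f"{size:>5} byte frames: {max_pps(size) / 1e6:.2f} Mpps at line rate")
```

Under these assumptions the bound is about 59.5 Mpps for 64 byte frames and about 3.3 Mpps for 1500 byte frames. The 1500 byte XDP baseline and the XDP_DRV rxdrop result therefore sit at roughly line rate, while the 64 byte XDP baseline of 32.3 Mpps is a bit over half of it.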
Changes from V2:
* Fixed a race in XSKMAP map found by Will. The code has been completely rearchitected and is now simpler, faster, and hopefully also not racy. Please review and check if it holds.
If you would like to diff V2 against V3, you can find them here: https://github.com/bjoto/linux/tree/af-xdp-v2-on-bpf-next https://github.com/bjoto/linux/tree/af-xdp-v3-on-bpf-next
The structure of the patch set is as follows:
Patches 1-3: Basic socket and umem plumbing
Patches 4-9: RX support together with the new XSKMAP
Patches 10-13: TX support
Patch 14: Statistics support with getsockopt()
Patch 15: Sample application
We based this patch set on bpf-next commit a3fe1f6f2ada ("tools: bpftool: change time format for program 'loaded at:' information")
To do for this patch set:
* Syzkaller torture session being worked on
Post-series plan:
* Optimize performance
* Kernel selftest
* Kernel load module support of AF_XDP would be nice. Unclear how to achieve this though since our XDP code depends on net/core.
* Support for AF_XDP sockets without an XDP program loaded. In this case all the traffic on a queue should go up to the user space socket.
* Daniel Borkmann's suggestion for a "copy to XDP socket, and return XDP_PASS" for a tcpdump-like functionality.
* And of course getting to zero-copy support in small increments, starting with TX then adding RX.
Thanks: Björn and Magnus ====================
Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
423f3832 |
| 02-May-2018 |
Magnus Karlsson <magnus.karlsson@intel.com> |
xsk: add umem fill queue support and mmap
Here, we add another setsockopt for registered user memory (umem) called XDP_UMEM_FILL_QUEUE. Using this socket option, the process can ask the kernel to allocate a queue (ring buffer) and also mmap it (XDP_UMEM_PGOFF_FILL_QUEUE) into the process.
The queue is used to explicitly pass ownership of umem frames from the user process to the kernel. These frames will in a later patch be filled in with Rx packet data by the kernel.
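A queue of this kind can be pictured as a single-producer / single-consumer ring with one index advanced by each side. The Python sketch below is a simplified model under stated assumptions: the power-of-two size, the index masking and the name `SpscRing` are choices made for this example, and a real ring shared through mmap would additionally need fixed-width counters and memory barriers.

```python
class SpscRing:
    """Toy single-producer / single-consumer ring carrying frame ids."""

    def __init__(self, size):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self.mask = size - 1
        self.slots = [0] * size
        self.producer = 0  # advanced only by the producing side (user process)
        self.consumer = 0  # advanced only by the consuming side (kernel)

    def produce(self, frame_id):
        """Producer hands ownership of a frame to the consumer."""
        if self.producer - self.consumer == len(self.slots):
            return False  # ring full
        self.slots[self.producer & self.mask] = frame_id
        self.producer += 1
        return True

    def consume(self):
        """Consumer takes the oldest frame id, or None if the ring is empty."""
        if self.consumer == self.producer:
            return None
        frame_id = self.slots[self.consumer & self.mask]
        self.consumer += 1
        return frame_id


if __name__ == "__main__":
    fill = SpscRing(2)
    print(fill.produce(7), fill.produce(9), fill.produce(11))
    print(fill.consume(), fill.consume(), fill.consume())
```

Because each index is written by exactly one side, neither side needs a lock in this model, which is the usual motivation for single-producer / single-consumer queues.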
v2: Fixed potential crash in xsk_mmap.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|