#
67ece13f |
| 18-Jul-2018 |
Takashi Iwai <tiwai@suse.de> |
Merge branch 'topic/vga_switcheroo' into for-next
Pull the vga_switcheroo audio client fix.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
Revision tags: v4.17.8, v4.17.7 |
|
#
08948b75 |
| 17-Jul-2018 |
Wolfram Sang <wsa@the-dreams.de> |
Merge tag 'v4.18-rc5' into i2c/for-4.19
Linux 4.18-rc5
|
#
37c45b23 |
| 17-Jul-2018 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'v4.18-rc5' into x86/mm, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
52b544bd |
| 17-Jul-2018 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'v4.18-rc5' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
500f0716 |
| 16-Jul-2018 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
Merge 4.18-rc5 into usb-next
We need the USB fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
956f004a |
| 16-Jul-2018 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
Merge 4.18-rc5 into staging-next
We need the staging fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
83cf9cd6 |
| 16-Jul-2018 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
Merge 4.18-rc5 into char-misc-next
We want the char-misc fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
fdf2ceb7 |
| 15-Jul-2018 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'v4.18-rc5' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
Revision tags: v4.17.6, v4.17.5, v4.17.4 |
|
#
6b16f5d1 |
| 28-Jun-2018 |
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> |
Merge tag 'v4.18-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into fbdev-for-next
Linux 4.18-rc2
|
Revision tags: v4.17.3 |
|
#
57b54d74 |
| 25-Jun-2018 |
James Morris <james.morris@microsoft.com> |
Merge tag 'v4.18-rc2' into next-general
Merge to Linux 4.18-rc2 for security subsystem developers.
|
#
5cd3da4b |
| 02-Jul-2018 |
David S. Miller <davem@davemloft.net> |
Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Simple overlapping changes in stmmac driver.
Adjust skb_gro_flush_final_remcsum function signature to make GRO list changes in net-next, as per Stephen Rothwell's example merge resolution.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4e33d7d4 |
| 02-Jul-2018 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Verify netlink attributes properly in nf_queue, from Eric Dumazet.
2) Need to bump memory lock rlimit for test_sockmap bpf test, from Yonghong Song.
3) Fix VLAN handling in lan78xx driver, from Dave Stevenson.
4) Fix uninitialized read in nf_log, from Jann Horn.
5) Fix raw command length parsing in mlx5, from Alex Vesker.
6) Cleanup loopback RDS connections upon netns deletion, from Sowmini Varadhan.
7) Fix regressions in FIB rule matching during create, from Jason A. Donenfeld and Roopa Prabhu.
8) Fix mpls ether type detection in nfp, from Pieter Jansen van Vuuren.
9) More bpfilter build fixes/adjustments from Masahiro Yamada.
10) Fix XDP_{TX,REDIRECT} flushing in various drivers, from Jesper Dangaard Brouer.
11) fib_tests.sh file permissions were broken, from Shuah Khan.
12) Make sure BH/preemption is disabled in data path of mac80211, from Denis Kenzior.
13) Don't ignore nla_parse_nested() return values in nl80211, from Johannes Berg.
14) Properly account sock objects to kmemcg, from Shakeel Butt.
15) Adjustments to setting bpf program permissions to read-only, from Daniel Borkmann.
16) TCP Fast Open key endianness was broken, it always took on the host endianness. Whoops. Explicitly make it little endian. From Yuchung Cheng.
17) Fix prefix route setting for link local addresses in ipv6, from David Ahern.
18) Potential Spectre v1 in zatm driver, from Gustavo A. R. Silva.
19) Various bpf sockmap fixes, from John Fastabend.
20) Use after free for GRO with ESP, from Sabrina Dubroca.
21) Passing bogus flags to crypto_alloc_shash() in ipv6 SR code, from Eric Biggers.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
  qede: Adverstise software timestamp caps when PHC is not available.
  qed: Fix use of incorrect size in memcpy call.
  qed: Fix setting of incorrect eswitch mode.
  qed: Limit msix vectors in kdump kernel to the minimum required count.
  ipvlan: call dev_change_flags when ipvlan mode is reset
  ipv6: sr: fix passing wrong flags to crypto_alloc_shash()
  net: fix use-after-free in GRO with ESP
  tcp: prevent bogus FRTO undos with non-SACK flows
  bpf: sockhash, add release routine
  bpf: sockhash fix omitted bucket lock in sock_close
  bpf: sockmap, fix smap_list_map_remove when psock is in many maps
  bpf: sockmap, fix crash when ipv6 sock is added
  net: fib_rules: bring back rule_exists to match rule during add
  hv_netvsc: split sub-channel setup into async and sync
  net: use dev_change_tx_queue_len() for SIOCSIFTXQLEN
  atm: zatm: Fix potential Spectre v1
  s390/qeth: consistently re-enable device features
  s390/qeth: don't clobber buffer on async TX completion
  s390/qeth: avoid using is_multicast_ether_addr_64bits on (u8 *)[6]
  s390/qeth: fix race when setting MAC address
  ...
|
#
88e85a7d |
| 25-Jun-2018 |
Masahiro Yamada <yamada.masahiro@socionext.com> |
bpfilter: check compiler capability in Kconfig
With the brand-new syntax extension of Kconfig, we can directly check the compiler capability in the configuration phase.
If the cc-can-link.sh fails, the BPFILTER_UMH is automatically hidden by the dependency.
I also deleted 'default n', which is no-op.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7731b8bc |
| 22-Jun-2018 |
Thomas Gleixner <tglx@linutronix.de> |
Merge branch 'linus' into x86/urgent
Required to queue a dependent fix.
|
Revision tags: v4.17.2, v4.17.1 |
|
#
1c8c5a9d |
| 06-Jun-2018 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
1) Add Maglev hashing scheduler to IPVS, from Inju Song.
2) Lots of new TC subsystem tests from Roman Mashak.
3) Add TCP zero copy receive and fix delayed acks and autotuning with SO_RCVLOWAT, from Eric Dumazet.
4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard Brouer.
5) Add ttl inherit support to vxlan, from Hangbin Liu.
6) Properly separate ipv6 routes into their logically independent components. fib6_info for the routing table, and fib6_nh for sets of nexthops, which thus can be shared. From David Ahern.
7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP messages from XDP programs. From Nikita V. Shirokov.
8) Lots of long overdue cleanups to the r8169 driver, from Heiner Kallweit.
9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.
10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.
11) Plumb extack down into fib_rules, from Roopa Prabhu.
12) Add Flower classifier offload support to igb, from Vinicius Costa Gomes.
13) Add UDP GSO support, from Willem de Bruijn.
14) Add documentation for eBPF helpers, from Quentin Monnet.
15) Add TLS tx offload to mlx5, from Ilya Lesokhin.
16) Allow applications to be given the number of bytes available to read on a socket via a control message returned from recvmsg(), from Soheil Hassas Yeganeh.
17) Add x86_32 eBPF JIT compiler, from Wang YanQing.
18) Add AF_XDP sockets, with zerocopy support infrastructure as well. From Björn Töpel.
19) Remove indirect load support from all of the BPF JITs and handle these operations in the verifier by translating them into native BPF instead. From Daniel Borkmann.
20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.
21) Allow XDP programs to do lookups in the main kernel routing tables for forwarding. From David Ahern.
22) Allow drivers to store hardware state into an ELF section of kernel dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.
23) Various RACK and loss detection improvements in TCP, from Yuchung Cheng.
24) Add TCP SACK compression, from Eric Dumazet.
25) Add User Mode Helper support and basic bpfilter infrastructure, from Alexei Starovoitov.
26) Support ports and protocol values in RTM_GETROUTE, from Roopa Prabhu.
27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard Brouer.
28) Add lots of forwarding selftests, from Petr Machata.
29) Add generic network device failover driver, from Sridhar Samudrala.
* ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
  strparser: Add __strp_unpause and use it in ktls.
  rxrpc: Fix terminal retransmission connection ID to include the channel
  net: hns3: Optimize PF CMDQ interrupt switching process
  net: hns3: Fix for VF mailbox receiving unknown message
  net: hns3: Fix for VF mailbox cannot receiving PF response
  bnx2x: use the right constant
  Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
  net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
  enic: fix UDP rss bits
  netdev-FAQ: clarify DaveM's position for stable backports
  rtnetlink: validate attributes in do_setlink()
  mlxsw: Add extack messages for port_{un, }split failures
  netdevsim: Add extack error message for devlink reload
  devlink: Add extack to reload and port_{un, }split operations
  net: metrics: add proper netlink validation
  ipmr: fix error path when ipmr_new_table fails
  ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
  net: hns3: remove unused hclgevf_cfg_func_mta_filter
  netfilter: provide udp*_lib_lookup for nf_tproxy
  qed*: Utilize FW 8.37.2.0
  ...
|
#
819dd92b |
| 04-Jun-2018 |
Alexei Starovoitov <ast@kernel.org> |
bpfilter: switch to CC from HOSTCC
check that CC can build executables and use that compiler instead of HOSTCC
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
Revision tags: v4.17 |
|
#
e95a5f54 |
| 23-May-2018 |
David S. Miller <davem@davemloft.net> |
Merge branch 'bpfilter'
Alexei Starovoitov says:
==================== bpfilter
v2->v3:
- followed Luis's suggestion and significantly simplified first patch with shmem_kernel_file_setup+kernel_write. Added kdoc for new helper
- fixed typos and race to access pipes with mutex
- tested with bpfilter being 'builtin'. CONFIG_BPFILTER_UMH=y|m both work. Interesting to see a usermode executable being embedded inside vmlinux.
- it doesn't hurt to enable bpfilter in .config. ip_setsockopt commands sent to usermode via pipes and -ENOPROTOOPT is returned from userspace, so kernel falls back to original iptables code
v1->v2:
this patch set is almost a full rewrite of the earlier umh modules approach. The v1 of patches and follow up discussion was covered by LWN: https://lwn.net/Articles/749108/
I believe the v2 addresses all issues brought up by Andy and others. Mainly there are zero changes to kernel/module.c. Instead of teaching module loading logic to recognize special umh module, let normal kernel modules execute part of its own .init.rodata as a new user space process (Andy's idea).
Patch 1 introduces this new helper:
  int fork_usermode_blob(void *data, size_t len, struct umh_info *info);
Input: data + len == executable file
Output:
  struct umh_info {
          struct file *pipe_to_umh;
          struct file *pipe_from_umh;
          pid_t pid;
  };
Advantages vs v1:
- the embedded user mode executable is stored as .init.rodata inside normal kernel module. These pages are freed when .ko finishes loading
- the elf file is copied into tmpfs file. The user mode process is swappable.
- the communication between user mode process and 'parent' kernel module is done via two unix pipes, hence protocol is not exposed to user space
- impossible to launch umh on its own (that was the main issue of v1) and impossible to be man-in-the-middle due to pipes
- bpfilter.ko consists of tiny kernel part that passes the data between kernel and umh via pipes and much bigger umh part that is doing all the work
- 'lsmod' shows bpfilter.ko as usual. 'rmmod bpfilter' removes kernel module and kills corresponding umh
- signed bpfilter.ko covers the whole image including umh code
Few issues:
- the user can still attach to the process and debug it with 'gdb /proc/pid/exe pid', but 'gdb -p pid' doesn't work. (a bit worse comparing to v1)
- tinyconfig will notice a small increase in .text
  +766 | TEXT | 7c8b94806bec umh: introduce fork_usermode_blob() helper
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d2ba09c1 |
| 21-May-2018 |
Alexei Starovoitov <ast@kernel.org> |
net: add skeleton of bpfilter kernel module
bpfilter.ko consists of bpfilter_kern.c (normal kernel module code) and user mode helper code that is embedded into bpfilter.ko
The steps to build bpfilter.ko are the following:
- main.c is compiled by HOSTCC into the bpfilter_umh elf executable file
- with quite a bit of objcopy and Makefile magic the bpfilter_umh elf file is converted into bpfilter_umh.o object file with _binary_net_bpfilter_bpfilter_umh_start and _end symbols
  Example:
  $ nm ./bld_x64/net/bpfilter/bpfilter_umh.o
  0000000000004cf8 T _binary_net_bpfilter_bpfilter_umh_end
  0000000000004cf8 A _binary_net_bpfilter_bpfilter_umh_size
  0000000000000000 T _binary_net_bpfilter_bpfilter_umh_start
- bpfilter_umh.o and bpfilter_kern.o are linked together into bpfilter.ko
bpfilter_kern.c is a normal kernel module code that calls the fork_usermode_blob() helper to execute part of its own data as a user mode process.
Notice that _binary_net_bpfilter_bpfilter_umh_start - end is placed into .init.rodata section, so it's freed as soon as __init function of bpfilter.ko is finished. As part of __init the bpfilter.ko does first request/reply action via the two unix pipes provided by fork_usermode_blob() helper to make sure that umh is healthy. If not it will kill it via pid.
Later bpfilter_process_sockopt() will be called from bpfilter hooks in get/setsockopt() to pass iptable commands into umh via bpfilter.ko
If admin does 'rmmod bpfilter' the __exit code of bpfilter.ko will kill umh as well.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
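The request/reply health check over two pipes described in this commit can be approximated in user space. The following is only an analogy built on fork() and pipe(), not the kernel helper itself: struct helper_info, spawn_helper(), check_helper(), stop_helper() and the "ping"/"pong" messages are hypothetical names chosen for illustration, and file descriptors stand in for the struct file pointers the kernel uses.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical user-space stand-in for struct umh_info (fds, not struct file *). */
struct helper_info {
	int to_helper;   /* parent writes requests here */
	int from_helper; /* parent reads replies here */
	pid_t pid;       /* used to kill the helper */
};

/* Helper process: answer one "ping" request with "pong", then exit. */
static void helper_main(int in_fd, int out_fd)
{
	char req[4];

	if (read(in_fd, req, sizeof(req)) != (ssize_t)sizeof(req) ||
	    memcmp(req, "ping", sizeof(req)) != 0)
		_exit(1);
	if (write(out_fd, "pong", 4) != 4)
		_exit(1);
	_exit(0);
}

/* Create both pipes and fork the helper. Returns 0 on success, -1 on error. */
static int spawn_helper(struct helper_info *info)
{
	int down[2], up[2];
	pid_t pid;

	assert(info != NULL);
	/* A dead helper should produce EPIPE on write, not kill the parent. */
	signal(SIGPIPE, SIG_IGN);
	if (pipe(down) < 0)
		return -1;
	if (pipe(up) < 0) {
		close(down[0]);
		close(down[1]);
		return -1;
	}
	pid = fork();
	if (pid < 0) {
		close(down[0]);
		close(down[1]);
		close(up[0]);
		close(up[1]);
		return -1;
	}
	if (pid == 0) {
		close(down[1]);
		close(up[0]);
		helper_main(down[0], up[1]); /* never returns */
	}
	close(down[0]);
	close(up[1]);
	info->to_helper = down[1];
	info->from_helper = up[0];
	info->pid = pid;
	return 0;
}

/* One request/reply round trip. Kills the helper via pid if it misbehaves. */
static int check_helper(const struct helper_info *info)
{
	char reply[4];
	int ok = write(info->to_helper, "ping", 4) == 4 &&
		 read(info->from_helper, reply, sizeof(reply)) == 4 &&
		 memcmp(reply, "pong", sizeof(reply)) == 0;

	if (!ok)
		kill(info->pid, SIGKILL);
	return ok ? 0 : -1;
}

/* Analogue of the module exit path: close the pipes, kill and reap the helper. */
static void stop_helper(struct helper_info *info)
{
	close(info->to_helper);
	close(info->from_helper);
	kill(info->pid, SIGKILL);
	waitpid(info->pid, NULL, 0);
}
```

A caller would run spawn_helper(), then check_helper(), and finally stop_helper(), which roughly mirrors the __init and __exit paths described above.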
|
#
dd8070bf |
| 23-May-2018 |
Johannes Berg <johannes.berg@intel.com> |
Merge remote-tracking branch 'net-next/master' into mac80211-next
Bring in net-next which had pulled in net, so I have the changes from mac80211 and can apply a patch that would otherwise conflict.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
#
01adc485 |
| 07-May-2018 |
David S. Miller <davem@davemloft.net> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Minor conflict, a CHECK was placed into an if() statement in net-next, whilst a newline was added to that CHECK call in 'net'. Thanks to Daniel for the merge resolution.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
08dbc7a6 |
| 03-May-2018 |
Alexei Starovoitov <ast@kernel.org> |
Merge branch 'AF_XDP-initial-support'
Björn Töpel says:
====================
This patch set introduces a new address family called AF_XDP that is optimized for high performance packet processing and, in upcoming patch sets, zero-copy semantics. In this patch set, we have removed all zero-copy related code in order to make it smaller, simpler and hopefully more review friendly. This patch set only supports copy-mode for the generic XDP path (XDP_SKB) for both RX and TX and copy-mode for RX using the XDP_DRV path. Zero-copy support requires XDP and driver changes that Jesper Dangaard Brouer is working on. Some of his work has already been accepted. We will publish our zero-copy support for RX and TX on top of his patch sets at a later point in time.
An AF_XDP socket (XSK) is created with the normal socket() syscall. Associated with each XSK are two queues: the RX queue and the TX queue. A socket can receive packets on the RX queue and it can send packets on the TX queue. These queues are registered and sized with the setsockopts XDP_RX_RING and XDP_TX_RING, respectively. It is mandatory to have at least one of these queues for each socket. In contrast to AF_PACKET V2/V3 these descriptor queues are separated from packet buffers. An RX or TX descriptor points to a data buffer in a memory area called a UMEM. RX and TX can share the same UMEM so that a packet does not have to be copied between RX and TX. Moreover, if a packet needs to be kept for a while due to a possible retransmit, the descriptor that points to that packet can be changed to point to another and reused right away. This again avoids copying data.
This new dedicated packet buffer area is called a UMEM. It consists of a number of equally sized frames and each frame has a unique frame id. A descriptor in one of the queues references a frame by referencing its frame id. The user space allocates memory for this UMEM using whatever means it feels is most appropriate (malloc, mmap, huge pages, etc). This memory area is then registered with the kernel using the new setsockopt XDP_UMEM_REG. The UMEM also has two queues: the FILL queue and the COMPLETION queue. The fill queue is used by the application to send down frame ids for the kernel to fill in with RX packet data. References to these frames will then appear in the RX queue of the XSK once they have been received. The completion queue, on the other hand, contains frame ids that the kernel has transmitted completely and can now be used again by user space, for either TX or RX. Thus, the frame ids appearing in the completion queue are ids that were previously transmitted using the TX queue. In summary, the RX and FILL queues are used for the RX path and the TX and COMPLETION queues are used for the TX path.
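The way frame ids circulate through the four queues can be illustrated with a small toy model. This is a sketch under stated assumptions, not the kernel's ring layout or the AF_XDP user API: struct ring, struct toy_xsk, toy_kernel_rx() and toy_kernel_tx() are hypothetical names, and the ring and frame sizes are arbitrary values picked for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE  8u    /* power of two, assumed for this sketch */
#define FRAME_SIZE 2048u /* arbitrary frame size for this sketch */
#define NUM_FRAMES 4u    /* fewer frames than ring slots, so rings never overflow */

/* Single-producer / single-consumer ring of frame ids. */
struct ring {
	uint32_t ids[RING_SIZE];
	uint32_t head; /* advanced by the producer */
	uint32_t tail; /* advanced by the consumer */
};

static int ring_push(struct ring *r, uint32_t id)
{
	if (r->head - r->tail == RING_SIZE)
		return -1; /* full */
	r->ids[r->head++ & (RING_SIZE - 1)] = id;
	return 0;
}

static int ring_pop(struct ring *r, uint32_t *id)
{
	if (r->head == r->tail)
		return -1; /* empty */
	*id = r->ids[r->tail++ & (RING_SIZE - 1)];
	return 0;
}

/* Toy socket: one UMEM plus its FILL/COMPLETION queues and the RX/TX queues. */
struct toy_xsk {
	uint8_t umem[NUM_FRAMES][FRAME_SIZE];
	uint32_t len[NUM_FRAMES];
	struct ring fill, rx, tx, completion;
};

/* "Kernel" RX path: take a frame id from FILL, copy the packet in, post it on RX. */
static int toy_kernel_rx(struct toy_xsk *x, const void *pkt, uint32_t len)
{
	uint32_t id;

	if (len > FRAME_SIZE || ring_pop(&x->fill, &id) < 0)
		return -1; /* no frame available: the packet is dropped */
	assert(id < NUM_FRAMES);
	memcpy(x->umem[id], pkt, len);
	x->len[id] = len;
	return ring_push(&x->rx, id);
}

/* "Kernel" TX path: consume a frame id from TX, then hand it back on COMPLETION. */
static int toy_kernel_tx(struct toy_xsk *x)
{
	uint32_t id;

	if (ring_pop(&x->tx, &id) < 0)
		return -1; /* nothing queued for transmit */
	return ring_push(&x->completion, id);
}
```

In this model an application that forwards traffic would pop an id from RX and push the same id onto TX, so the packet data never moves; once the id shows up on COMPLETION it can be pushed back onto FILL.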
The socket is then finally bound with a bind() call to a device and a specific queue id on that device, and it is not until bind is completed that traffic starts to flow. Note that in this patch set, all packet data is copied out to user-space.
A new feature in this patch set is that the UMEM can be shared between processes, if desired. If a process wants to do this, it simply skips the registration of the UMEM and its corresponding two queues, sets a flag in the bind call and submits the XSK of the process it would like to share UMEM with as well as its own newly created XSK socket. The new process will then receive frame id references in its own RX queue that point to this shared UMEM. Note that since the queue structures are single-consumer / single-producer (for performance reasons), the new process has to create its own socket with associated RX and TX queues, since it cannot share this with the other process. This is also the reason that there is only one set of FILL and COMPLETION queues per UMEM. It is the responsibility of a single process to handle the UMEM. If multiple-producer / multiple-consumer queues are implemented in the future, this requirement could be relaxed.
How are packets then distributed between these two XSKs? We have introduced a new BPF map called XSKMAP (or BPF_MAP_TYPE_XSKMAP in full). The user-space application can place an XSK at an arbitrary place in this map. The XDP program can then redirect a packet to a specific index in this map and at this point XDP validates that the XSK in that map was indeed bound to that device and queue number. If not, the packet is dropped. If the map is empty at that index, the packet is also dropped. This also means that it is currently mandatory to have an XDP program loaded (and one XSK in the XSKMAP) to be able to get any traffic to user space through the XSK.
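The drop-or-deliver decision described for XSKMAP redirects can be written down as a small decision function. This is an illustrative model only, not the kernel implementation: struct toy_sock, toy_redirect() and the map size are hypothetical, and a plain array of pointers stands in for the BPF map.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAP_SIZE 4u /* arbitrary map size for this sketch */

/* Hypothetical record of what device and queue a socket was bound to. */
struct toy_sock {
	int ifindex;
	unsigned int queue_id;
};

enum toy_action { TOY_DROP, TOY_DELIVER };

/*
 * Decide what happens to a packet that arrived on (rx_ifindex, rx_queue_id)
 * and was redirected to slot 'index' of the map.
 */
static enum toy_action toy_redirect(struct toy_sock *const map[TOY_MAP_SIZE],
				    unsigned int index,
				    int rx_ifindex, unsigned int rx_queue_id)
{
	const struct toy_sock *xs;

	assert(map != NULL);
	if (index >= TOY_MAP_SIZE)
		return TOY_DROP; /* out-of-range index */
	xs = map[index];
	if (xs == NULL)
		return TOY_DROP; /* map is empty at that index */
	if (xs->ifindex != rx_ifindex || xs->queue_id != rx_queue_id)
		return TOY_DROP; /* socket not bound to this device and queue */
	return TOY_DELIVER;
}
```

With a socket bound to device 3, queue 16 stored at index 1, only packets arriving on that device and queue and redirected to index 1 are delivered; every other combination is dropped.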
AF_XDP can operate in two different modes: XDP_SKB and XDP_DRV. If the driver does not have support for XDP, or XDP_SKB is explicitly chosen when loading the XDP program, XDP_SKB mode is employed, which uses SKBs together with the generic XDP support and copies out the data to user space. This is a fallback mode that works for any network device. On the other hand, if the driver has support for XDP, it will be used by the AF_XDP code to provide better performance, but there is still a copy of the data into user space.
There is a xdpsock benchmarking/test application included that demonstrates how to use AF_XDP sockets with both private and shared UMEMs. Say that you would like your UDP traffic from port 4242 to end up in queue 16, that we will enable AF_XDP on. Here, we use ethtool for this:
  ethtool -N p3p2 rx-flow-hash udp4 fn
  ethtool -N p3p2 flow-type udp4 src-port 4242 dst-port 4242 \
      action 16
Running the rxdrop benchmark in XDP_DRV mode can then be done using:
samples/bpf/xdpsock -i p3p2 -q 16 -r -N
For XDP_SKB mode, use the switch "-S" instead of "-N" and all options can be displayed with "-h", as usual.
We have run some benchmarks on a dual socket system with two Broadwell E5 2660 @ 2.0 GHz with hyperthreading turned off. Each socket has 14 cores which gives a total of 28, but only two cores are used in these experiments. One for TX/RX and one for the user space application. The memory is DDR4 @ 2133 MT/s (1067 MHz) and the size of each DIMM is 8192MB and with 8 of those DIMMs in the system we have 64 GB of total memory. The compiler used is gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0. The NIC is Intel I40E 40Gbit/s using the i40e driver.
Below are the results in Mpps of the I40E NIC benchmark runs for 64 and 1500 byte packets, generated by a commercial packet generator HW outputting packets at full 40 Gbit/s line rate. The results are without retpoline so that we can compare against previous numbers. With retpoline, the AF_XDP numbers drop by between 10 and 15 percent.
AF_XDP performance 64 byte packets. Results from V2 in parenthesis.
Benchmark   XDP_SKB    XDP_DRV
rxdrop      2.9(3.0)   9.6(9.5)
txpush      2.6(2.5)   NA*
l2fwd       1.9(1.9)   2.5(2.5) (TX using XDP_SKB in both cases)
AF_XDP performance 1500 byte packets:
Benchmark   XDP_SKB    XDP_DRV
rxdrop      2.1(2.2)   3.3(3.3)
l2fwd       1.4(1.4)   1.8(1.8) (TX using XDP_SKB in both cases)
* NA since we have no support for TX using the XDP_DRV infrastructure in this patch set. This is for a future patch set since it involves changes to the XDP NDOs. Some of this has been upstreamed by Jesper Dangaard Brouer.
XDP performance on our system as a base line:
64 byte packets:
XDP stats     CPU   pps           issue-pps
XDP-RX CPU    16    32.3(32.9)M   0
1500 byte packets:
XDP stats     CPU   pps           issue-pps
XDP-RX CPU    16    3.3(3.3)M     0
Changes from V2:
* Fixed a race in XSKMAP map found by Will. The code has been completely rearchitected and is now simpler, faster, and hopefully also not racy. Please review and check if it holds.
If you would like to diff V2 against V3, you can find them here: https://github.com/bjoto/linux/tree/af-xdp-v2-on-bpf-next https://github.com/bjoto/linux/tree/af-xdp-v3-on-bpf-next
The structure of the patch set is as follows:
Patches 1-3: Basic socket and umem plumbing
Patches 4-9: RX support together with the new XSKMAP
Patches 10-13: TX support
Patch 14: Statistics support with getsockopt()
Patch 15: Sample application
We based this patch set on bpf-next commit a3fe1f6f2ada ("tools: bpftool: change time format for program 'loaded at:' information")
To do for this patch set:
* Syzkaller torture session being worked on
Post-series plan:
* Optimize performance
* Kernel selftest
* Kernel load module support of AF_XDP would be nice. Unclear how to achieve this though since our XDP code depends on net/core.
* Support for AF_XDP sockets without an XDP program loaded. In this case all the traffic on a queue should go up to the user space socket.
* Daniel Borkmann's suggestion for a "copy to XDP socket, and return XDP_PASS" for a tcpdump-like functionality.
* And of course getting to zero-copy support in small increments, starting with TX then adding RX.
Thanks: Björn and Magnus
====================
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
c0c77d8f |
| 02-May-2018 |
Björn Töpel <bjorn.topel@intel.com> |
xsk: add user memory registration support sockopt
In this commit the base structure of the AF_XDP address family is set up. Further, we introduce the ability to register a window of user memory to the kernel via the XDP_UMEM_REG setsockopt syscall. The memory window is viewed by an AF_XDP socket as a set of equally large frames. After a user memory registration all frames are "owned" by the user application, and not the kernel.
v2: More robust checks on umem creation and unaccount on error. Call set_page_dirty_lock on cleanup. Simplified xdp_umem_reg.
Co-authored-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
552c69b3 |
| 02-May-2018 |
John Johansen <john.johansen@canonical.com> |
Merge tag 'v4.17-rc3' into apparmor-next
Linux v4.17-rc3
Merge in v4.17 for LSM updates
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
#
664b0bae |
| 04-Apr-2018 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 4.17 merge window.
|
Revision tags: v4.16 |
|
#
c6380ecd |
| 08-Mar-2018 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v4.16-rc4' into next
Sync up with mainline to bring in RAVE MFD device core.
|