History log of /openbmc/linux/net/socket.c (Results 176 – 200 of 3052)
Revision Date Author Comments
# 4c612826 25-Aug-2022 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from ipsec and netfilter (with one broken Fixes tag).

Current release - new code bugs:

- dsa: don't dereference NULL extack in dsa_slave_changeupper()

- dpaa: fix <1G ethernet on LS1046ARDB

- neigh: don't call kfree_skb() under spin_lock_irqsave()

Previous releases - regressions:

- r8152: fix the RX FIFO settings when suspending

- dsa: microchip: keep compatibility with device tree blobs with no
phy-mode

- Revert "net: macsec: update SCI upon MAC address change."

- Revert "xfrm: update SA curlft.use_time", comply with RFC 2367

Previous releases - always broken:

- netfilter: conntrack: work around exceeded TCP receive window

- ipsec: fix a null pointer dereference of dst->dev on a metadata dst
in xfrm_lookup_with_ifid

- moxa: get rid of asymmetry in DMA mapping/unmapping

- dsa: microchip: make learning configurable and keep it off while
standalone

- ice: xsk: prohibit usage of non-balanced queue id

- rxrpc: fix locking in rxrpc's sendmsg

Misc:

- another chunk of sysctl data race silencing"

* tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
net: lantiq_xrx200: restore buffer if memory allocation failed
net: lantiq_xrx200: fix lock under memory pressure
net: lantiq_xrx200: confirm skb is allocated before using
net: stmmac: work around sporadic tx issue on link-up
ionic: VF initial random MAC address if no assigned mac
ionic: fix up issues with handling EAGAIN on FW cmds
ionic: clear broken state on generation change
rxrpc: Fix locking in rxrpc's sendmsg
net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2
MAINTAINERS: rectify file entry in BONDING DRIVER
i40e: Fix incorrect address type for IPv6 flow rules
ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
net: Fix a data-race around sysctl_somaxconn.
net: Fix a data-race around netdev_unregister_timeout_secs.
net: Fix a data-race around gro_normal_batch.
net: Fix data-races around sysctl_devconf_inherit_init_net.
net: Fix data-races around sysctl_fb_tunnels_only_for_init_net.
net: Fix a data-race around netdev_budget_usecs.
net: Fix data-races around sysctl_max_skb_frags.
net: Fix a data-race around netdev_budget.
...


Revision tags: v5.15.63
# 0c4a9541 24-Aug-2022 David S. Miller <davem@davemloft.net>

Merge branch 'sysctl-data-races'

Kuniyuki Iwashima says:

====================
net: sysctl: Fix data-races around net.core.XXX

This series fixes data-races around all knobs in net_core_table and
netns_core_table except for bpf stuff.

These knobs are skipped:

- 4 bpf knobs
- netdev_rss_key: Written only once by net_get_random_once() and
read-only knob
- rps_sock_flow_entries: Protected with sock_flow_mutex
- flow_limit_cpu_bitmap: Protected with flow_limit_update_mutex
- flow_limit_table_len: Protected with flow_limit_update_mutex
- default_qdisc: Protected with qdisc_mod_lock
- warnings: Unused
- high_order_alloc_disable: Protected with static_key_mutex
- skb_defer_max: Already using READ_ONCE()
- sysctl_txrehash: Already using READ_ONCE()

Note 5th patch fixes net.core.message_cost and net.core.message_burst,
and lib/ratelimit.c does not have an explicit maintainer.

Changes:
v3:
* Fix build failures of CONFIG_SYSCTL=n case in 13th & 14th patches

v2: https://lore.kernel.org/netdev/20220818035227.81567-1-kuniyu@amazon.com/
* Remove 4 bpf knobs and added 6 knobs

v1: https://lore.kernel.org/netdev/20220816052347.70042-1-kuniyu@amazon.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>


# 3c9ba81d 23-Aug-2022 Kuniyuki Iwashima <kuniyu@amazon.com>

net: Fix a data-race around sysctl_somaxconn.

While reading sysctl_somaxconn, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
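
The pattern this commit describes can be sketched in user space as follows. This is a minimal illustration, not the actual patch: `READ_ONCE_SKETCH`, `somaxconn_knob` and `clamp_backlog` are hypothetical names, the macro is only a simplified stand-in for the kernel's READ_ONCE(), and it relies on the GCC/Clang `__typeof__` extension.

```c
/*
 * Assumption: simplified stand-in for the kernel's READ_ONCE().
 * The volatile access makes the compiler emit a single load, so the
 * value cannot be silently re-read later in the function.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the sysctl knob; a writer may change it at any time. */
static int somaxconn_knob = 4096;

/*
 * Clamp a listen() style backlog against the knob. The knob is read once
 * into a local, so the comparison and the assignment use the same value
 * even if the knob is rewritten concurrently.
 */
static int clamp_backlog(int backlog)
{
    int somaxconn = READ_ONCE_SKETCH(somaxconn_knob);

    if ((unsigned int)backlog > (unsigned int)somaxconn)
        backlog = somaxconn;
    return backlog;
}
```

Without the single annotated read, a compiler would be free to load the knob twice, once for the comparison and once for the assignment, and the two loads could observe different values.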


Revision tags: v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19
# 0f60d288 30-Jan-2022 Al Viro <viro@zeniv.linux.org.uk>

dynamic_dname(): drop unused dentry argument

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 93fbff11 17-Aug-2022 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'i2c/make_remove_callback_void-immutable' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux into next

Sync up with the latest I2C code base to get updated prototype of I2C
bus remove() method.


# cf36ae3e 17-Aug-2022 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-fixes into drm-misc-fixes

Backmerging for v6.0-rc1.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


# 44627916 05-Aug-2022 Andreas Gruenbacher <agruenba@redhat.com>

Merge part of branch 'for-next.instantiate' into for-next


# fc30eea1 04-Aug-2022 Rodrigo Vivi <rodrigo.vivi@intel.com>

Merge drm/drm-next into drm-intel-next

Sync up, in particular to get the drm-intel-gt-next stuff.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# f86d1fbb 03-Aug-2022 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking changes from Paolo Abeni:
"Core:

- Refactor the forward memory allocation to better cope with memory
pressure with many open sockets, moving from a per socket cache to
a per-CPU one

- Replace rwlocks with RCU for better fairness in ping, raw sockets
and IP multicast router.

- Network-side support for IO uring zero-copy send.

- A few skb drop reason improvements, including codegen the source
file with string mapping instead of using macro magic.

- Rename reference tracking helpers to a more consistent netdev_*
schema.

- Adapt u64_stats_t type to address load/store tearing issues.

- Refine debug helper usage to reduce the log noise caused by bots.

BPF:

- Improve socket map performance, avoiding skb cloning on read
operation.

- Add support for 64 bits enum, to match types exposed by kernel.

- Introduce support for sleepable uprobes program.

- Introduce support for enum textual representation in libbpf.

- New helpers to implement synproxy with eBPF/XDP.

- Improve loop performances, inlining indirect calls when possible.

- Removed all the deprecated libbpf APIs.

- Implement new eBPF-based LSM flavor.

- Add type match support, which allows accurate queries to the eBPF
used types.

- A few TCP congestion control framework usability improvements.

- Add new infrastructure to manipulate CT entries via eBPF programs.

- Allow for livepatch (KLP) and BPF trampolines to attach to the same
kernel function.

Protocols:

- Introduce per network namespace lookup tables for unix sockets,
increasing scalability and reducing contention.

- Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support.

- Add support to forcibly close TIME_WAIT TCP sockets via user-space
tools.

- Significant performance improvement for the TLS 1.3 receive path,
both for zero-copy and not-zero-copy.

- Support for changing the initial MPTCP subflow priority/backup
status

- Introduce virtually contiguous buffers for sockets over RDMA, to
cope better with memory pressure.

- Extend CAN ethtool support with timestamping capabilities

- Refactor CAN build infrastructure to allow building only the needed
features.

Driver API:

- Remove devlink mutex to allow parallel commands on multiple links.

- Add support for pause stats in distributed switch.

- Implement devlink helpers to query and flash line cards.

- New helper for phy mode to register conversion.

New hardware / drivers:

- Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro.

- Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch.

- Ethernet DSA driver for the Microchip LAN937x switch.

- Ethernet PHY driver for the Aquantia AQR113C EPHY.

- CAN driver for the OBD-II ELM327 interface.

- CAN driver for RZ/N1 SJA1000 CAN controller.

- Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device.

Drivers:

- Intel Ethernet NICs:
- i40e: add support for vlan pruning
- i40e: add support for XDP fragmented packets
- ice: improved vlan offload support
- ice: add support for PPPoE offload

- Mellanox Ethernet (mlx5)
- refactor packet steering offload for performance and scalability
- extend support for TC offload
- refactor devlink code to clean-up the locking schema
- support stacked vlans for bridge offloads
- use TLS objects pool to improve connection rate

- Netronome Ethernet NICs (nfp):
- extend support for IPv6 fields mangling offload
- add support for vepa mode in HW bridge
- better support for virtio data path acceleration (VDPA)
- enable TSO by default

- Microsoft vNIC driver (mana)
- add support for XDP redirect

- Others Ethernet drivers:
- bonding: add per-port priority support
- microchip lan743x: extend phy support
- Fungible funeth: support UDP segmentation offload and XDP xmit
- Solarflare EF100: add support for virtual function representors
- MediaTek SoC: add XDP support

- Mellanox Ethernet/IB switch (mlxsw):
- dropped support for unreleased H/W (XM router).
- improved stats accuracy
- unified bridge model conversion improving scalability (parts 1-6)
- support for PTP in Spectrum-2 asics

- Broadcom PHYs
- add PTP support for BCM54210E
- add support for the BCM53128 internal PHY

- Marvell Ethernet switches (prestera):
- implement support for multicast forwarding offload

- Embedded Ethernet switches:
- refactor OcteonTx MAC filter for better scalability
- improve TC H/W offload for the Felix driver
- refactor the Microchip ksz8 and ksz9477 drivers to share the
probe code (parts 1, 2), add support for phylink mac
configuration

- Other WiFi:
- Microchip wilc1000: disable WEP support and enable WPA3
- Atheros ath10k: encapsulation offload support

Old code removal:

- Neterion vxge ethernet driver: this has been untouched for more than 10 years"

* tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1890 commits)
doc: sfp-phylink: Fix a broken reference
wireguard: selftests: support UML
wireguard: allowedips: don't corrupt stack when detecting overflow
wireguard: selftests: update config fragments
wireguard: ratelimiter: use hrtimer in selftest
net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ
net: usb: ax88179_178a: Bind only to vendor-specific interface
selftests: net: fix IOAM test skip return code
net: usb: make USB_RTL8153_ECM non user configurable
net: marvell: prestera: remove reduntant code
octeontx2-pf: Reduce minimum mtu size to 60
net: devlink: Fix missing mutex_unlock() call
net/tls: Remove redundant workqueue flush before destroy
net: txgbe: Fix an error handling path in txgbe_probe()
net: dsa: Fix spelling mistakes and cleanup code
Documentation: devlink: add add devlink-selftests to the table of contents
dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock
net: ionic: fix error check for vlan flags in ionic_set_nic_features()
net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr()
nfp: flower: add support for tunnel offload without key ID
...


# 42df1cbf 02-Aug-2022 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'for-5.20/io_uring-zerocopy-send-2022-07-29' of git://git.kernel.dk/linux-block

Pull io_uring zerocopy support from Jens Axboe:
"This adds efficient support for zerocopy sends through
io_uring. Both ipv4 and ipv6 are supported, as well as both TCP and
UDP.

The core network changes to support this are in a stable branch from
Jakub that both io_uring and net-next have pulled in, and the io_uring
changes are layered on top of that.

All of the work has been done by Pavel"

* tag 'for-5.20/io_uring-zerocopy-send-2022-07-29' of git://git.kernel.dk/linux-block: (34 commits)
io_uring: notification completion optimisation
io_uring: export req alloc from core
io_uring/net: use unsigned for flags
io_uring/net: make page accounting more consistent
io_uring/net: checks errors of zc mem accounting
io_uring/net: improve io_get_notif_slot types
selftests/io_uring: test zerocopy send
io_uring: enable managed frags with register buffers
io_uring: add zc notification flush requests
io_uring: rename IORING_OP_FILES_UPDATE
io_uring: flush notifiers after sendzc
io_uring: sendzc with fixed buffers
io_uring: allow to pass addr into sendzc
io_uring: account locked pages for non-fixed zc
io_uring: wire send zc request type
io_uring: add notification slot registration
io_uring: add rsrc referencing for notifiers
io_uring: complete notifiers in tw
io_uring: cache struct io_notif
io_uring: add zc notification infrastructure
...


# b349b118 02-Aug-2022 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block

Pull io_uring updates from Jens Axboe:

- As per (valid) complaint in the last merge window, fs/io_uring.c has
grown quite large these days. io_uring isn't really tied to fs
either, as it supports a wide variety of functionality outside of
that.

Move the code to io_uring/ and split it into files that each
implement a specific request type, and split some code into helpers
as well. The code is organized a lot better like this, and io_uring.c
is now < 4K LOC (me).

- Deprecate the epoll_ctl opcode. It'll still work, just trigger a
warning once if used. If we don't get any complaints on this, and I
don't expect any, then we can fully remove it in a future release
(me).

- Improve the cancel hash locking (Hao)

- kbuf cleanups (Hao)

- Efficiency improvements to the task_work handling (Dylan, Pavel)

- Provided buffer improvements (Dylan)

- Add support for recv/recvmsg multishot. This is similar to
the accept (or poll) support we have for multishot, where a single
SQE can trigger every time data is received. For applications that
expect to do more than a few receives on an instantiated socket, this
greatly improves efficiency (Dylan).

- Efficiency improvements for poll handling (Pavel)

- Poll cancelation improvements (Pavel)

- Allow specifying a range for direct descriptor allocations (Pavel)

- Cleanup the cqe32 handling (Pavel)

- Move io_uring types to greatly cleanup the tracing (Pavel)

- Tons of great code cleanups and improvements (Pavel)

- Add a way to do sync cancelations rather than through the sqe -> cqe
interface, as that's a lot easier to use for some use cases (me).

- Add support to IORING_OP_MSG_RING for sending direct descriptors to a
different ring. This avoids the usually problematic SCM case, as we
disallow those. (me)

- Make the per-command alloc cache we use for apoll generic, place
limits on it, and use it for netmsg as well (me).

- Various cleanups (me, Michal, Gustavo, Uros)

* tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block: (172 commits)
io_uring: ensure REQ_F_ISREG is set async offload
net: fix compat pointer in get_compat_msghdr()
io_uring: Don't require reinitable percpu_ref
io_uring: fix types in io_recvmsg_multishot_overflow
io_uring: Use atomic_long_try_cmpxchg in __io_account_mem
io_uring: support multishot in recvmsg
net: copy from user before calling __get_compat_msghdr
net: copy from user before calling __copy_msghdr
io_uring: support 0 length iov in buffer select in compat
io_uring: fix multishot ending when not polled
io_uring: add netmsg cache
io_uring: impose max limit on apoll cache
io_uring: add abstraction around apoll cache
io_uring: move apoll cache to poll.c
io_uring: consolidate hash_locked io-wq handling
io_uring: clear REQ_F_HASH_LOCKED on hash removal
io_uring: don't race double poll setting REQ_F_ASYNC_DATA
io_uring: don't miss setting REQ_F_DOUBLE_POLL
io_uring: disable multishot recvmsg
io_uring: only trace one of complete or overflow
...


# 8bb5e7f4 02-Aug-2022 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 5.20 (or 6.0) merge window.


# 3c69a99b 24-Jul-2022 Michael Ellerman <mpe@ellerman.id.au>

Merge tag 'v5.19-rc7' into fixes

Merge v5.19-rc7 into fixes to bring in:
d11219ad53dc ("amdgpu: disable powerpc support for the newer display engine")


# 4effe18f 24-Jul-2022 Jens Axboe <axboe@kernel.dk>

Merge branch 'for-5.20/io_uring' into for-5.20/io_uring-zerocopy-send

* for-5.20/io_uring: (716 commits)
io_uring: ensure REQ_F_ISREG is set async offload
net: fix compat pointer in get_compat_msghdr()
io_uring: Don't require reinitable percpu_ref
io_uring: fix types in io_recvmsg_multishot_overflow
io_uring: Use atomic_long_try_cmpxchg in __io_account_mem
io_uring: support multishot in recvmsg
net: copy from user before calling __get_compat_msghdr
net: copy from user before calling __copy_msghdr
io_uring: support 0 length iov in buffer select in compat
io_uring: fix multishot ending when not polled
io_uring: add netmsg cache
io_uring: impose max limit on apoll cache
io_uring: add abstraction around apoll cache
io_uring: move apoll cache to poll.c
io_uring: consolidate hash_locked io-wq handling
io_uring: clear REQ_F_HASH_LOCKED on hash removal
io_uring: don't race double poll setting REQ_F_ASYNC_DATA
io_uring: don't miss setting REQ_F_DOUBLE_POLL
io_uring: disable multishot recvmsg
io_uring: only trace one of complete or overflow
...

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 32e09298 24-Jul-2022 Jens Axboe <axboe@kernel.dk>

Merge branch 'io_uring-zerocopy-send' of git://git.kernel.org/pub/scm/linux/kernel/git/kuba/linux into for-5.20/io_uring-zerocopy-send

Merge prep net series for io_uring tx zc from Jakub's tree.

* 'io_uring-zerocopy-send' of git://git.kernel.org/pub/scm/linux/kernel/git/kuba/linux:
net: fix uninitialised msghdr->sg_from_iter
tcp: support externally provided ubufs
ipv6/udp: support externally provided ubufs
ipv4/udp: support externally provided ubufs
net: introduce __skb_fill_page_desc_noacc
net: introduce managed frags infrastructure
net: Allow custom iter handler in msghdr
skbuff: carry external ubuf_info in msghdr
skbuff: add SKBFL_DONT_ORPHAN flag
skbuff: don't mix ubuf_info from different sources
ipv6: avoid partial copy for zc
ipv4: avoid partial copy for zc


# 7fa875b8 14-Jul-2022 Dylan Yudaken <dylany@fb.com>

net: copy from user before calling __copy_msghdr

this is in preparation for multishot receive from io_uring, where it needs
to have access to the original struct user_msghdr.

functionally this should be a no-op.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220714110258.1336200-2-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 7f9eee19 19-Jul-2022 Jakub Kicinski <kuba@kernel.org>

Merge branch 'io_uring-zerocopy-send' of git://git.kernel.org/pub/scm/linux/kernel/git/kuba/linux

Pavel Begunkov says:

====================
io_uring zerocopy send

The patchset implements io_uring zerocopy send. It works with both registered
and normal buffers, mixing is allowed but not recommended. Apart from usual
request completions, just as with MSG_ZEROCOPY, io_uring separately notifies
the userspace when buffers are freed and can be reused (see API design below),
which is delivered into io_uring's Completion Queue. Those "buffer-free"
notifications are not necessarily per request, but the userspace has control
over it and should explicitly attach a number of requests to a single
notification. The series also adds some internal optimisations when used with
registered buffers like removing page referencing.

From the kernel networking perspective there are two main changes. The first
one is passing ubuf_info into the network layer from io_uring (inside of an
in kernel struct msghdr). This allows extra optimisations, e.g. ubuf_info
caching on the io_uring side, but also helps to avoid cross-referencing
and synchronisation problems. The second part is an optional optimisation
removing page referencing for requests with registered buffers.

Benchmarking UDP with an optimised version of the selftest (see [1]), which
sends a bunch of requests, waits for completions and repeats. "+ flush" column
posts one additional "buffer-free" notification per request, and just "zc"
doesn't post buffer notifications at all.

NIC (requests / second):
IO size | non-zc | zc | zc + flush
4000 | 495134 | 606420 (+22%) | 558971 (+12%)
1500 | 551808 | 577116 (+4.5%) | 565803 (+2.5%)
1000 | 584677 | 592088 (+1.2%) | 560885 (-4%)
600 | 596292 | 598550 (+0.4%) | 555366 (-6.7%)

dummy (requests / second):
IO size | non-zc | zc | zc + flush
8000 | 1299916 | 2396600 (+84%) | 2224219 (+71%)
4000 | 1869230 | 2344146 (+25%) | 2170069 (+16%)
1200 | 2071617 | 2361960 (+14%) | 2203052 (+6%)
600 | 2106794 | 2381527 (+13%) | 2195295 (+4%)
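
The percentages in the two tables above appear to be the relative change of each column against the non-zc column. A minimal sketch of that arithmetic, with `percent_change` as a hypothetical helper name that is not part of the series:

```c
/*
 * Relative change of `value` against `baseline`, in percent.
 * Positive means more requests per second than the baseline.
 */
static double percent_change(double baseline, double value)
{
    return (value - baseline) / baseline * 100.0;
}
```

For the first NIC row, `percent_change(495134, 606420)` is roughly 22.5, which matches the "+22%" reported for zc at an IO size of 4000.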

Previously it also brought a massive performance speedup compared to the
msg_zerocopy tool (see [3]), which is probably not super interesting. There
is also an additional bunch of refcounting optimisations that were omitted from
the series for simplicity and because they don't change the picture drastically.
They will be sent as a follow-up, along with flushing optimisations that close the
performance gap between the last two columns.

For TCP on localhost (with hacks enabling localhost zerocopy) and including
additional overhead for receive:

IO size | non-zc | zc
1200 | 4174 | 4148
4096 | 7597 | 11228

Using a real NIC with 1200 bytes, zc is ~5-10% worse than non-zc. Maybe the
omitted optimisations will help somewhat. It should look better for 4000,
but that couldn't be tested properly because of setup problems.

Links:

liburing (benchmark + tests):
[1] https://github.com/isilence/liburing/tree/zc_v4

kernel repo:
[2] https://github.com/isilence/linux/tree/zc_v4

RFC v1:
[3] https://lore.kernel.org/io-uring/cover.1638282789.git.asml.silence@gmail.com/

RFC v2:
https://lore.kernel.org/io-uring/cover.1640029579.git.asml.silence@gmail.com/

Net patches based:
git@github.com:isilence/linux.git zc_v4-net-base
or
https://github.com/isilence/linux/tree/zc_v4-net-base

API design overview:

The series introduces an io_uring concept of notifiers. From the userspace
perspective it's an entity to which it can bind one or more requests and then
request to flush it. Flushing a notifier makes it impossible to attach new
requests to it, and instructs the notifier to post a completion once all
requests attached to it are completed and the kernel doesn't need the buffers
anymore.

Notifications are stored in notification slots, which should be registered as
an array in io_uring. Each slot stores only one notifier at any particular
moment. Flushing removes it from the slot and the slot automatically replaces
it with a new notifier. All operations with notifiers are done by specifying
an index of a slot it's currently in.

When registering a notification the userspace specifies a u64 tag for each
slot, which will be copied into notification completion entries as
cqe::user_data. cqe::res is 0 and cqe::flags is equal to a wrap-around u32
sequence number counting notifiers of a slot.

====================

Link: https://lore.kernel.org/r/cover.1657643355.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
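
The notifier and slot semantics from the API design overview above can be modelled in plain user-space C. This is only a sketch of the described behaviour, not io_uring's implementation or its userspace API: every type and function name below (`struct notifier`, `struct slot`, `slot_register`, `slot_attach`, `slot_flush`, `request_done`, and so on) is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified completion entry with the three fields the cover letter mentions. */
struct cqe { uint64_t user_data; int32_t res; uint32_t flags; };

struct notifier {
    uint64_t tag;       /* slot tag, reported as cqe::user_data */
    uint32_t seq;       /* per-slot sequence number, reported as cqe::flags */
    unsigned int refs;  /* attached requests that have not completed yet */
    bool flushed;       /* once set, no new requests can reach this notifier */
};

struct slot {
    uint64_t tag;
    uint32_t next_seq;      /* wraps around naturally as an unsigned 32-bit value */
    struct notifier *cur;   /* a slot holds exactly one notifier at a time */
};

static struct notifier *notifier_new(struct slot *s)
{
    struct notifier *n = calloc(1, sizeof(*n));

    if (!n)
        abort();
    n->tag = s->tag;
    n->seq = s->next_seq++;
    return n;
}

/* Register a slot with a user tag and give it its first notifier. */
static void slot_register(struct slot *s, uint64_t tag)
{
    s->tag = tag;
    s->next_seq = 0;
    s->cur = notifier_new(s);
}

/* Bind one request to whichever notifier currently sits in the slot. */
static struct notifier *slot_attach(struct slot *s)
{
    s->cur->refs++;
    return s->cur;
}

/* Post the "buffer-free" completion once flushed and no requests remain. */
static bool notifier_try_complete(struct notifier *n, struct cqe *out)
{
    if (!n->flushed || n->refs)
        return false;
    out->user_data = n->tag;
    out->res = 0;
    out->flags = n->seq;
    free(n);
    return true;
}

/* Flush: detach the notifier from the slot and install a fresh one. */
static struct notifier *slot_flush(struct slot *s)
{
    struct notifier *old = s->cur;

    old->flushed = true;
    s->cur = notifier_new(s);
    return old;
}

/* A request finished with its buffers; returns true if a completion was posted. */
static bool request_done(struct notifier *n, struct cqe *out)
{
    n->refs--;
    return notifier_try_complete(n, out);
}

/* Drop the notifier still parked in the slot. */
static void slot_unregister(struct slot *s)
{
    free(s->cur);
    s->cur = NULL;
}
```

In this model a caller attaches several requests through `slot_attach()`, calls `slot_flush()`, and receives a single completion from the last `request_done()` with `user_data` set to the slot tag and `flags` set to the notifier's sequence number. A notifier flushed with no requests attached needs an explicit `notifier_try_complete()` call in this sketch.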


# 7c701d92 12-Jul-2022 Pavel Begunkov <asml.silence@gmail.com>

skbuff: carry external ubuf_info in msghdr

skbuff: carry external ubuf_info in msghdr

Make it possible for in-kernel network callers like io_uring to pass in a
custom ubuf_info by setting it in a new field of struct msghdr.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# dc14036f 18-Jul-2022 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge 5.19-rc7 into usb-next

We need the USB fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 0698461a 18-Jul-2022 Arnaldo Carvalho de Melo <acme@redhat.com>

Merge remote-tracking branch 'torvalds/master' into perf/core

To update the perf/core codebase.

Fix conflict by moving arch__post_evsel_config(evsel, attr) to the end
of evsel__config(), after what was added in:

49c692b7dfc9b6c0 ("perf offcpu: Accept allowed sample types only")

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 45598fd4 13-Jul-2022 Christian Brauner <brauner@kernel.org>

Merge tag 'ovl-fixes-5.19-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into fs.idmapped.overlay.acl

Bring in Miklos' tree which contains the temporary fix for POSIX ACLs
with overlayfs on top of idmapped layers. We will add a proper fix on
top of it and then revert the temporary fix.

Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>


# 7e501332 14-Jul-2022 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'v5.19-rc6' into usb-linus

The usb-serial fixes are based on this branch, so merge it in here to
prevent merge confusion when merging in that tree.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# f83d9396 14-Jul-2022 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-next into drm-misc-next-fixes

Backmerging from drm/drm-next for the final fixes that will go
into v5.20.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


# 4de395f2 13-Jul-2022 Maxime Ripard <maxime@cerno.tech>

Merge drm/drm-next into drm-misc-next

I need to have some vc4 patches merged in -rc4, but drm-misc-next is
only at -rc2 for now.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>


# e23a5e14 12-Jul-2022 Dave Airlie <airlied@redhat.com>

Backmerge tag 'v5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next

Backmerge in rc6 so I can merge msm next easier.

Linux 5.19-rc6

Signed-off-by: Dave Airlie <airlied@redhat.com>
