#
4effe18f |
| 24-Jul-2022 |
Jens Axboe <axboe@kernel.dk> |
Merge branch 'for-5.20/io_uring' into for-5.20/io_uring-zerocopy-send
* for-5.20/io_uring: (716 commits) io_uring: ensure REQ_F_ISREG is set async offload net: fix compat pointer in get_compat_m
Merge branch 'for-5.20/io_uring' into for-5.20/io_uring-zerocopy-send
* for-5.20/io_uring: (716 commits) io_uring: ensure REQ_F_ISREG is set async offload net: fix compat pointer in get_compat_msghdr() io_uring: Don't require reinitable percpu_ref io_uring: fix types in io_recvmsg_multishot_overflow io_uring: Use atomic_long_try_cmpxchg in __io_account_mem io_uring: support multishot in recvmsg net: copy from user before calling __get_compat_msghdr net: copy from user before calling __copy_msghdr io_uring: support 0 length iov in buffer select in compat io_uring: fix multishot ending when not polled io_uring: add netmsg cache io_uring: impose max limit on apoll cache io_uring: add abstraction around apoll cache io_uring: move apoll cache to poll.c io_uring: consolidate hash_locked io-wq handling io_uring: clear REQ_F_HASH_LOCKED on hash removal io_uring: don't race double poll setting REQ_F_ASYNC_DATA io_uring: don't miss setting REQ_F_DOUBLE_POLL io_uring: disable multishot recvmsg io_uring: only trace one of complete or overflow ...
Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
4f6a94d3 |
| 15-Jul-2022 |
Jens Axboe <axboe@kernel.dk> |
net: fix compat pointer in get_compat_msghdr()
A previous change enabled external users to copy the data before calling __get_compat_msghdr(), but didn't modify get_compat_msghdr() or __io_compat_re
net: fix compat pointer in get_compat_msghdr()
A previous change enabled external users to copy the data before calling __get_compat_msghdr(), but didn't modify get_compat_msghdr() or __io_compat_recvmsg_copy_hdr() to take that into account. They are both stil passing in the __user pointer rather than the copied version.
Ensure we pass in the kernel struct, not the pointer to the user data.
Link: https://lore.kernel.org/all/46439555-644d-08a1-7d66-16f8f9a320f0@samsung.com/ Fixes: 1a3e4e94a1b9 ("net: copy from user before calling __get_compat_msghdr") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
9b0fc3c0 |
| 15-Jul-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: fix types in io_recvmsg_multishot_overflow
io_recvmsg_multishot_overflow had incorrect types on non x64 system. But also it had an unnecessary INT_MAX check, which could just be done by ch
io_uring: fix types in io_recvmsg_multishot_overflow
io_recvmsg_multishot_overflow had incorrect types on non x64 system. But also it had an unnecessary INT_MAX check, which could just be done by changing the type of the accumulator to int (also simplifying the casts).
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: a8b38c4ce724 ("io_uring: support multishot in recvmsg") Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220715130252.610639-1-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
9bb66906 |
| 14-Jul-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: support multishot in recvmsg
Similar to multishot recv, this will require provided buffers to be used. However recvmsg is much more complex than recv as it has multiple outputs. Specifical
io_uring: support multishot in recvmsg
Similar to multishot recv, this will require provided buffers to be used. However recvmsg is much more complex than recv as it has multiple outputs. Specifically flags, name, and control messages.
Support this by introducing a new struct io_uring_recvmsg_out with 4 fields. namelen, controllen and flags match the similar out fields in msghdr from standard recvmsg(2), payloadlen is the length of the payload following the header. This struct is placed at the start of the returned buffer. Based on what the user specifies in struct msghdr, the next bytes of the buffer will be name (the next msg_namelen bytes), and then control (the next msg_controllen bytes). The payload will come at the end. The return value in the CQE is the total used size of the provided buffer.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220714110258.1336200-4-dylany@fb.com [axboe: style fixups, see link] Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
72c531f8 |
| 14-Jul-2022 |
Dylan Yudaken <dylany@fb.com> |
net: copy from user before calling __get_compat_msghdr
this is in preparation for multishot receive from io_uring, where it needs to have access to the original struct user_msghdr.
functionally thi
net: copy from user before calling __get_compat_msghdr
this is in preparation for multishot receive from io_uring, where it needs to have access to the original struct user_msghdr.
functionally this should be a no-op.
Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220714110258.1336200-3-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
7fa875b8 |
| 14-Jul-2022 |
Dylan Yudaken <dylany@fb.com> |
net: copy from user before calling __copy_msghdr
this is in preparation for multishot receive from io_uring, where it needs to have access to the original struct user_msghdr.
functionally this shou
net: copy from user before calling __copy_msghdr
this is in preparation for multishot receive from io_uring, where it needs to have access to the original struct user_msghdr.
functionally this should be a no-op.
Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220714110258.1336200-2-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.54 |
|
#
6d2f75a0 |
| 08-Jul-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: support 0 length iov in buffer select in compat
Match up work done in "io_uring: allow iov_len = 0 for recvmsg and buffer select", but for compat code path.
Fixes: a68caad69ce5 ("io_uring
io_uring: support 0 length iov in buffer select in compat
Match up work done in "io_uring: allow iov_len = 0 for recvmsg and buffer select", but for compat code path.
Fixes: a68caad69ce5 ("io_uring: allow iov_len = 0 for recvmsg and buffer select") Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220708181838.1495428-3-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
e2df2ccb |
| 08-Jul-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: fix multishot ending when not polled
If multishot is not actually polling then return IOU_OK rather than the result. If the result was > 0 this will confuse things further up the callstack
io_uring: fix multishot ending when not polled
If multishot is not actually polling then return IOU_OK rather than the result. If the result was > 0 this will confuse things further up the callstack which expect a return <= 0.
Fixes: 1300ebb20286 ("io_uring: multishot recv") Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220708181838.1495428-2-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
43e0bbbd |
| 07-Jul-2022 |
Jens Axboe <axboe@kernel.dk> |
io_uring: add netmsg cache
For recvmsg/sendmsg, if they don't complete inline, we currently need to allocate a struct io_async_msghdr for each request. This is a somewhat large struct.
Hook up send
io_uring: add netmsg cache
For recvmsg/sendmsg, if they don't complete inline, we currently need to allocate a struct io_async_msghdr for each request. This is a somewhat large struct.
Hook up sendmsg/recvmsg to use the io_alloc_cache. This reduces the alloc + free overhead considerably, yielding 4-5% of extra performance running netbench.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.53 |
|
#
cf0dd952 |
| 04-Jul-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: disable multishot recvmsg
recvmsg has semantics that do not make it trivial to extend to multishot. Specifically it has user pointers and returns data in the original parameter. In order t
io_uring: disable multishot recvmsg
recvmsg has semantics that do not make it trivial to extend to multishot. Specifically it has user pointers and returns data in the original parameter. In order to make this API useful these will need to be somehow included with the provided buffers.
For now remove multishot for recvmsg as it is not useful.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220704140106.200167-1-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.52 |
|
#
b3fdea6e |
| 30-Jun-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: multishot recv
Support multishot receive for io_uring. Typical server applications will run a loop where for each recv CQE it requeues another recv/recvmsg.
This can be simplified by usin
io_uring: multishot recv
Support multishot receive for io_uring. Typical server applications will run a loop where for each recv CQE it requeues another recv/recvmsg.
This can be simplified by using the existing multishot functionality combined with io_uring's provided buffers. The API is to add the IORING_RECV_MULTISHOT flag to the SQE. CQEs will then be posted (with IORING_CQE_F_MORE flag set) when data is available and is read. Once an error occurs or the socket ends, the multishot will be removed and a completion without IORING_CQE_F_MORE will be posted.
The benefit to this is that the recv is much more performant. * Subsequent receives are queued up straight away without requiring the application to finish a processing loop. * If there are more data in the socket (sat the provided buffer size is smaller than the socket buffer) then the data is immediately returned, improving batching. * Poll is only armed once and reused, saving CPU cycles
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-11-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
cbd25748 |
| 30-Jun-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: fix multishot accept ordering
Similar to multishot poll, drop multishot accept when CQE overflow occurs.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220
io_uring: fix multishot accept ordering
Similar to multishot poll, drop multishot accept when CQE overflow occurs.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-10-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
52120f0f |
| 30-Jun-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: add allow_overflow to io_post_aux_cqe
Some use cases of io_post_aux_cqe would not want to overflow as is, but might want to change the flags/result. For example multishot receive requires
io_uring: add allow_overflow to io_post_aux_cqe
Some use cases of io_post_aux_cqe would not want to overflow as is, but might want to change the flags/result. For example multishot receive requires in order CQE, and so if there is an overflow it would need to stop receiving until the overflow is taken care of.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-8-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
d4e097da |
| 30-Jun-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: recycle buffers on error
Rather than passing an error back to the user with a buffer attached, recycle the buffer immediately.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://l
io_uring: recycle buffers on error
Rather than passing an error back to the user with a buffer attached, recycle the buffer immediately.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-5-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
5702196e |
| 30-Jun-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: allow iov_len = 0 for recvmsg and buffer select
When using BUFFER_SELECT there is no technical requirement that the user actually provides iov, and this removes one copy_from_user call.
S
io_uring: allow iov_len = 0 for recvmsg and buffer select
When using BUFFER_SELECT there is no technical requirement that the user actually provides iov, and this removes one copy_from_user call.
So allow iov_len to be 0.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-4-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.51, v5.15.50, v5.15.49 |
|
#
27a9d66f |
| 16-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: kill extra io_uring_types.h includes
io_uring/io_uring.h already includes io_uring_types.h, no need to include it every time. Kill it in a bunch of places, it prepares us for following pat
io_uring: kill extra io_uring_types.h includes
io_uring/io_uring.h already includes io_uring_types.h, no need to include it every time. Kill it in a bunch of places, it prepares us for following patches.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/94d8c943fbe0ef949981c508ddcee7fc1c18850f.1655384063.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
d245bca6 |
| 17-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: don't expose io_fill_cqe_aux()
Deduplicate some code and add a helper for filling an aux CQE, locking and notification.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https:
io_uring: don't expose io_fill_cqe_aux()
Deduplicate some code and add a helper for filling an aux CQE, locking and notification.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b7c6557c8f9dc5c4cfb01292116c682a0ff61081.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.48, v5.15.47 |
|
#
3b77495a |
| 13-Jun-2022 |
Jens Axboe <axboe@kernel.dk> |
io_uring: split provided buffers handling into its own file
Move both the opcodes related to it, and the internals code dealing with it.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.15.46, v5.15.45, v5.15.44, v5.15.43 |
|
#
f9ead18c |
| 25-May-2022 |
Jens Axboe <axboe@kernel.dk> |
io_uring: split network related opcodes into its own file
While at it, convert the handlers to just use io_eopnotsupp_prep() if CONFIG_NET isn't set.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|