#
eb42cebb |
| 12-Jul-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: add zc notification infrastructure
Add the internal part of send zerocopy notifications. There are two main structures. The first one is struct io_notif, which carries a struct ubuf_info inside and maps 1:1 to it. io_uring will bind a number of zerocopy send requests to it and ask to complete (aka flush) it. Once it is flushed and all attached requests and skbs complete, it'll generate one and only one CQE. These notifiers are intended to be passed into the network layer as struct msghdr::msg_ubuf.
The second concept is notification slots. The userspace will be able to register an array of slots and subsequently address them by index in the array. Slots are independent of each other. Each slot can have only one notifier at a time (called the active notifier) but many notifiers during its lifetime. While active, a notifier is not going to post any completion, but the userspace can attach requests to it by specifying the corresponding slot while issuing send zc requests. Eventually, the userspace will want to "flush" the notifier, losing any way to attach new requests to it; however, it can use the next automatically added notifier of this slot or of any other slot.
When the network layer is done with all enqueued skbs attached to a notifier and no longer needs the user data specified in them, the flushed notifier will post a CQE.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3ecf54c31a85762bf679b0a432c9f43ecf7e61cc.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
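The slot and notifier lifetime described in this commit can be pictured with a small userspace model. This is only a sketch under assumptions: the names notif_model, slot_model, slot_attach() and slot_flush() are hypothetical and are not the kernel's structures or functions, and the idea that the slot itself holds one reference until the flush is an assumption made to illustrate the "one and only one CQE" rule.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model, not the kernel's struct io_notif. */
struct notif_model {
    unsigned refs;        /* slot reference + attached requests/skbs */
    unsigned cqes_posted; /* how many CQEs this notifier generated */
};

/* A slot holds at most one active notifier at a time. */
struct slot_model {
    struct notif_model *active;
};

/* Assumption: a new notifier starts with one reference owned by its slot. */
static void notif_init(struct notif_model *n)
{
    n->refs = 1;
    n->cqes_posted = 0;
}

/* Drop one reference; the last put posts the single CQE. */
static void notif_put(struct notif_model *n)
{
    assert(n->refs > 0);
    if (--n->refs == 0)
        n->cqes_posted++;
}

/* Attach a send request to the slot's active notifier. */
static struct notif_model *slot_attach(struct slot_model *s)
{
    assert(s->active != NULL);
    s->active->refs++;
    return s->active;
}

/*
 * Flush: the old notifier can no longer be reached through the slot,
 * the slot's reference is dropped, and "next" becomes the active one.
 */
static struct notif_model *slot_flush(struct slot_model *s,
                                      struct notif_model *next)
{
    struct notif_model *old = s->active;

    s->active = next;
    notif_put(old);
    return old;
}
```

In this model a flushed notifier posts its CQE only after every attached request has also dropped its reference, and requests issued after the flush land on the next notifier of the slot.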
|
#
e70cb608 |
| 12-Jul-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: export io_put_task()
Make io_put_task() available to non-core parts of io_uring; we'll need it for the notification infrastructure.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3686807d4c03b72e389947b0e8692d4d44334ef0.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
f6b543fd |
| 21-Jul-2022 |
Jens Axboe <axboe@kernel.dk> |
io_uring: ensure REQ_F_ISREG is set async offload
If we're offloading requests directly to io-wq because IOSQE_ASYNC was set in the sqe, we can miss hashing writes appropriately because we haven't set REQ_F_ISREG yet. This can cause a performance regression with buffered writes, as io-wq then no longer correctly serializes writes to that file.
Ensure that we set the flags in io_prep_async_work(), which will cause the io-wq work item to be hashed appropriately.
Fixes: 584b0180f0f4 ("io_uring: move read/write file prep state into actual opcode handler") Link: https://lore.kernel.org/io-uring/20220608080054.GB22428@xsang-OptiPlex-9020/ Reported-by: kernel test robot <oliver.sang@intel.com> Tested-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.15.54, v5.15.53, v5.15.52 |
|
#
e0486f3f |
| 30-Jun-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: only trace one of complete or overflow
In overflow we see a duplicate line in the trace, and in some cases 3 lines (if the initial io_post_aux_cqe fails). Instead, just trace once for each CQE.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-13-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
52120f0f |
| 30-Jun-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: add allow_overflow to io_post_aux_cqe
Some use cases of io_post_aux_cqe would not want to overflow as is, but might want to change the flags/result. For example, multishot receive requires in-order CQEs, so if there is an overflow it would need to stop receiving until the overflow is taken care of.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-8-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
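The effect of an allow_overflow argument can be sketched with a toy completion queue. This is a userspace model under assumptions, not the kernel code: cq_model, CQ_ENTRIES and model_post_aux_cqe() are hypothetical names, the ring is never consumed, and the overflow list is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CQ_ENTRIES 2 /* tiny ring so that it fills up quickly */

struct cqe_model {
    uint64_t user_data;
    int32_t res;
    uint32_t flags;
};

struct cq_model {
    struct cqe_model ring[CQ_ENTRIES];
    unsigned tail;       /* CQEs stored in the ring */
    unsigned overflowed; /* CQEs parked on the (modelled) overflow list */
};

/*
 * Returns true if the CQE was recorded somewhere. With allow_overflow
 * set to false a full ring makes the call fail without side effects,
 * so the caller can stop or change flags/result before retrying.
 */
static bool model_post_aux_cqe(struct cq_model *cq, uint64_t user_data,
                               int32_t res, uint32_t flags,
                               bool allow_overflow)
{
    assert(cq != NULL);

    if (cq->tail < CQ_ENTRIES) {
        cq->ring[cq->tail].user_data = user_data;
        cq->ring[cq->tail].res = res;
        cq->ring[cq->tail].flags = flags;
        cq->tail++;
        return true;
    }
    if (allow_overflow) {
        cq->overflowed++;
        return true;
    }
    return false;
}
```

A multishot-style caller in this model would pass false and treat a false return as the signal to pause until the ring has room again, which keeps its CQEs in order.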
|
#
114eccdf |
| 30-Jun-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: add IOU_STOP_MULTISHOT return code
For multishot we want a way to signal the caller that multishot has ended, but this might not be an error return.
For example, sockets return 0 when closed, which should end a multishot recv but still produce a CQE with result 0.
Introduce IOU_STOP_MULTISHOT, which does this and indicates that the return code is stored inside req->cqe.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-7-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
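The split between "multishot has ended" and "the result of the final CQE" can be illustrated with a short model. This is a sketch under assumptions: req_model, the MODEL_* codes and model_recv_multishot_step() are hypothetical names, and the numeric values are arbitrary rather than the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes; the values are arbitrary in this model. */
enum {
    MODEL_CONTINUE = 0,       /* keep the multishot request armed */
    MODEL_STOP_MULTISHOT = 1, /* stop; final result lives in the request */
};

struct req_model {
    int32_t cqe_res; /* stands in for the result stored in req->cqe */
};

/*
 * One receive step. sock_ret > 0 is data, 0 means the peer closed the
 * socket and a negative value is an error. The result is always stored
 * in the request, so a stop code does not have to double as the result.
 */
static int model_recv_multishot_step(struct req_model *req, int32_t sock_ret)
{
    assert(req != NULL);

    req->cqe_res = sock_ret;
    return sock_ret > 0 ? MODEL_CONTINUE : MODEL_STOP_MULTISHOT;
}

/* The final CQE is an error only if the stored result is negative. */
static bool model_final_cqe_is_error(const struct req_model *req)
{
    return req->cqe_res < 0;
}
```

With this shape a closed socket ends the multishot request with a final result of 0, which the caller can tell apart from a genuine error.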
|
Revision tags: v5.15.51, v5.15.50 |
|
#
ed5ccb3b |
| 22-Jun-2022 |
Dylan Yudaken <dylany@fb.com> |
io_uring: remove priority tw list optimisation
This optimisation has some built in assumptions that make it easy to introduce bugs. It also does not have clear wins that make it worth keeping.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220622134028.2013417-2-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.15.49 |
|
#
a6b21fbb |
| 21-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: move list helpers to a separate file
It's annoying to have io-wq.h as a dependency every time we want some of struct io_wq_work_list helpers, move them into a separate file.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c1d891ce12b30767d1d2a3b7db2ca3abc1ecc4a2.1655802465.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
625d38b3 |
| 21-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: improve io_run_task_work()
Since SQPOLL now uses TWA_SIGNAL_NO_IPI, there won't be task work items without TIF_NOTIFY_SIGNAL. Simplify io_run_task_work() by removing the task->task_works check. Even though it looks like it doesn't cause extra cache bouncing, it's still nice to not touch it an extra time when it might not be cached.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/75d4f34b0c671075892821a409e28da6cb1d64fe.1655802465.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
9da070b1 |
| 19-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: consistent naming for inline completion
Improve naming of the inline/deferred completion helper so it's consistent with its *_post counterpart. Add some comments and extra lockdeps to ensure the locking is done right.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/797c619943dac06529e9d3fcb16e4c3cde6ad1a3.1655684496.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
46929b08 |
| 19-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: add io_commit_cqring_flush()
Since __io_commit_cqring_flush users moved to different files, introduce io_commit_cqring_flush() helper and encapsulate all flags testing details inside.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0da03887435dd9869ffe46dcd3962bf104afcca3.1655684496.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
25399321 |
| 19-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: introduce locking helpers for CQE posting
spin_lock(&ctx->completion_lock);
/* post CQEs */
io_commit_cqring(ctx);
spin_unlock(&ctx->completion_lock);
io_cqring_ev_posted(ctx);
We have many places repeating this sequence, and the three-function unlock section is not perfect from the maintenance perspective and also makes it harder to add new locking/sync tricks.
Introduce two helpers. io_cq_lock(), which is simple and only grabs ->completion_lock, and io_cq_unlock_post() encapsulating the three call section.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/fe0c682bf7f7b55d9be55b0d034be9c1949277dc.1655684496.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
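The shape of the two helpers can be shown with a single-threaded model. This is a sketch under assumptions: ctx_model and the model_* functions are hypothetical, a boolean flag stands in for the spinlock, and counters stand in for committing the ring and waking waiters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded stand-in for the ring context. */
struct ctx_model {
    bool completion_locked; /* models ->completion_lock being held */
    unsigned cached_tail;   /* CQEs filled while holding the lock */
    unsigned visible_tail;  /* tail published to the consumer */
    unsigned wakeups;       /* "events posted" notifications sent */
};

/* Simple helper: only takes the lock. */
static void model_cq_lock(struct ctx_model *ctx)
{
    assert(!ctx->completion_locked);
    ctx->completion_locked = true;
}

/* Filling a CQE is only legal with the lock held. */
static void model_fill_cqe(struct ctx_model *ctx)
{
    assert(ctx->completion_locked);
    ctx->cached_tail++;
}

/* Encapsulates the three-call section: commit, unlock, notify. */
static void model_cq_unlock_post(struct ctx_model *ctx)
{
    assert(ctx->completion_locked);
    ctx->visible_tail = ctx->cached_tail; /* commit step */
    ctx->completion_locked = false;       /* unlock step */
    ctx->wakeups++;                       /* notify step */
}
```

Callers in this model write lock, fill, unlock_post, so any later change to the commit or wakeup logic happens in one place instead of at every call site.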
|
#
d9dee430 |
| 19-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: remove ->flush_cqes optimisation
It's not clear how widely used IOSQE_CQE_SKIP_SUCCESS is, and how often the ->flush_cqes flag prevents a completion from being flushed. Sometimes it's a high level of concurrency that enables it at least for one CQE, but sometimes it doesn't save much because nobody is waiting on the CQ.
Remove the ->flush_cqes flag and the optimisation; it should benefit the normal use case. Note that there is no spurious eventfd problem with that, as checks for spuriousness were incorporated into io_eventfd_signal().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/692e81eeddccc096f449a7960365fa7b4a18f8e6.1655637157.git.asml.silence@gmail.com [axboe: remove now dead state->flush_cqes variable] Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
9046c641 |
| 19-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: reshuffle io_uring/io_uring.h
It's a good idea to first do forward declarations and then inline helpers, otherwise we will keep stumbling on dependencies between them.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1d7fa6672ed43f20ccc0c54ae201369ebc3ebfab.1655637157.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
ab1c84d8 |
| 16-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: make io_uring_types.h public
Move io_uring types to linux/include. We need them public so tracing can see the definitions and we can clean up trace/events/io_uring.h.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a15f12e8cb7289b2de0deaddcc7518d98a132d17.1655384063.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
b3659a65 |
| 17-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: change ->cqe_cached invariant for CQE32
With IORING_SETUP_CQE32 ->cqe_cached doesn't store a real address but rather an implicit offset into cqes. Store the real cqe pointer and increment it accordingly if CQE32.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1ee1838cba16bed96381a006950b36ba640d998c.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
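Keeping a real pointer in a cached-CQE field and stepping it by one or two slots can be sketched as follows. This is a userspace model under assumptions: cqe16, ring_model, ring_init() and ring_get_cqe() are hypothetical names, and the ring is a fixed array of 16-byte slots in which a 32-byte CQE takes two adjacent slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 8

/* One 16-byte slot; a 32-byte CQE occupies two adjacent slots. */
struct cqe16 {
    uint64_t user_data;
    int32_t res;
    uint32_t flags;
};

struct ring_model {
    struct cqe16 cqes[RING_SLOTS];
    struct cqe16 *cqe_cached;   /* real address of the next free CQE */
    struct cqe16 *cqe_sentinel; /* one past the last usable slot */
    bool cqe32;                 /* true if every CQE is 32 bytes */
};

static void ring_init(struct ring_model *r, bool cqe32)
{
    r->cqe32 = cqe32;
    r->cqe_cached = r->cqes;
    r->cqe_sentinel = r->cqes + RING_SLOTS;
}

/*
 * Hand out the cached CQE and advance the pointer by the CQE size in
 * slots, so the pointer is always a real address and never an index
 * that still has to be shifted.
 */
static struct cqe16 *ring_get_cqe(struct ring_model *r)
{
    size_t step = r->cqe32 ? 2 : 1;
    struct cqe16 *cqe;

    assert(r->cqe_cached <= r->cqe_sentinel);
    if ((size_t)(r->cqe_sentinel - r->cqe_cached) < step)
        return NULL; /* no room left in this model */

    cqe = r->cqe_cached;
    r->cqe_cached += step;
    return cqe;
}
```

In this model the caller can write through the returned pointer directly in both modes; only the increment depends on the CQE32 setting.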
|
#
e8c328c3 |
| 17-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: deduplicate io_get_cqe() calls
Deduplicate calls to io_get_cqe() from __io_fill_cqe_req().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4fa077986cc3abab7c59ff4e7c390c783885465f.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
ae5735c6 |
| 17-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: deduplicate __io_fill_cqe_req tracing
Deduplicate two trace_io_uring_complete() calls in __io_fill_cqe_req().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/277ed85dba5189ab7d932164b314013a0f0b0fdc.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
68494a65 |
| 17-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: introduce io_req_cqe_overflow()
__io_fill_cqe_req() is hot and inlined, so we want it to be as small as possible. Add io_req_cqe_overflow(), which accepts only a request and does all the overflow accounting, and use it to replace two calls to the six-argument io_cqring_event_overflow().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/048b9fbcce56814d77a1a540409c98c3d383edcb.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
faf88dde |
| 17-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: don't inline __io_get_cqe()
__io_get_cqe() is not as hot as io_get_cqe(), so there is no need to inline it; un-inlining it sheds ~500B from the binary.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c1ac829198a881b7af8710926f99a3559b9f24c0.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
d245bca6 |
| 17-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: don't expose io_fill_cqe_aux()
Deduplicate some code and add a helper for filling an aux CQE, locking and notification.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b7c6557c8f9dc5c4cfb01292116c682a0ff61081.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.15.48 |
|
#
75d7b3ae |
| 16-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: kill REQ_F_COMPLETE_INLINE
REQ_F_COMPLETE_INLINE is only needed to delay queueing into the completion list to io_queue_sqe() as __io_req_complete() is inlined and we don't want to bloat the kernel.
As now we complete in a more centralised fashion in io_issue_sqe() we can get rid of the flag and queue to the list directly.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/600ba20a9338b8a39b249b23d3d177803613dde4.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
aa1e90f6 |
| 15-Jun-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: move small helpers to headers
There is a bunch of inline helpers that will be useful not only to the core of io_uring, move them to headers.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/22df99c83723e44cba7e945e8519e64e3642c064.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.15.47 |
|
#
f3b44f92 |
| 13-Jun-2022 |
Jens Axboe <axboe@kernel.dk> |
io_uring: move read/write related opcodes to its own file
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.15.46, v5.15.45, v5.15.44 |
|
#
c98817e6 |
| 26-May-2022 |
Jens Axboe <axboe@kernel.dk> |
io_uring: move remaining file table manipulation to filetable.c
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|