Revision tags: v6.6.25, v6.6.24, v6.6.23
# 4cbc5e93 | 15-Mar-2024 | Pavel Begunkov <asml.silence@gmail.com>

io_uring: fix poll_remove stalled req completion

[ Upstream commit 5e3afe580a9f5ca173a6bd55ffe10948796ef7e5 ]
Taking the ctx lock is not enough to use the deferred request completion infrastructure: the request will get queued into the list, but nobody expects it there, so it will sit there until the next io_submit_flush_completions(). The cancellation path is not worth optimising, so complete the request via task_work (tw) instead.
Fixes: ef7dfac51d8ed ("io_uring/poll: serialize poll linked timer start with poll removal")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c446740bc16858f8a2a8dcdce899812f21d15f23.1710514702.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
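A minimal sketch of the pattern the fix describes, assuming io_uring's internal helpers io_req_set_res(), io_req_task_complete and io_req_task_work_add() (not the verbatim patch): rather than completing the disarmed request under the ctx lock, hand it to task_work so the completion is always processed.

    /* sketch: punt completion of the canceled poll request to task_work
     * instead of the deferred-completion list, where nobody would look */
    static void complete_via_tw(struct io_kiocb *preq, int res)
    {
        io_req_set_res(preq, res, 0);
        preq->io_task_work.func = io_req_task_complete;
        io_req_task_work_add(preq);
    }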
Revision tags: v6.6.16, v6.6.15
# 7cbd3aa5 | 29-Jan-2024 | Jens Axboe <axboe@kernel.dk>

io_uring/poll: add requeue return code from poll multishot handling

Commit 704ea888d646cb9d715662944cf389c823252ee0 upstream.
Since our poll handling is edge triggered, multishot handlers retry internally until they know that no more data is available. In preparation for limiting these retries, add an internal return code, IOU_REQUEUE, which can be used to inform the poll backend about the handler wanting to retry, but that this should happen through a normal task_work requeue rather than keep hammering on the issue side for this one request.
No functional changes in this patch, nobody is using this return code just yet.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
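As a rough sketch (the exact numeric value is an assumption), the new code sits alongside the other internal issue-side return codes:

    /* sketch of the internal issue return codes; IOU_REQUEUE's value is
     * assumed, picked outside the valid -errno range on purpose */
    enum {
        IOU_OK                  = 0,
        IOU_ISSUE_SKIP_COMPLETE = -EIOCBQUEUED,
        /* requeue via task_work rather than hammering the issue side */
        IOU_REQUEUE             = -3072,
    };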
# 0848bf7e | 29-Jan-2024 | Jens Axboe <axboe@kernel.dk>

io_uring/poll: move poll execution helpers higher up

Commit e84b01a880f635e3084a361afba41f95ff500d12 upstream.
In preparation for calling __io_poll_execute() higher up, move the functions to avoid forward declarations.
No functional changes in this patch.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revision tags: v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46
# b6b2bb58 | 11-Aug-2023 | Pavel Begunkov <asml.silence@gmail.com>

io_uring: never overflow io_aux_cqe

Now that all callers of io_aux_cqe() set allow_overflow to false, remove the parameter and disallow overflowing auxiliary multishot CQEs.
When the CQ is full, the callers, and all multishot requests in general, are expected to complete the request. That prevents indefinite in-background growth of the overflow list and lets userspace handle the backlog at its own pace.
Resubmitting a request should also be faster than accounting a bunch of overflows, so it should be better for perf when it happens, but a well-behaved userspace should try to avoid overflows in any case.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/bb20d14d708ea174721e58bb53786b0521e4dd6d.1691757663.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
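A sketch of the resulting caller-side contract in poll's multishot path, with the condition and return value assumed from the surrounding commits rather than quoted verbatim:

    /* sketch: the CQ is full, so instead of overflowing, complete the
     * multishot request and let userspace drain and resubmit */
    if (!io_aux_cqe(req, ts->locked, mask & poll->events, IORING_CQE_F_MORE)) {
        io_req_set_res(req, mask, 0);
        return IOU_POLL_REMOVE_POLL_USE_RES;
    }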
Revision tags: v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4
# d7b8b079 | 23-Jun-2023 | Jens Axboe <axboe@kernel.dk>

io_uring/cancel: support opcode based lookup and cancelation
Add IORING_ASYNC_CANCEL_OP flag for cancelation, which allows the application to target cancelation based on the opcode of the original request.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
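A hypothetical userspace sketch of the flag; carrying the target opcode in sqe->len is an assumption based on this description:

    #include <string.h>
    #include <linux/io_uring.h>

    /* sketch: cancel every pending request submitted with a given opcode */
    static void prep_cancel_by_opcode(struct io_uring_sqe *sqe, __u8 op)
    {
        memset(sqe, 0, sizeof(*sqe));
        sqe->opcode = IORING_OP_ASYNC_CANCEL;
        sqe->cancel_flags = IORING_ASYNC_CANCEL_OP | IORING_ASYNC_CANCEL_ALL;
        sqe->len = op;    /* opcode to match (assumed field) */
    }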
# a30badf6 | 23-Jun-2023 | Jens Axboe <axboe@kernel.dk>

io_uring: use cancelation match helper for poll and timeout requests

Get rid of the request vs io_cancel_data checking and just use the exported helper for this.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
# ad711c5d | 23-Jun-2023 | Jens Axboe <axboe@kernel.dk>

io_uring/poll: always set 'ctx' in io_cancel_data
This isn't strictly necessary for this callsite, as it uses its internal lookup for this cancelation purpose. But let's be consistent with how it's used in general and set ctx as well.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
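A sketch of the resulting initializer (field names assumed from the neighboring commits):

    /* sketch: fill .ctx unconditionally, even though this callsite's
     * own lookup wouldn't strictly need it */
    struct io_cancel_data cd = {
        .ctx  = req->ctx,
        .data = poll_update->old_user_data,
    };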
Revision tags: v6.1.35
# ef7dfac5 | 17-Jun-2023 | Jens Axboe <axboe@kernel.dk>

io_uring/poll: serialize poll linked timer start with poll removal
We selectively grab the ctx->uring_lock for poll update/removal, but we really should grab it from the start to fully synchronize with linked timeouts. Normally this is indeed the case, but if requests are forced async by the application, we don't fully cover removal and timer disarm within the uring_lock.
Make this simpler by having consistent locking state for poll removal.
Cc: stable@vger.kernel.org # 6.1+
Reported-by: Querijn Voet <querijnqyn@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
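A sketch of the shape of the change, assuming the usual io_ring_submit_lock()/io_ring_submit_unlock() helpers; the point is that the lock now covers the whole removal, including linked-timeout disarm:

    /* sketch, not the verbatim patch */
    int io_poll_remove(struct io_kiocb *req, unsigned int issue_flags)
    {
        struct io_ring_ctx *ctx = req->ctx;

        io_ring_submit_lock(ctx, issue_flags);   /* held for the full op */
        /* ... find the poll request, disarm it and its linked timeout,
         * then update or cancel ... */
        io_ring_submit_unlock(ctx, issue_flags);
        return 0;
    }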
Revision tags: v6.1.34, v6.1.33
# d86eaed1 | 07-Jun-2023 | Jens Axboe <axboe@kernel.dk>

io_uring: cleanup io_aux_cqe() API
Everybody is passing in the request, so get rid of the io_ring_ctx and explicit user_data pass-in. Both the ctx and user_data can be deduced from the request at hand.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Revision tags: v6.1.32
# c92fcfc2 | 02-Jun-2023 | Jens Axboe <axboe@kernel.dk>

io_uring: avoid indirect function calls for the hottest task_work
We use task_work for a variety of reasons, but doing completions or triggering retry after poll are by far the hottest two. Use the indirect function call wrappers to avoid the indirect function call if CONFIG_RETPOLINE is set.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
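A sketch of the dispatch using the kernel's generic INDIRECT_CALL_2() wrapper; the two candidate handlers are taken from this description, the exact callsite is assumed:

    #include <linux/indirect_call_wrapper.h>

    /* sketch: with retpolines, compare the pointer against the two
     * hottest targets and call them directly; fall back to a real
     * indirect call only on a miss */
    INDIRECT_CALL_2(req->io_task_work.func,
                    io_poll_task_func, io_req_task_complete,
                    req, ts);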
Revision tags: v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22
# a282967c | 27-Mar-2023 | Pavel Begunkov <asml.silence@gmail.com>

io_uring: encapsulate task_work state
For task works we're passing around a bool pointer for whether the current ring is locked or not. Let's wrap it in a structure; that will make it more opaque, preventing abuse, and will also help us pass more info in the future if needed.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1ecec9483d58696e248d1bfd52cf62b04442df1d.1679931367.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
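A sketch of the wrapper as described (a single field at this point, with room to grow):

    /* sketch: opaque task_work state replacing the bare bool pointer */
    struct io_tw_state {
        /* whether ctx->uring_lock is held by the caller */
        bool locked;
    };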
# 005308f7 | 27-Mar-2023 | Jens Axboe <axboe@kernel.dk>

io_uring/poll: clear single/double poll flags on poll arming
Unless we have at least one entry queued, don't call into io_poll_remove_entries(). Normally this isn't possible, but if we retry poll then we can have ->nr_entries cleared again as we're setting it up. If this happens for a poll retry, then we'll still have at least REQ_F_SINGLE_POLL set. io_poll_remove_entries() then thinks it has entries to remove.
Clear REQ_F_SINGLE_POLL and REQ_F_DOUBLE_POLL unconditionally when arming a poll request.
Fixes: c16bda37594f ("io_uring/poll: allow some retries for poll triggering spuriously")
Cc: stable@vger.kernel.org
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
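The fix itself is small; a sketch of the unconditional clearing when arming, using the flag names from the message:

    /* sketch: forget poll entries from a previous (retried) arming
     * attempt so io_poll_remove_entries() never sees stale flags */
    req->flags &= ~(REQ_F_SINGLE_POLL | REQ_F_DOUBLE_POLL);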
Revision tags: v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15
# 1947ddf9 | 27-Feb-2023 | Jens Axboe <axboe@kernel.dk>

io_uring/poll: don't pass in wake func to io_init_poll_iocb()
We only use one, and it's io_poll_wake(). Hardwire that in the initial init, as well as in __io_queue_proc() if we're setting up for double poll.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
# c16bda37 | 25-Feb-2023 | Jens Axboe <axboe@kernel.dk>

io_uring/poll: allow some retries for poll triggering spuriously
If we get woken spuriously when polling and fail the operation with -EAGAIN again, then we generally only allow polling again if data had been transferred at some point. This is indicated with REQ_F_PARTIAL_IO. However, if the spurious poll triggers when the socket was originally empty, then we haven't transferred data yet and we will fail the poll re-arm. This either punts the socket to io-wq if it's blocking, or it fails the request with -EAGAIN if not. Neither condition is desirable, as the former will slow things down, while the latter will confuse the application.
We want to ensure that a repeated poll trigger doesn't lead to infinite work making no progress, that's what the REQ_F_PARTIAL_IO check was for. But it doesn't protect against a loop post the first receive, and it's unnecessarily strict if we started out with an empty socket.
Add a somewhat random retry count, just to put an upper limit on the potential number of retries that will be done. This should be high enough that we won't really hit it in practice, unless something needs to be aborted anyway.
Cc: stable@vger.kernel.org # v5.10+
Link: https://github.com/axboe/liburing/issues/364
Signed-off-by: Jens Axboe <axboe@kernel.dk>
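A sketch of the bound, with the constant name and counter placement assumed rather than taken from the patch:

    /* sketch: allow a bounded number of spurious-wakeup re-arms even
     * when no data has been transferred yet (no REQ_F_PARTIAL_IO) */
    #define APOLL_MAX_RETRY 30

    static bool io_poll_can_retry(struct async_poll *apoll)
    {
        return apoll->retries++ < APOLL_MAX_RETRY; /* hypothetical field */
    }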
Revision tags: v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6
# a7dd2782 | 12-Jan-2023 | Breno Leitao <leitao@debian.org>

io_uring: Rename struct io_op_def

The current io_op_def struct is becoming huge and the name is a bit generic.
The goal of this patch is to rename this struct to `io_issue_def`. This struct will contain the hot functions associated with the issue code path.
For now, this patch only renames the structure, and an upcoming patch will break up the structure in two, moving the non-issue fields to a secondary struct.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/20230112144411.2624698-1-leitao@debian.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
# 8caa03f1 | 20-Jan-2023 | Jens Axboe <axboe@kernel.dk>

io_uring/poll: don't reissue in case of poll race on multishot request
A previous commit fixed a poll race that can occur, but it's only applicable for multishot requests. For a multishot request, we can safely ignore a spurious wakeup, as we never leave the waitqueue to begin with.
A blunt reissue of a multishot armed request can cause us to leak a buffer, if they are ring provided. While this seems like a bug in itself, it's not really defined behavior to reissue a multishot request directly. It's less efficient to do so as well, and not required to rearm anything like it is for singleshot poll requests.
Cc: stable@vger.kernel.org
Fixes: 6e5aedb9324a ("io_uring/poll: attempt request issue after racy poll wakeup")
Reported-and-tested-by: Olivier Langlois <olivier@trillion01.com>
Link: https://github.com/axboe/liburing/issues/778
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Revision tags: v6.1.5, v6.0.19
# 6e5aedb9 | 10-Jan-2023 | Jens Axboe <axboe@kernel.dk>

io_uring/poll: attempt request issue after racy poll wakeup
If we have multiple requests waiting on the same target poll waitqueue, then it's quite possible to get a request triggered and get disappointed in not being able to make any progress with it. If we race in doing so, we'll potentially leave the poll request on the internal tables, but removed from the waitqueue. That means that any subsequent trigger of the poll waitqueue will not kick that request into action, causing an application to potentially wait for completion of a request that will never happen.
Fix this by adding a new poll return state, IOU_POLL_REISSUE. Rather than have complicated logic for how to re-arm a given type of request, just punt it for a reissue.
While in there, move the 'ret' variable to the only section where it gets used. This avoids confusion about its scope.
Cc: stable@vger.kernel.org
Fixes: eb0089d629ba ("io_uring: single shot poll removal optimisation")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
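A sketch of the poll return states with the new entry (names from the description, values assumed):

    /* sketch: internal results of io_poll_check_events() */
    enum {
        IOU_POLL_DONE                = 0,  /* completed, remove poll */
        IOU_POLL_NO_ACTION           = 1,  /* spurious, keep waiting */
        IOU_POLL_REMOVE_POLL_USE_RES = 2,  /* complete with stored res */
        IOU_POLL_REISSUE             = 3,  /* punt back through issue */
    };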
# febb985c | 09-Jan-2023 | Jens Axboe <axboe@kernel.dk>

io_uring/poll: add hash if ready poll request can't complete inline
If we don't, then we may lose access to it completely, leading to a request leak. This will eventually stall the ring exit process as well.
Cc: stable@vger.kernel.org
Fixes: 49f1c68e048f ("io_uring: optimise submission side poll_refs")
Reported-and-tested-by: syzbot+6c95df01470a47fc3af4@syzkaller.appspotmail.com
Link: https://lore.kernel.org/io-uring/0000000000009f829805f1ce87b2@google.com/
Suggested-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Revision tags: v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11
# 443e5755 | 30-Nov-2022 | Pavel Begunkov <asml.silence@gmail.com>

io_uring: combine poll tw handlers

Merge apoll and regular poll tw handlers, it will help with inlining.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/482e59edb9fc81bd275fdbf486837330fb27120a.1669821213.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
# c3bfb57e | 30-Nov-2022 | Pavel Begunkov <asml.silence@gmail.com>

io_uring: improve poll warning handling

Don't try to complete requests if their refs are broken and we've got a warning; it's much better to drop them and potentially leak than to double free.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/31edf9f96f05d03ab62c114508a231a2dce434cb.1669821213.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
# 047b6aef | 30-Nov-2022 | Pavel Begunkov <asml.silence@gmail.com>

io_uring: remove ctx variable in io_poll_check_events
ctx is only used by io_poll_check_events() for multishot poll CQE posting, don't save it on stack in advance.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/552c1771f8a0e7688afdb4f538ead245f53e80e7.1669821213.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
# 9805fa2d | 30-Nov-2022 | Pavel Begunkov <asml.silence@gmail.com>

io_uring: carve io_poll_check_events fast path
The fast path in io_poll_check_events() is when we have only one (i.e. master) reference. Move all verification, cancellation checks, edge case handling and so on under a common if.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8c21c5d5e027e32dc553705e88796dec79ff6f93.1669821213.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Revision tags: v6.0.10, v5.15.80
# 12ad3d2d | 25-Nov-2022 | Lin Ma <linma@zju.edu.cn>

io_uring/poll: fix poll_refs race with cancelation
There is an interesting race condition of poll_refs which could result in a NULL pointer dereference. The crash trace is like:
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 30781 Comm: syz-executor.2 Not tainted 6.0.0-g493ffd6605b2 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:io_poll_remove_entry io_uring/poll.c:154 [inline]
RIP: 0010:io_poll_remove_entries+0x171/0x5b4 io_uring/poll.c:190
Code: ...
RSP: 0018:ffff88810dfefba0 EFLAGS: 00010202
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000040000
RDX: ffffc900030c4000 RSI: 000000000003ffff RDI: 0000000000040000
RBP: 0000000000000008 R08: ffffffff9764d3dd R09: fffffbfff3836781
R10: fffffbfff3836781 R11: 0000000000000000 R12: 1ffff11003422d60
R13: ffff88801a116b04 R14: ffff88801a116ac0 R15: dffffc0000000000
FS:  00007f9c07497700(0000) GS:ffff88811a600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffb5c00ea98 CR3: 0000000105680005 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 io_apoll_task_func+0x3f/0xa0 io_uring/poll.c:299
 handle_tw_list io_uring/io_uring.c:1037 [inline]
 tctx_task_work+0x37e/0x4f0 io_uring/io_uring.c:1090
 task_work_run+0x13a/0x1b0 kernel/task_work.c:177
 get_signal+0x2402/0x25a0 kernel/signal.c:2635
 arch_do_signal_or_restart+0x3b/0x660 arch/x86/kernel/signal.c:869
 exit_to_user_mode_loop kernel/entry/common.c:166 [inline]
 exit_to_user_mode_prepare+0xc2/0x160 kernel/entry/common.c:201
 __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
 syscall_exit_to_user_mode+0x58/0x160 kernel/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
The root cause for this is a tiny oversight in io_poll_check_events() when run concurrently with the poll cancel routine io_poll_cancel_req().
The interleaving to trigger use-after-free:
   CPU0                                  |   CPU1
   io_apoll_task_func()                  | io_poll_cancel_req()
    io_poll_check_events()               |
     // do while first loop              |
     v = atomic_read(...)                |
     // v = poll_refs = 1                |
     ...                                 |  io_poll_mark_cancelled()
                                         |   atomic_or()
                                         |   // poll_refs = IO_POLL_CANCEL_FLAG | 1
     atomic_sub_return(...)              |
     // poll_refs = IO_POLL_CANCEL_FLAG  |
     // loop continue                    |
                                         |  io_poll_execute()
                                         |   io_poll_get_ownership()
                                         |   // poll_refs = IO_POLL_CANCEL_FLAG | 1
                                         |   // gets the ownership
     v = atomic_read(...)                |
     // poll_refs not change             |
     if (v & IO_POLL_CANCEL_FLAG)        |
      return -ECANCELED;                 |
     // io_poll_check_events return      |
     // will go into                     |
     // io_req_complete_failed(), free req
                                         |  io_apoll_task_func()
                                         |  // also go into io_req_complete_failed()
And the interleaving to trigger the kernel WARNING:
   CPU0                                  |   CPU1
   io_apoll_task_func()                  | io_poll_cancel_req()
    io_poll_check_events()               |
     // do while first loop              |
     v = atomic_read(...)                |
     // v = poll_refs = 1                |
     ...                                 |  io_poll_mark_cancelled()
                                         |   atomic_or()
                                         |   // poll_refs = IO_POLL_CANCEL_FLAG | 1
     atomic_sub_return(...)              |
     // poll_refs = IO_POLL_CANCEL_FLAG  |
     // loop continue                    |
     v = atomic_read(...)                |
     // v = IO_POLL_CANCEL_FLAG          |
                                         |  io_poll_execute()
                                         |   io_poll_get_ownership()
                                         |   // poll_refs = IO_POLL_CANCEL_FLAG | 1
                                         |   // gets the ownership
     WARN_ON_ONCE(!(v & IO_POLL_REF_MASK))
     // v & IO_POLL_REF_MASK = 0, WARN   |
                                         |  io_apoll_task_func()
                                         |  // also go into io_req_complete_failed()
After studying the source code and communicating with Pavel: the atomic poll refs implementation should continue the loop of io_poll_check_events() simply to prevent anything else from grabbing the ownership. Therefore, this patch adds another AND operation to make sure the loop stops once it finds poll_refs exactly equal to IO_POLL_CANCEL_FLAG. Since io_poll_cancel_req() grabs ownership and will finally make its way to io_req_complete_failed(), the req will be reclaimed as expected.
Fixes: aa43477b0402 ("io_uring: poll rework")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: tweak description and code style]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
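A sketch of the loop-exit condition after the fix, with the masks as named in the message and the exact code shape assumed:

    /* sketch: drop the refs we consumed; keep looping only while real
     * references remain.  Masking the result with IO_POLL_REF_MASK is
     * the fix: a leftover IO_POLL_CANCEL_FLAG alone no longer keeps
     * the loop alive and racing into a second completion. */
    static bool io_poll_refs_remain(struct io_kiocb *req, int v)
    {
        return atomic_sub_return(v & IO_POLL_REF_MASK, &req->poll_refs) &
               IO_POLL_REF_MASK;
    }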
# a26a35e9 | 20-Nov-2022 | Pavel Begunkov <asml.silence@gmail.com>

io_uring: make poll refs more robust
poll_refs carries two functions: the first is ownership over the request. The second is notifying io_poll_check_events() that there was an event, but the wake up couldn't grab the ownership, so io_poll_check_events() should retry.
We want to make poll_refs more robust against overflows. Instead of always incrementing it, which covers two purposes with one atomic, check if poll_refs is elevated enough and if so set a retry flag without attempting to grab ownership. The gap between the bias check and the following atomics may seem racy, but we don't need it to be strict. Moreover, there can be at most 4 parallel updates: by the first and the second poll entries, __io_arm_poll_handler() and cancellation. Of those four, only poll wake ups may be executed multiple times, but they're protected by a spin.
Cc: stable@vger.kernel.org
Reported-by: Lin Ma <linma@zju.edu.cn>
Fixes: aa43477b04025 ("io_uring: poll rework")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c762bc31f8683b3270f3587691348a7119ef9c9d.1668963050.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
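A sketch of the slow path described above (bias and flag values assumed):

    #define IO_POLL_REF_BIAS   128      /* assumed threshold */
    #define IO_POLL_RETRY_FLAG BIT(30)  /* assumed bit */

    /* sketch: refs are already elevated, so don't increment; set a
     * retry flag instead so the current owner re-checks for events */
    static bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
    {
        int v = atomic_fetch_or(IO_POLL_RETRY_FLAG, &req->poll_refs);

        if (v & IO_POLL_REF_MASK)   /* someone owns it; they'll retry */
            return false;
        return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
    }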
# 2f389343 | 20-Nov-2022 | Pavel Begunkov <asml.silence@gmail.com>

io_uring: cmpxchg for poll arm refs release
Replace atomically subtracting the ownership reference at the end of arming a poll with a cmpxchg. We try to release ownership by setting 0, assuming that poll_refs didn't change while we were arming. If it did change, we keep the ownership and use it to queue a tw, which is fully capable of processing all events and even tolerates spurious wake ups.
It's a bit more elegant, as we reduce races between setting the cancellation flag and getting refs with this release, and with that we don't have to worry about any kind of underflow. This is not the fastest path for polling anyway, and the performance difference between cmpxchg and atomic dec there is usually negligible.
Cc: stable@vger.kernel.org
Fixes: aa43477b04025 ("io_uring: poll rework")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0c95251624397ea6def568ff040cad2d7926fd51.1668963050.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
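A sketch of the release step at the end of arming, per this description (surrounding code assumed):

    /* sketch: release ownership only if poll_refs is still exactly 1;
     * otherwise something raced in, so keep ownership and process the
     * new events from task_work, which tolerates spurious wake ups */
    if (atomic_cmpxchg(&req->poll_refs, 1, 0) != 1)
        __io_poll_execute(req, 0);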