Revision tags: v6.6.35, v6.6.34, v6.6.33 |
|
#
43cfac7b |
| 01-Jun-2024 |
Jens Axboe <axboe@kernel.dk> |
io_uring: check for non-NULL file pointer in io_file_can_poll()
commit 5fc16fa5f13b3c06fdb959ef262050bd810416a2 upstream.
In earlier kernels, it was possible to trigger a NULL pointer dereference o
io_uring: check for non-NULL file pointer in io_file_can_poll()
commit 5fc16fa5f13b3c06fdb959ef262050bd810416a2 upstream.
In earlier kernels, it was possible to trigger a NULL pointer dereference off the forced async preparation path, if no file had been assigned. The trace leading to that looks as follows:
BUG: kernel NULL pointer dereference, address: 00000000000000b0 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 67 PID: 1633 Comm: buf-ring-invali Not tainted 6.8.0-rc3+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022 RIP: 0010:io_buffer_select+0xc3/0x210 Code: 00 00 48 39 d1 0f 82 ae 00 00 00 48 81 4b 48 00 00 01 00 48 89 73 70 0f b7 50 0c 66 89 53 42 85 ed 0f 85 d2 00 00 00 48 8b 13 <48> 8b 92 b0 00 00 00 48 83 7a 40 00 0f 84 21 01 00 00 4c 8b 20 5b RSP: 0018:ffffb7bec38c7d88 EFLAGS: 00010246 RAX: ffff97af2be61000 RBX: ffff97af234f1700 RCX: 0000000000000040 RDX: 0000000000000000 RSI: ffff97aecfb04820 RDI: ffff97af234f1700 RBP: 0000000000000000 R08: 0000000000200030 R09: 0000000000000020 R10: ffffb7bec38c7dc8 R11: 000000000000c000 R12: ffffb7bec38c7db8 R13: ffff97aecfb05800 R14: ffff97aecfb05800 R15: ffff97af2be5e000 FS: 00007f852f74b740(0000) GS:ffff97b1eeec0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000b0 CR3: 000000016deab005 CR4: 0000000000370ef0 Call Trace: <TASK> ? __die+0x1f/0x60 ? page_fault_oops+0x14d/0x420 ? do_user_addr_fault+0x61/0x6a0 ? exc_page_fault+0x6c/0x150 ? asm_exc_page_fault+0x22/0x30 ? io_buffer_select+0xc3/0x210 __io_import_iovec+0xb5/0x120 io_readv_prep_async+0x36/0x70 io_queue_sqe_fallback+0x20/0x260 io_submit_sqes+0x314/0x630 __do_sys_io_uring_enter+0x339/0xbc0 ? __do_sys_io_uring_register+0x11b/0xc50 ? vm_mmap_pgoff+0xce/0x160 do_syscall_64+0x5f/0x180 entry_SYSCALL_64_after_hwframe+0x46/0x4e RIP: 0033:0x55e0a110a67e Code: ba cc 00 00 00 45 31 c0 44 0f b6 92 d0 00 00 00 31 d2 41 b9 08 00 00 00 41 83 e2 01 41 c1 e2 04 41 09 c2 b8 aa 01 00 00 0f 05 <c3> 90 89 30 eb a9 0f 1f 40 00 48 8b 42 20 8b 00 a8 06 75 af 85 f6
because the request is marked forced ASYNC and has a bad file fd, and hence takes the forced async prep path.
Current kernels with the request async prep cleaned up can no longer hit this issue, but for ease of backporting, let's add this safety check in here too as it really doesn't hurt. For both cases, this will inevitably end with a CQE posted with -EBADF.
Cc: stable@vger.kernel.org Fixes: a76c0b31eef5 ("io_uring: commit non-pollable provided mapped buffers upfront") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.35, v6.6.34, v6.6.33 |
|
#
43cfac7b |
| 01-Jun-2024 |
Jens Axboe <axboe@kernel.dk> |
io_uring: check for non-NULL file pointer in io_file_can_poll()
commit 5fc16fa5f13b3c06fdb959ef262050bd810416a2 upstream.
In earlier kernels, it was possible to trigger a NULL pointer dereference o
io_uring: check for non-NULL file pointer in io_file_can_poll()
commit 5fc16fa5f13b3c06fdb959ef262050bd810416a2 upstream.
In earlier kernels, it was possible to trigger a NULL pointer dereference off the forced async preparation path, if no file had been assigned. The trace leading to that looks as follows:
BUG: kernel NULL pointer dereference, address: 00000000000000b0 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 67 PID: 1633 Comm: buf-ring-invali Not tainted 6.8.0-rc3+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022 RIP: 0010:io_buffer_select+0xc3/0x210 Code: 00 00 48 39 d1 0f 82 ae 00 00 00 48 81 4b 48 00 00 01 00 48 89 73 70 0f b7 50 0c 66 89 53 42 85 ed 0f 85 d2 00 00 00 48 8b 13 <48> 8b 92 b0 00 00 00 48 83 7a 40 00 0f 84 21 01 00 00 4c 8b 20 5b RSP: 0018:ffffb7bec38c7d88 EFLAGS: 00010246 RAX: ffff97af2be61000 RBX: ffff97af234f1700 RCX: 0000000000000040 RDX: 0000000000000000 RSI: ffff97aecfb04820 RDI: ffff97af234f1700 RBP: 0000000000000000 R08: 0000000000200030 R09: 0000000000000020 R10: ffffb7bec38c7dc8 R11: 000000000000c000 R12: ffffb7bec38c7db8 R13: ffff97aecfb05800 R14: ffff97aecfb05800 R15: ffff97af2be5e000 FS: 00007f852f74b740(0000) GS:ffff97b1eeec0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000b0 CR3: 000000016deab005 CR4: 0000000000370ef0 Call Trace: <TASK> ? __die+0x1f/0x60 ? page_fault_oops+0x14d/0x420 ? do_user_addr_fault+0x61/0x6a0 ? exc_page_fault+0x6c/0x150 ? asm_exc_page_fault+0x22/0x30 ? io_buffer_select+0xc3/0x210 __io_import_iovec+0xb5/0x120 io_readv_prep_async+0x36/0x70 io_queue_sqe_fallback+0x20/0x260 io_submit_sqes+0x314/0x630 __do_sys_io_uring_enter+0x339/0xbc0 ? __do_sys_io_uring_register+0x11b/0xc50 ? vm_mmap_pgoff+0xce/0x160 do_syscall_64+0x5f/0x180 entry_SYSCALL_64_after_hwframe+0x46/0x4e RIP: 0033:0x55e0a110a67e Code: ba cc 00 00 00 45 31 c0 44 0f b6 92 d0 00 00 00 31 d2 41 b9 08 00 00 00 41 83 e2 01 41 c1 e2 04 41 09 c2 b8 aa 01 00 00 0f 05 <c3> 90 89 30 eb a9 0f 1f 40 00 48 8b 42 20 8b 00 a8 06 75 af 85 f6
because the request is marked forced ASYNC and has a bad file fd, and hence takes the forced async prep path.
Current kernels with the request async prep cleaned up can no longer hit this issue, but for ease of backporting, let's add this safety check in here too as it really doesn't hurt. For both cases, this will inevitably end with a CQE posted with -EBADF.
Cc: stable@vger.kernel.org Fixes: a76c0b31eef5 ("io_uring: commit non-pollable provided mapped buffers upfront") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24 |
|
#
65938e81 |
| 02-Apr-2024 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: hold io_buffer_list reference over mmap
commit 561e4f9451d65fc2f7eef564e0064373e3019793 upstream.
If we look up the kbuf, ensure that it doesn't get unregistered until after we're do
io_uring/kbuf: hold io_buffer_list reference over mmap
commit 561e4f9451d65fc2f7eef564e0064373e3019793 upstream.
If we look up the kbuf, ensure that it doesn't get unregistered until after we're done with it. Since we're inside mmap, we cannot safely use the io_uring lock. Rely on the fact that we can lookup the buffer list under RCU now and grab a reference to it, preventing it from being unregistered until we're done with it. The lookup returns the io_buffer_list directly with it referenced.
Cc: stable@vger.kernel.org # v6.4+ Fixes: 5cf4f52e6d8a ("io_uring: free io_buffer_list entries via RCU") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.23 |
|
#
b392402d |
| 15-Mar-2024 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: protect io_buffer_list teardown with a reference
commit 6b69c4ab4f685327d9e10caf0d84217ba23a8c4b upstream.
No functional changes in this patch, just in preparation for being able to
io_uring/kbuf: protect io_buffer_list teardown with a reference
commit 6b69c4ab4f685327d9e10caf0d84217ba23a8c4b upstream.
No functional changes in this patch, just in preparation for being able to keep the buffer list alive outside of the ctx->uring_lock.
Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
4c0a5da0 |
| 14-Mar-2024 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: get rid of bl->is_ready
commit 3b80cff5a4d117c53d38ce805823084eaeffbde6 upstream.
Now that xarray is being exclusively used for the buffer_list lookup, this check is no longer needed
io_uring/kbuf: get rid of bl->is_ready
commit 3b80cff5a4d117c53d38ce805823084eaeffbde6 upstream.
Now that xarray is being exclusively used for the buffer_list lookup, this check is no longer needed. Get rid of it and the is_ready member.
Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
d6e03f6d |
| 14-Mar-2024 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: get rid of lower BGID lists
commit 09ab7eff38202159271534d2f5ad45526168f2a5 upstream.
Just rely on the xarray for any kind of bgid. This simplifies things, and it really doesn't brin
io_uring/kbuf: get rid of lower BGID lists
commit 09ab7eff38202159271534d2f5ad45526168f2a5 upstream.
Just rely on the xarray for any kind of bgid. This simplifies things, and it really doesn't bring us much, if anything.
Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5 |
|
#
7e6621b9 |
| 05-Dec-2023 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: check for buffer list readiness after NULL check
[ Upstream commit 9865346b7e8374b57f1c3ccacdc77846c6352ff4 ]
Move the buffer list 'is_ready' check below the validity check for the b
io_uring/kbuf: check for buffer list readiness after NULL check
[ Upstream commit 9865346b7e8374b57f1c3ccacdc77846c6352ff4 ]
Move the buffer list 'is_ready' check below the validity check for the buffer list for a given group.
Fixes: 5cf4f52e6d8a ("io_uring: free io_buffer_list entries via RCU") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
b2173a8b |
| 05-Dec-2023 |
Dan Carpenter <dan.carpenter@linaro.org> |
io_uring/kbuf: Fix an NULL vs IS_ERR() bug in io_alloc_pbuf_ring()
[ Upstream commit e53f7b54b1fdecae897f25002ff0cff04faab228 ]
The io_mem_alloc() function returns error pointers, not NULL. Update
io_uring/kbuf: Fix an NULL vs IS_ERR() bug in io_alloc_pbuf_ring()
[ Upstream commit e53f7b54b1fdecae897f25002ff0cff04faab228 ]
The io_mem_alloc() function returns error pointers, not NULL. Update the check accordingly.
Fixes: b10b73c102a2 ("io_uring/kbuf: recycle freed mapped buffer ring entries") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/5ed268d3-a997-4f64-bd71-47faa92101ab@moroto.mountain Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.4 |
|
#
9e1152a6 |
| 28-Nov-2023 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: recycle freed mapped buffer ring entries
commit b10b73c102a2eab91e1cd62a03d6446f1dfecc64 upstream.
Right now we stash any potentially mmap'ed provided ring buffer range for freeing a
io_uring/kbuf: recycle freed mapped buffer ring entries
commit b10b73c102a2eab91e1cd62a03d6446f1dfecc64 upstream.
Right now we stash any potentially mmap'ed provided ring buffer range for freeing at release time, regardless of when they get unregistered. Since we're keeping track of these ranges anyway, keep track of their registration state as well, and use that to recycle ranges when appropriate rather than always allocate new ones.
The lookup is a basic scan of entries, checking for the best matching free entry.
Fixes: c392cbecd8ec ("io_uring/kbuf: defer release of mapped buffer rings") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.3 |
|
#
7138ebbe |
| 27-Nov-2023 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: defer release of mapped buffer rings
commit c392cbecd8eca4c53f2bf508731257d9d0a21c2d upstream.
If a provided buffer ring is setup with IOU_PBUF_RING_MMAP, then the kernel allocates t
io_uring/kbuf: defer release of mapped buffer rings
commit c392cbecd8eca4c53f2bf508731257d9d0a21c2d upstream.
If a provided buffer ring is setup with IOU_PBUF_RING_MMAP, then the kernel allocates the memory for it and the application is expected to mmap(2) this memory. However, io_uring uses remap_pfn_range() for this operation, so we cannot rely on normal munmap/release on freeing them for us.
Stash an io_buf_free entry away for each of these, if any, and provide a helper to free them post ->release().
Cc: stable@vger.kernel.org Fixes: c56e022c0a27 ("io_uring: add support for user mapped provided buffer ring") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
09f75200 |
| 27-Nov-2023 |
Jens Axboe <axboe@kernel.dk> |
io_uring: free io_buffer_list entries via RCU
commit 5cf4f52e6d8aa2d3b7728f568abbf9d42a3af252 upstream.
mmap_lock nests under uring_lock out of necessity, as we may be doing user copies with uring_
io_uring: free io_buffer_list entries via RCU
commit 5cf4f52e6d8aa2d3b7728f568abbf9d42a3af252 upstream.
mmap_lock nests under uring_lock out of necessity, as we may be doing user copies with uring_lock held. However, for mmap of provided buffer rings, we attempt to grab uring_lock with mmap_lock already held from do_mmap(). This makes lockdep, rightfully, complain:
WARNING: possible circular locking dependency detected 6.7.0-rc1-00009-gff3337ebaf94-dirty #4438 Not tainted ------------------------------------------------------ buf-ring.t/442 is trying to acquire lock: ffff00020e1480a8 (&ctx->uring_lock){+.+.}-{3:3}, at: io_uring_validate_mmap_request.isra.0+0x4c/0x140
but task is already holding lock: ffff0000dc226190 (&mm->mmap_lock){++++}-{3:3}, at: vm_mmap_pgoff+0x124/0x264
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&mm->mmap_lock){++++}-{3:3}: __might_fault+0x90/0xbc io_register_pbuf_ring+0x94/0x488 __arm64_sys_io_uring_register+0x8dc/0x1318 invoke_syscall+0x5c/0x17c el0_svc_common.constprop.0+0x108/0x130 do_el0_svc+0x2c/0x38 el0_svc+0x4c/0x94 el0t_64_sync_handler+0x118/0x124 el0t_64_sync+0x168/0x16c
-> #0 (&ctx->uring_lock){+.+.}-{3:3}: __lock_acquire+0x19a0/0x2d14 lock_acquire+0x2e0/0x44c __mutex_lock+0x118/0x564 mutex_lock_nested+0x20/0x28 io_uring_validate_mmap_request.isra.0+0x4c/0x140 io_uring_mmu_get_unmapped_area+0x3c/0x98 get_unmapped_area+0xa4/0x158 do_mmap+0xec/0x5b4 vm_mmap_pgoff+0x158/0x264 ksys_mmap_pgoff+0x1d4/0x254 __arm64_sys_mmap+0x80/0x9c invoke_syscall+0x5c/0x17c el0_svc_common.constprop.0+0x108/0x130 do_el0_svc+0x2c/0x38 el0_svc+0x4c/0x94 el0t_64_sync_handler+0x118/0x124 el0t_64_sync+0x168/0x16c
From that mmap(2) path, we really just need to ensure that the buffer list doesn't go away from underneath us. For the lower indexed entries, they never go away until the ring is freed and we can always sanely reference those as long as the caller has a file reference. For the higher indexed ones in our xarray, we just need to ensure that the buffer list remains valid while we return the address of it.
Free the higher indexed io_buffer_list entries via RCU. With that we can avoid needing ->uring_lock inside mmap(2), and simply hold the RCU read lock around the buffer list lookup and address check.
To ensure that the arrayed lookup either returns a valid fully formulated entry via RCU lookup, add an 'is_ready' flag that we access with store and release memory ordering. This isn't needed for the xarray lookups, but doesn't hurt either. Since this isn't a fast path, retain it across both types. Similarly, for the allocated array inside the ctx, ensure we use the proper load/acquire as setup could in theory be running in parallel with mmap.
While in there, add a few lockdep checks for documentation purposes.
Cc: stable@vger.kernel.org Fixes: c56e022c0a27 ("io_uring: add support for user mapped provided buffer ring") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6 |
|
#
46484864 |
| 04-Oct-2023 |
Gabriel Krisman Bertazi <krisman@suse.de> |
io_uring/kbuf: Allow the full buffer id space for provided buffers
[ Upstream commit f74c746e476b9dad51448b9a9421aae72b60e25f ]
nbufs tracks the number of buffers and not the last bgid. In 16-bit,
io_uring/kbuf: Allow the full buffer id space for provided buffers
[ Upstream commit f74c746e476b9dad51448b9a9421aae72b60e25f ]
nbufs tracks the number of buffers and not the last bgid. In 16-bit, we have 2^16 valid buffers, but the check mistakenly rejects the last bid. Let's fix it to make the interface consistent with the documentation.
Fixes: ddf0322db79c ("io_uring: add IORING_OP_PROVIDE_BUFFERS") Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Link: https://lore.kernel.org/r/20231005000531.30800-3-krisman@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
de92bc45 |
| 04-Oct-2023 |
Gabriel Krisman Bertazi <krisman@suse.de> |
io_uring/kbuf: Fix check of BID wrapping in provided buffers
[ Upstream commit ab69838e7c75b0edb699c1a8f42752b30333c46f ]
Commit 3851d25c75ed0 ("io_uring: check for rollover of buffer ID when provi
io_uring/kbuf: Fix check of BID wrapping in provided buffers
[ Upstream commit ab69838e7c75b0edb699c1a8f42752b30333c46f ]
Commit 3851d25c75ed0 ("io_uring: check for rollover of buffer ID when providing buffers") introduced a check to prevent wrapping the BID counter when sqe->off is provided, but it's off-by-one too restrictive, rejecting the last possible BID (65534).
i.e., the following fails with -EINVAL.
io_uring_prep_provide_buffers(sqe, addr, size, 0xFFFF, 0, 0);
Fixes: 3851d25c75ed ("io_uring: check for rollover of buffer ID when providing buffers") Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Link: https://lore.kernel.org/r/20231005000531.30800-2-krisman@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
f8024f1f |
| 02-Oct-2023 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: don't allow registered buffer rings on highmem pages
syzbot reports that registering a mapped buffer ring on arm32 can trigger an OOPS. Registered buffer rings have two modes, one of
io_uring/kbuf: don't allow registered buffer rings on highmem pages
syzbot reports that registering a mapped buffer ring on arm32 can trigger an OOPS. Registered buffer rings have two modes, one of them is the application passing in the memory that the buffer ring should reside in. Once those pages are mapped, we use page_address() to get a virtual address. This will obviously fail on highmem pages, which aren't mapped.
Add a check if we have any highmem pages after mapping, and fail the attempt to register a provided buffer ring if we do. This will return the same error as kernels that don't support provided buffer rings to begin with.
Link: https://lore.kernel.org/io-uring/000000000000af635c0606bcb889@google.com/ Fixes: c56e022c0a27 ("io_uring: add support for user mapped provided buffer ring") Cc: stable@vger.kernel.org Reported-by: syzbot+2113e61b8848fa7951d8@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46 |
|
#
99a9e0b8 |
| 16-Aug-2023 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
io_uring: stop calling free_compound_page()
Patch series "Remove _folio_dtor and _folio_order", v2.
This patch (of 13):
folio_put() is the standard way to write this, and it's not appreciably slo
io_uring: stop calling free_compound_page()
Patch series "Remove _folio_dtor and _folio_order", v2.
This patch (of 13):
folio_put() is the standard way to write this, and it's not appreciably slower. This is an enabling patch for removing free_compound_page() entirely.
Link: https://lkml.kernel.org/r/20230816151201.3655946-1-willy@infradead.org Link: https://lkml.kernel.org/r/20230816151201.3655946-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Yanteng Si <siyanteng@loongson.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
Revision tags: v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24 |
|
#
ceac766a |
| 11-Apr-2023 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring/kbuf: remove extra ->buf_ring null check
The kernel test robot complains about __io_remove_buffers().
io_uring/kbuf.c:221 __io_remove_buffers() warn: variable dereferenced before check 'bl
io_uring/kbuf: remove extra ->buf_ring null check
The kernel test robot complains about __io_remove_buffers().
io_uring/kbuf.c:221 __io_remove_buffers() warn: variable dereferenced before check 'bl->buf_ring' (see line 219)
That check is not needed as ->buf_ring will always be set, so we can remove it and so silence the warning.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9a632bbf749d9d911e605255652ce08d18e7d2c6.1681210788.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v6.1.23, v6.1.22, v6.1.21 |
|
#
fcb46c0c |
| 17-Mar-2023 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: disallow mapping a badly aligned provided ring buffer
On at least parisc, we have strict requirements on how we virtually map an address that is shared between the application and the
io_uring/kbuf: disallow mapping a badly aligned provided ring buffer
On at least parisc, we have strict requirements on how we virtually map an address that is shared between the application and the kernel. On these platforms, IOU_PBUF_RING_MMAP should be used when setting up a shared ring buffer for provided buffers. If the application is mapping these pages and asking the kernel to pin+map them as well, then we have no control over what virtual address we get in the kernel.
For that case, do a sanity check if SHM_COLOUR is defined, and disallow the mapping request. The application must fall back to using IOU_PBUF_RING_MMAP for this case, and liburing will do that transparently with the set of helpers that it has.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v6.1.20 |
|
#
c56e022c |
| 14-Mar-2023 |
Jens Axboe <axboe@kernel.dk> |
io_uring: add support for user mapped provided buffer ring
The ring mapped provided buffer rings rely on the application allocating the memory for the ring, and then the kernel will map it. This gen
io_uring: add support for user mapped provided buffer ring
The ring mapped provided buffer rings rely on the application allocating the memory for the ring, and then the kernel will map it. This generally works fine, but runs into issues on some architectures where we need to be able to ensure that the kernel and application virtual address for the ring play nicely together. This at least impacts architectures that set SHM_COLOUR, but potentially also anyone setting SHMLBA.
To use this variant of ring provided buffers, the application need not allocate any memory for the ring. Instead the kernel will do so, and the allocation must subsequently call mmap(2) on the ring with the offset set to:
IORING_OFF_PBUF_RING | (bgid << IORING_OFF_PBUF_SHIFT)
to get a virtual address for the buffer ring. Normally the application would allocate a suitable piece of memory (and correctly aligned) and simply pass that in via io_uring_buf_reg.ring_addr and the kernel would map it.
Outside of the setup differences, the kernel allocate + user mapped provided buffer ring works exactly the same.
Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
81cf17cd |
| 14-Mar-2023 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: rename struct io_uring_buf_reg 'pad' to'flags'
In preparation for allowing flags to be set for registration, rename the padding and use it for that.
Acked-by: Helge Deller <deller@gm
io_uring/kbuf: rename struct io_uring_buf_reg 'pad' to'flags'
In preparation for allowing flags to be set for registration, rename the padding and use it for that.
Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
25a2c188 |
| 14-Mar-2023 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: add buffer_list->is_mapped member
Rather than rely on checking buffer_list->buf_pages or ->buf_nr_pages, add a separate member that tracks if this is a ring mapped provided buffer lis
io_uring/kbuf: add buffer_list->is_mapped member
Rather than rely on checking buffer_list->buf_pages or ->buf_nr_pages, add a separate member that tracks if this is a ring mapped provided buffer list or not.
Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
ba56b632 |
| 14-Mar-2023 |
Jens Axboe <axboe@kernel.dk> |
io_uring/kbuf: move pinning of provided buffer ring into helper
In preparation for allowing the kernel to allocate the provided buffer rings and have the application mmap it instead, abstract out th
io_uring/kbuf: move pinning of provided buffer ring into helper
In preparation for allowing the kernel to allocate the provided buffer rings and have the application mmap it instead, abstract out the current method of pinning and mapping the user allocated ring.
No functional changes intended in this patch.
Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
b4a72c05 |
| 01-Apr-2023 |
Wojciech Lukowicz <wlukowicz01@gmail.com> |
io_uring: fix memory leak when removing provided buffers
When removing provided buffers, io_buffer structs are not being disposed of, leading to a memory leak. They can't be freed individually, beca
io_uring: fix memory leak when removing provided buffers
When removing provided buffers, io_buffer structs are not being disposed of, leading to a memory leak. They can't be freed individually, because they are allocated in page-sized groups. They need to be added to some free list instead, such as io_buffers_cache. All callers already hold the lock protecting it, apart from when destroying buffers, so had to extend the lock there.
Fixes: cc3cec8367cb ("io_uring: speedup provided buffer handling") Signed-off-by: Wojciech Lukowicz <wlukowicz01@gmail.com> Link: https://lore.kernel.org/r/20230401195039.404909-2-wlukowicz01@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
c0921e51 |
| 01-Apr-2023 |
Wojciech Lukowicz <wlukowicz01@gmail.com> |
io_uring: fix return value when removing provided buffers
When a request to remove buffers is submitted, and the given number to be removed is larger than available in the specified buffer group, th
io_uring: fix return value when removing provided buffers
When a request to remove buffers is submitted, and the given number to be removed is larger than available in the specified buffer group, the resulting CQE result will be the number of removed buffers + 1, which is 1 more than it should be.
Previously, the head was part of the list and it got removed after the loop, so the increment was needed. Now, the head is not an element of the list, so the increment shouldn't be there anymore.
Fixes: dbc7d452e7cf ("io_uring: manage provided buffers strictly ordered") Signed-off-by: Wojciech Lukowicz <wlukowicz01@gmail.com> Link: https://lore.kernel.org/r/20230401195039.404909-2-wlukowicz01@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2 |
|
#
48ba0837 |
| 18-Feb-2023 |
Wojciech Lukowicz <wlukowicz01@gmail.com> |
io_uring: fix size calculation when registering buf ring
Using struct_size() to calculate the size of io_uring_buf_ring will sum the size of the struct and of the bufs array. However, the struct's f
io_uring: fix size calculation when registering buf ring
Using struct_size() to calculate the size of io_uring_buf_ring will sum the size of the struct and of the bufs array. However, the struct's fields are overlaid with the array making the calculated size larger than it should be.
When registering a ring with N * PAGE_SIZE / sizeof(struct io_uring_buf) entries, i.e. with fully filled pages, the calculated size will span one more page than it should and io_uring will try to pin the following page. Depending on how the application allocated the ring, it might succeed using an unrelated page or fail returning EFAULT.
The size of the ring should be the product of ring_entries and the size of io_uring_buf, i.e. the size of the bufs array only.
Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers") Signed-off-by: Wojciech Lukowicz <wlukowicz01@gmail.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Link: https://lore.kernel.org/r/20230218184141.70891-1-wlukowicz01@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80 |
|
#
c3b49093 |
| 24-Nov-2022 |
Pavel Begunkov <asml.silence@gmail.com> |
io_uring: don't use complete_post in kbuf
Now we're handling IOPOLL completions more generically, get rid uses of _post() and send requests through the normal path. It may have some extra mertis per
io_uring: don't use complete_post in kbuf
Now we're handling IOPOLL completions more generically, get rid uses of _post() and send requests through the normal path. It may have some extra mertis performance wise, but we don't care much as there is a better interface for selected buffers.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4deded706587f55b006dc33adf0c13cfc3b2319f.1669310258.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|