History log of /openbmc/linux/io_uring/kbuf.c (Results 1 – 25 of 36)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.6.35, v6.6.34, v6.6.33
# 43cfac7b 01-Jun-2024 Jens Axboe <axboe@kernel.dk>

io_uring: check for non-NULL file pointer in io_file_can_poll()

commit 5fc16fa5f13b3c06fdb959ef262050bd810416a2 upstream.

In earlier kernels, it was possible to trigger a NULL pointer
dereference o

io_uring: check for non-NULL file pointer in io_file_can_poll()

commit 5fc16fa5f13b3c06fdb959ef262050bd810416a2 upstream.

In earlier kernels, it was possible to trigger a NULL pointer
dereference off the forced async preparation path, if no file had
been assigned. The trace leading to that looks as follows:

BUG: kernel NULL pointer dereference, address: 00000000000000b0
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 67 PID: 1633 Comm: buf-ring-invali Not tainted 6.8.0-rc3+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022
RIP: 0010:io_buffer_select+0xc3/0x210
Code: 00 00 48 39 d1 0f 82 ae 00 00 00 48 81 4b 48 00 00 01 00 48 89 73 70 0f b7 50 0c 66 89 53 42 85 ed 0f 85 d2 00 00 00 48 8b 13 <48> 8b 92 b0 00 00 00 48 83 7a 40 00 0f 84 21 01 00 00 4c 8b 20 5b
RSP: 0018:ffffb7bec38c7d88 EFLAGS: 00010246
RAX: ffff97af2be61000 RBX: ffff97af234f1700 RCX: 0000000000000040
RDX: 0000000000000000 RSI: ffff97aecfb04820 RDI: ffff97af234f1700
RBP: 0000000000000000 R08: 0000000000200030 R09: 0000000000000020
R10: ffffb7bec38c7dc8 R11: 000000000000c000 R12: ffffb7bec38c7db8
R13: ffff97aecfb05800 R14: ffff97aecfb05800 R15: ffff97af2be5e000
FS: 00007f852f74b740(0000) GS:ffff97b1eeec0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000b0 CR3: 000000016deab005 CR4: 0000000000370ef0
Call Trace:
<TASK>
? __die+0x1f/0x60
? page_fault_oops+0x14d/0x420
? do_user_addr_fault+0x61/0x6a0
? exc_page_fault+0x6c/0x150
? asm_exc_page_fault+0x22/0x30
? io_buffer_select+0xc3/0x210
__io_import_iovec+0xb5/0x120
io_readv_prep_async+0x36/0x70
io_queue_sqe_fallback+0x20/0x260
io_submit_sqes+0x314/0x630
__do_sys_io_uring_enter+0x339/0xbc0
? __do_sys_io_uring_register+0x11b/0xc50
? vm_mmap_pgoff+0xce/0x160
do_syscall_64+0x5f/0x180
entry_SYSCALL_64_after_hwframe+0x46/0x4e
RIP: 0033:0x55e0a110a67e
Code: ba cc 00 00 00 45 31 c0 44 0f b6 92 d0 00 00 00 31 d2 41 b9 08 00 00 00 41 83 e2 01 41 c1 e2 04 41 09 c2 b8 aa 01 00 00 0f 05 <c3> 90 89 30 eb a9 0f 1f 40 00 48 8b 42 20 8b 00 a8 06 75 af 85 f6

because the request is marked forced ASYNC and has a bad file fd, and
hence takes the forced async prep path.

Current kernels with the request async prep cleaned up can no longer hit
this issue, but for ease of backporting, let's add this safety check in
here too as it really doesn't hurt. For both cases, this will inevitably
end with a CQE posted with -EBADF.

Cc: stable@vger.kernel.org
Fixes: a76c0b31eef5 ("io_uring: commit non-pollable provided mapped buffers upfront")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v6.6.35, v6.6.34, v6.6.33
# 43cfac7b 01-Jun-2024 Jens Axboe <axboe@kernel.dk>

io_uring: check for non-NULL file pointer in io_file_can_poll()

commit 5fc16fa5f13b3c06fdb959ef262050bd810416a2 upstream.

In earlier kernels, it was possible to trigger a NULL pointer
dereference o

io_uring: check for non-NULL file pointer in io_file_can_poll()

commit 5fc16fa5f13b3c06fdb959ef262050bd810416a2 upstream.

In earlier kernels, it was possible to trigger a NULL pointer
dereference off the forced async preparation path, if no file had
been assigned. The trace leading to that looks as follows:

BUG: kernel NULL pointer dereference, address: 00000000000000b0
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 67 PID: 1633 Comm: buf-ring-invali Not tainted 6.8.0-rc3+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022
RIP: 0010:io_buffer_select+0xc3/0x210
Code: 00 00 48 39 d1 0f 82 ae 00 00 00 48 81 4b 48 00 00 01 00 48 89 73 70 0f b7 50 0c 66 89 53 42 85 ed 0f 85 d2 00 00 00 48 8b 13 <48> 8b 92 b0 00 00 00 48 83 7a 40 00 0f 84 21 01 00 00 4c 8b 20 5b
RSP: 0018:ffffb7bec38c7d88 EFLAGS: 00010246
RAX: ffff97af2be61000 RBX: ffff97af234f1700 RCX: 0000000000000040
RDX: 0000000000000000 RSI: ffff97aecfb04820 RDI: ffff97af234f1700
RBP: 0000000000000000 R08: 0000000000200030 R09: 0000000000000020
R10: ffffb7bec38c7dc8 R11: 000000000000c000 R12: ffffb7bec38c7db8
R13: ffff97aecfb05800 R14: ffff97aecfb05800 R15: ffff97af2be5e000
FS: 00007f852f74b740(0000) GS:ffff97b1eeec0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000b0 CR3: 000000016deab005 CR4: 0000000000370ef0
Call Trace:
<TASK>
? __die+0x1f/0x60
? page_fault_oops+0x14d/0x420
? do_user_addr_fault+0x61/0x6a0
? exc_page_fault+0x6c/0x150
? asm_exc_page_fault+0x22/0x30
? io_buffer_select+0xc3/0x210
__io_import_iovec+0xb5/0x120
io_readv_prep_async+0x36/0x70
io_queue_sqe_fallback+0x20/0x260
io_submit_sqes+0x314/0x630
__do_sys_io_uring_enter+0x339/0xbc0
? __do_sys_io_uring_register+0x11b/0xc50
? vm_mmap_pgoff+0xce/0x160
do_syscall_64+0x5f/0x180
entry_SYSCALL_64_after_hwframe+0x46/0x4e
RIP: 0033:0x55e0a110a67e
Code: ba cc 00 00 00 45 31 c0 44 0f b6 92 d0 00 00 00 31 d2 41 b9 08 00 00 00 41 83 e2 01 41 c1 e2 04 41 09 c2 b8 aa 01 00 00 0f 05 <c3> 90 89 30 eb a9 0f 1f 40 00 48 8b 42 20 8b 00 a8 06 75 af 85 f6

because the request is marked forced ASYNC and has a bad file fd, and
hence takes the forced async prep path.

Current kernels with the request async prep cleaned up can no longer hit
this issue, but for ease of backporting, let's add this safety check in
here too as it really doesn't hurt. For both cases, this will inevitably
end with a CQE posted with -EBADF.

Cc: stable@vger.kernel.org
Fixes: a76c0b31eef5 ("io_uring: commit non-pollable provided mapped buffers upfront")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24
# 65938e81 02-Apr-2024 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: hold io_buffer_list reference over mmap

commit 561e4f9451d65fc2f7eef564e0064373e3019793 upstream.

If we look up the kbuf, ensure that it doesn't get unregistered until
after we're do

io_uring/kbuf: hold io_buffer_list reference over mmap

commit 561e4f9451d65fc2f7eef564e0064373e3019793 upstream.

If we look up the kbuf, ensure that it doesn't get unregistered until
after we're done with it. Since we're inside mmap, we cannot safely use
the io_uring lock. Rely on the fact that we can lookup the buffer list
under RCU now and grab a reference to it, preventing it from being
unregistered until we're done with it. The lookup returns the
io_buffer_list directly with it referenced.

Cc: stable@vger.kernel.org # v6.4+
Fixes: 5cf4f52e6d8a ("io_uring: free io_buffer_list entries via RCU")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v6.6.23
# b392402d 15-Mar-2024 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: protect io_buffer_list teardown with a reference

commit 6b69c4ab4f685327d9e10caf0d84217ba23a8c4b upstream.

No functional changes in this patch, just in preparation for being able
to

io_uring/kbuf: protect io_buffer_list teardown with a reference

commit 6b69c4ab4f685327d9e10caf0d84217ba23a8c4b upstream.

No functional changes in this patch, just in preparation for being able
to keep the buffer list alive outside of the ctx->uring_lock.

Cc: stable@vger.kernel.org # v6.4+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 4c0a5da0 14-Mar-2024 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: get rid of bl->is_ready

commit 3b80cff5a4d117c53d38ce805823084eaeffbde6 upstream.

Now that xarray is being exclusively used for the buffer_list lookup,
this check is no longer needed

io_uring/kbuf: get rid of bl->is_ready

commit 3b80cff5a4d117c53d38ce805823084eaeffbde6 upstream.

Now that xarray is being exclusively used for the buffer_list lookup,
this check is no longer needed. Get rid of it and the is_ready member.

Cc: stable@vger.kernel.org # v6.4+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# d6e03f6d 14-Mar-2024 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: get rid of lower BGID lists

commit 09ab7eff38202159271534d2f5ad45526168f2a5 upstream.

Just rely on the xarray for any kind of bgid. This simplifies things, and
it really doesn't brin

io_uring/kbuf: get rid of lower BGID lists

commit 09ab7eff38202159271534d2f5ad45526168f2a5 upstream.

Just rely on the xarray for any kind of bgid. This simplifies things, and
it really doesn't bring us much, if anything.

Cc: stable@vger.kernel.org # v6.4+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5
# 7e6621b9 05-Dec-2023 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: check for buffer list readiness after NULL check

[ Upstream commit 9865346b7e8374b57f1c3ccacdc77846c6352ff4 ]

Move the buffer list 'is_ready' check below the validity check for
the b

io_uring/kbuf: check for buffer list readiness after NULL check

[ Upstream commit 9865346b7e8374b57f1c3ccacdc77846c6352ff4 ]

Move the buffer list 'is_ready' check below the validity check for
the buffer list for a given group.

Fixes: 5cf4f52e6d8a ("io_uring: free io_buffer_list entries via RCU")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# b2173a8b 05-Dec-2023 Dan Carpenter <dan.carpenter@linaro.org>

io_uring/kbuf: Fix an NULL vs IS_ERR() bug in io_alloc_pbuf_ring()

[ Upstream commit e53f7b54b1fdecae897f25002ff0cff04faab228 ]

The io_mem_alloc() function returns error pointers, not NULL. Update

io_uring/kbuf: Fix an NULL vs IS_ERR() bug in io_alloc_pbuf_ring()

[ Upstream commit e53f7b54b1fdecae897f25002ff0cff04faab228 ]

The io_mem_alloc() function returns error pointers, not NULL. Update
the check accordingly.

Fixes: b10b73c102a2 ("io_uring/kbuf: recycle freed mapped buffer ring entries")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/5ed268d3-a997-4f64-bd71-47faa92101ab@moroto.mountain
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


Revision tags: v6.6.4
# 9e1152a6 28-Nov-2023 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: recycle freed mapped buffer ring entries

commit b10b73c102a2eab91e1cd62a03d6446f1dfecc64 upstream.

Right now we stash any potentially mmap'ed provided ring buffer range
for freeing a

io_uring/kbuf: recycle freed mapped buffer ring entries

commit b10b73c102a2eab91e1cd62a03d6446f1dfecc64 upstream.

Right now we stash any potentially mmap'ed provided ring buffer range
for freeing at release time, regardless of when they get unregistered.
Since we're keeping track of these ranges anyway, keep track of their
registration state as well, and use that to recycle ranges when
appropriate rather than always allocate new ones.

The lookup is a basic scan of entries, checking for the best matching
free entry.

Fixes: c392cbecd8ec ("io_uring/kbuf: defer release of mapped buffer rings")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v6.6.3
# 7138ebbe 27-Nov-2023 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: defer release of mapped buffer rings

commit c392cbecd8eca4c53f2bf508731257d9d0a21c2d upstream.

If a provided buffer ring is setup with IOU_PBUF_RING_MMAP, then the
kernel allocates t

io_uring/kbuf: defer release of mapped buffer rings

commit c392cbecd8eca4c53f2bf508731257d9d0a21c2d upstream.

If a provided buffer ring is setup with IOU_PBUF_RING_MMAP, then the
kernel allocates the memory for it and the application is expected to
mmap(2) this memory. However, io_uring uses remap_pfn_range() for this
operation, so we cannot rely on normal munmap/release on freeing them
for us.

Stash an io_buf_free entry away for each of these, if any, and provide
a helper to free them post ->release().

Cc: stable@vger.kernel.org
Fixes: c56e022c0a27 ("io_uring: add support for user mapped provided buffer ring")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 09f75200 27-Nov-2023 Jens Axboe <axboe@kernel.dk>

io_uring: free io_buffer_list entries via RCU

commit 5cf4f52e6d8aa2d3b7728f568abbf9d42a3af252 upstream.

mmap_lock nests under uring_lock out of necessity, as we may be doing
user copies with uring_

io_uring: free io_buffer_list entries via RCU

commit 5cf4f52e6d8aa2d3b7728f568abbf9d42a3af252 upstream.

mmap_lock nests under uring_lock out of necessity, as we may be doing
user copies with uring_lock held. However, for mmap of provided buffer
rings, we attempt to grab uring_lock with mmap_lock already held from
do_mmap(). This makes lockdep, rightfully, complain:

WARNING: possible circular locking dependency detected
6.7.0-rc1-00009-gff3337ebaf94-dirty #4438 Not tainted
------------------------------------------------------
buf-ring.t/442 is trying to acquire lock:
ffff00020e1480a8 (&ctx->uring_lock){+.+.}-{3:3}, at: io_uring_validate_mmap_request.isra.0+0x4c/0x140

but task is already holding lock:
ffff0000dc226190 (&mm->mmap_lock){++++}-{3:3}, at: vm_mmap_pgoff+0x124/0x264

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&mm->mmap_lock){++++}-{3:3}:
__might_fault+0x90/0xbc
io_register_pbuf_ring+0x94/0x488
__arm64_sys_io_uring_register+0x8dc/0x1318
invoke_syscall+0x5c/0x17c
el0_svc_common.constprop.0+0x108/0x130
do_el0_svc+0x2c/0x38
el0_svc+0x4c/0x94
el0t_64_sync_handler+0x118/0x124
el0t_64_sync+0x168/0x16c

-> #0 (&ctx->uring_lock){+.+.}-{3:3}:
__lock_acquire+0x19a0/0x2d14
lock_acquire+0x2e0/0x44c
__mutex_lock+0x118/0x564
mutex_lock_nested+0x20/0x28
io_uring_validate_mmap_request.isra.0+0x4c/0x140
io_uring_mmu_get_unmapped_area+0x3c/0x98
get_unmapped_area+0xa4/0x158
do_mmap+0xec/0x5b4
vm_mmap_pgoff+0x158/0x264
ksys_mmap_pgoff+0x1d4/0x254
__arm64_sys_mmap+0x80/0x9c
invoke_syscall+0x5c/0x17c
el0_svc_common.constprop.0+0x108/0x130
do_el0_svc+0x2c/0x38
el0_svc+0x4c/0x94
el0t_64_sync_handler+0x118/0x124
el0t_64_sync+0x168/0x16c

From that mmap(2) path, we really just need to ensure that the buffer
list doesn't go away from underneath us. For the lower indexed entries,
they never go away until the ring is freed and we can always sanely
reference those as long as the caller has a file reference. For the
higher indexed ones in our xarray, we just need to ensure that the
buffer list remains valid while we return the address of it.

Free the higher indexed io_buffer_list entries via RCU. With that we can
avoid needing ->uring_lock inside mmap(2), and simply hold the RCU read
lock around the buffer list lookup and address check.

To ensure that the arrayed lookup either returns a valid fully formulated
entry via RCU lookup, add an 'is_ready' flag that we access with store
and release memory ordering. This isn't needed for the xarray lookups,
but doesn't hurt either. Since this isn't a fast path, retain it across
both types. Similarly, for the allocated array inside the ctx, ensure
we use the proper load/acquire as setup could in theory be running in
parallel with mmap.

While in there, add a few lockdep checks for documentation purposes.

Cc: stable@vger.kernel.org
Fixes: c56e022c0a27 ("io_uring: add support for user mapped provided buffer ring")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6
# 46484864 04-Oct-2023 Gabriel Krisman Bertazi <krisman@suse.de>

io_uring/kbuf: Allow the full buffer id space for provided buffers

[ Upstream commit f74c746e476b9dad51448b9a9421aae72b60e25f ]

nbufs tracks the number of buffers and not the last bgid. In 16-bit,

io_uring/kbuf: Allow the full buffer id space for provided buffers

[ Upstream commit f74c746e476b9dad51448b9a9421aae72b60e25f ]

nbufs tracks the number of buffers and not the last bgid. In 16-bit, we
have 2^16 valid buffers, but the check mistakenly rejects the last
bid. Let's fix it to make the interface consistent with the
documentation.

Fixes: ddf0322db79c ("io_uring: add IORING_OP_PROVIDE_BUFFERS")
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20231005000531.30800-3-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# de92bc45 04-Oct-2023 Gabriel Krisman Bertazi <krisman@suse.de>

io_uring/kbuf: Fix check of BID wrapping in provided buffers

[ Upstream commit ab69838e7c75b0edb699c1a8f42752b30333c46f ]

Commit 3851d25c75ed0 ("io_uring: check for rollover of buffer ID when
provi

io_uring/kbuf: Fix check of BID wrapping in provided buffers

[ Upstream commit ab69838e7c75b0edb699c1a8f42752b30333c46f ]

Commit 3851d25c75ed0 ("io_uring: check for rollover of buffer ID when
providing buffers") introduced a check to prevent wrapping the BID
counter when sqe->off is provided, but it's off-by-one too
restrictive, rejecting the last possible BID (65534).

i.e., the following fails with -EINVAL.

io_uring_prep_provide_buffers(sqe, addr, size, 0xFFFF, 0, 0);

Fixes: 3851d25c75ed ("io_uring: check for rollover of buffer ID when providing buffers")
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20231005000531.30800-2-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# f8024f1f 02-Oct-2023 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: don't allow registered buffer rings on highmem pages

syzbot reports that registering a mapped buffer ring on arm32 can
trigger an OOPS. Registered buffer rings have two modes, one of

io_uring/kbuf: don't allow registered buffer rings on highmem pages

syzbot reports that registering a mapped buffer ring on arm32 can
trigger an OOPS. Registered buffer rings have two modes, one of them
is the application passing in the memory that the buffer ring should
reside in. Once those pages are mapped, we use page_address() to get
a virtual address. This will obviously fail on highmem pages, which
aren't mapped.

Add a check if we have any highmem pages after mapping, and fail the
attempt to register a provided buffer ring if we do. This will return
the same error as kernels that don't support provided buffer rings to
begin with.

Link: https://lore.kernel.org/io-uring/000000000000af635c0606bcb889@google.com/
Fixes: c56e022c0a27 ("io_uring: add support for user mapped provided buffer ring")
Cc: stable@vger.kernel.org
Reported-by: syzbot+2113e61b8848fa7951d8@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


Revision tags: v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46
# 99a9e0b8 16-Aug-2023 Matthew Wilcox (Oracle) <willy@infradead.org>

io_uring: stop calling free_compound_page()

Patch series "Remove _folio_dtor and _folio_order", v2.


This patch (of 13):

folio_put() is the standard way to write this, and it's not appreciably
slo

io_uring: stop calling free_compound_page()

Patch series "Remove _folio_dtor and _folio_order", v2.


This patch (of 13):

folio_put() is the standard way to write this, and it's not appreciably
slower. This is an enabling patch for removing free_compound_page()
entirely.

Link: https://lkml.kernel.org/r/20230816151201.3655946-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20230816151201.3655946-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


Revision tags: v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24
# ceac766a 11-Apr-2023 Pavel Begunkov <asml.silence@gmail.com>

io_uring/kbuf: remove extra ->buf_ring null check

The kernel test robot complains about __io_remove_buffers().

io_uring/kbuf.c:221 __io_remove_buffers() warn: variable dereferenced
before check 'bl

io_uring/kbuf: remove extra ->buf_ring null check

The kernel test robot complains about __io_remove_buffers().

io_uring/kbuf.c:221 __io_remove_buffers() warn: variable dereferenced
before check 'bl->buf_ring' (see line 219)

That check is not needed as ->buf_ring will always be set, so we can
remove it and so silence the warning.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9a632bbf749d9d911e605255652ce08d18e7d2c6.1681210788.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


Revision tags: v6.1.23, v6.1.22, v6.1.21
# fcb46c0c 17-Mar-2023 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: disallow mapping a badly aligned provided ring buffer

On at least parisc, we have strict requirements on how we virtually map
an address that is shared between the application and the

io_uring/kbuf: disallow mapping a badly aligned provided ring buffer

On at least parisc, we have strict requirements on how we virtually map
an address that is shared between the application and the kernel. On
these platforms, IOU_PBUF_RING_MMAP should be used when setting up a
shared ring buffer for provided buffers. If the application is mapping
these pages and asking the kernel to pin+map them as well, then we have
no control over what virtual address we get in the kernel.

For that case, do a sanity check if SHM_COLOUR is defined, and disallow
the mapping request. The application must fall back to using
IOU_PBUF_RING_MMAP for this case, and liburing will do that transparently
with the set of helpers that it has.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


Revision tags: v6.1.20
# c56e022c 14-Mar-2023 Jens Axboe <axboe@kernel.dk>

io_uring: add support for user mapped provided buffer ring

The ring mapped provided buffer rings rely on the application allocating
the memory for the ring, and then the kernel will map it. This gen

io_uring: add support for user mapped provided buffer ring

The ring mapped provided buffer rings rely on the application allocating
the memory for the ring, and then the kernel will map it. This generally
works fine, but runs into issues on some architectures where we need
to be able to ensure that the kernel and application virtual address for
the ring play nicely together. This at least impacts architectures that
set SHM_COLOUR, but potentially also anyone setting SHMLBA.

To use this variant of ring provided buffers, the application need not
allocate any memory for the ring. Instead the kernel will do so, and
the allocation must subsequently call mmap(2) on the ring with the
offset set to:

IORING_OFF_PBUF_RING | (bgid << IORING_OFF_PBUF_SHIFT)

to get a virtual address for the buffer ring. Normally the application
would allocate a suitable piece of memory (and correctly aligned) and
simply pass that in via io_uring_buf_reg.ring_addr and the kernel would
map it.

Outside of the setup differences, the kernel allocate + user mapped
provided buffer ring works exactly the same.

Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# 81cf17cd 14-Mar-2023 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: rename struct io_uring_buf_reg 'pad' to'flags'

In preparation for allowing flags to be set for registration, rename
the padding and use it for that.

Acked-by: Helge Deller <deller@gm

io_uring/kbuf: rename struct io_uring_buf_reg 'pad' to'flags'

In preparation for allowing flags to be set for registration, rename
the padding and use it for that.

Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# 25a2c188 14-Mar-2023 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: add buffer_list->is_mapped member

Rather than rely on checking buffer_list->buf_pages or ->buf_nr_pages,
add a separate member that tracks if this is a ring mapped provided
buffer lis

io_uring/kbuf: add buffer_list->is_mapped member

Rather than rely on checking buffer_list->buf_pages or ->buf_nr_pages,
add a separate member that tracks if this is a ring mapped provided
buffer list or not.

Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# ba56b632 14-Mar-2023 Jens Axboe <axboe@kernel.dk>

io_uring/kbuf: move pinning of provided buffer ring into helper

In preparation for allowing the kernel to allocate the provided buffer
rings and have the application mmap it instead, abstract out th

io_uring/kbuf: move pinning of provided buffer ring into helper

In preparation for allowing the kernel to allocate the provided buffer
rings and have the application mmap it instead, abstract out the
current method of pinning and mapping the user allocated ring.

No functional changes intended in this patch.

Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# b4a72c05 01-Apr-2023 Wojciech Lukowicz <wlukowicz01@gmail.com>

io_uring: fix memory leak when removing provided buffers

When removing provided buffers, io_buffer structs are not being disposed
of, leading to a memory leak. They can't be freed individually, beca

io_uring: fix memory leak when removing provided buffers

When removing provided buffers, io_buffer structs are not being disposed
of, leading to a memory leak. They can't be freed individually, because
they are allocated in page-sized groups. They need to be added to some
free list instead, such as io_buffers_cache. All callers already hold
the lock protecting it, apart from when destroying buffers, so had to
extend the lock there.

Fixes: cc3cec8367cb ("io_uring: speedup provided buffer handling")
Signed-off-by: Wojciech Lukowicz <wlukowicz01@gmail.com>
Link: https://lore.kernel.org/r/20230401195039.404909-2-wlukowicz01@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# c0921e51 01-Apr-2023 Wojciech Lukowicz <wlukowicz01@gmail.com>

io_uring: fix return value when removing provided buffers

When a request to remove buffers is submitted, and the given number to be
removed is larger than available in the specified buffer group, th

io_uring: fix return value when removing provided buffers

When a request to remove buffers is submitted, and the given number to be
removed is larger than available in the specified buffer group, the
resulting CQE result will be the number of removed buffers + 1, which is
1 more than it should be.

Previously, the head was part of the list and it got removed after the
loop, so the increment was needed. Now, the head is not an element of
the list, so the increment shouldn't be there anymore.

Fixes: dbc7d452e7cf ("io_uring: manage provided buffers strictly ordered")
Signed-off-by: Wojciech Lukowicz <wlukowicz01@gmail.com>
Link: https://lore.kernel.org/r/20230401195039.404909-2-wlukowicz01@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


Revision tags: v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2
# 48ba0837 18-Feb-2023 Wojciech Lukowicz <wlukowicz01@gmail.com>

io_uring: fix size calculation when registering buf ring

Using struct_size() to calculate the size of io_uring_buf_ring will sum
the size of the struct and of the bufs array. However, the struct's f

io_uring: fix size calculation when registering buf ring

Using struct_size() to calculate the size of io_uring_buf_ring will sum
the size of the struct and of the bufs array. However, the struct's fields
are overlaid with the array making the calculated size larger than it
should be.

When registering a ring with N * PAGE_SIZE / sizeof(struct io_uring_buf)
entries, i.e. with fully filled pages, the calculated size will span one
more page than it should and io_uring will try to pin the following page.
Depending on how the application allocated the ring, it might succeed
using an unrelated page or fail returning EFAULT.

The size of the ring should be the product of ring_entries and the size
of io_uring_buf, i.e. the size of the bufs array only.

Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Wojciech Lukowicz <wlukowicz01@gmail.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20230218184141.70891-1-wlukowicz01@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


Revision tags: v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80
# c3b49093 24-Nov-2022 Pavel Begunkov <asml.silence@gmail.com>

io_uring: don't use complete_post in kbuf

Now we're handling IOPOLL completions more generically, get rid uses of
_post() and send requests through the normal path. It may have some
extra mertis per

io_uring: don't use complete_post in kbuf

Now we're handling IOPOLL completions more generically, get rid uses of
_post() and send requests through the normal path. It may have some
extra mertis performance wise, but we don't care much as there is a
better interface for selected buffers.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4deded706587f55b006dc33adf0c13cfc3b2319f.1669310258.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


12