Revision tags: v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60 |
|
#
270a1c91 |
| 12-Aug-2021 |
Jens Axboe <axboe@kernel.dk> |
block: provide bio_clear_hipri() helper
Any case that turns off REQ_HIPRI must also clear BIO_PERCPU_CACHE, as non-polled IO may complete through hard/soft IRQ and hence isn't safe for our polled bi
block: provide bio_clear_hipri() helper
Any case that turns off REQ_HIPRI must also clear BIO_PERCPU_CACHE, as non-polled IO may complete through hard/soft IRQ and hence isn't safe for our polled bio alloc cache.
Provide a helper that does just that, and use it in the merging code as well if we split a bio and turn off polling.
Fixes: be863b9e4348 ("block: clear BIO_PERCPU_CACHE flag if polling isn't supported") Reported-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
4f1e9630 |
| 01-Aug-2021 |
Chunguang Xu <brookxu@tencent.com> |
blk-throtl: optimize IOPS throttle for large IO scenarios
After patch 54efd50 (block: make generic_make_request handle arbitrarily sized bios), the IO through io-throttle may be larger, and these IO
blk-throtl: optimize IOPS throttle for large IO scenarios
After patch 54efd50 (block: make generic_make_request handle arbitrarily sized bios), the IO through io-throttle may be larger, and these IOs may be further split into more small IOs. However, IOPS throttle does not seem to be aware of this change, which makes the calculation of IOPS of large IOs incomplete, resulting in disk-side IOPS that does not meet expectations. Maybe we should fix this problem.
We can reproduce it by set max_sectors_kb of disk to 128, set blkio.write_iops_throttle to 100, run a dd instance inside blkio and use iostat to watch IOPS:
dd if=/dev/zero of=/dev/sdb bs=1M count=1000 oflag=direct
As a result, without this change the average IOPS is 1995, with this change the IOPS is 98.
Signed-off-by: Chunguang Xu <brookxu@tencent.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/65869aaad05475797d63b4c3fed4f529febe3c26.1627876014.git.brookxu@tencent.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
866663b7 |
| 28-Jul-2021 |
Ming Lei <ming.lei@redhat.com> |
block: return ELEVATOR_DISCARD_MERGE if possible
When merging one bio to request, if they are discard IO and the queue supports multi-range discard, we need to return ELEVATOR_DISCARD_MERGE because
block: return ELEVATOR_DISCARD_MERGE if possible
When merging one bio to request, if they are discard IO and the queue supports multi-range discard, we need to return ELEVATOR_DISCARD_MERGE because both block core and related drivers(nvme, virtio-blk) doesn't handle mixed discard io merge(traditional IO merge together with discard merge) well.
Fix the issue by returning ELEVATOR_DISCARD_MERGE in this situation, so both blk-mq and drivers just need to handle multi-range discard.
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Fixes: 2705dfb20947 ("block: fix discard request merge") Link: https://lore.kernel.org/r/20210729034226.1591070-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49 |
|
#
2705dfb2 |
| 27-Jun-2021 |
Ming Lei <ming.lei@redhat.com> |
block: fix discard request merge
ll_new_hw_segment() is reached only in case of single range discard merge, and we don't have max discard segment size limit actually, so it is wrong to run the follo
block: fix discard request merge
ll_new_hw_segment() is reached only in case of single range discard merge, and we don't have max discard segment size limit actually, so it is wrong to run the following check:
if (req->nr_phys_segments + nr_phys_segs > blk_rq_get_max_segments(req))
it may be always false since req->nr_phys_segments is initialized as one, and bio's segment count is still 1, blk_rq_get_max_segments(reg) is 1 too.
Fix the issue by not doing the check and bypassing the calculation of discard request's nr_phys_segments.
Based on analysis from Wang Shanker.
Cc: Christoph Hellwig <hch@lst.de> Reported-by: Wang Shanker <shankerwangmiao@gmail.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210628023312.1903255-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.13, v5.10.46 |
|
#
fd2ef39c |
| 23-Jun-2021 |
Jan Kara <jack@suse.cz> |
blk: Fix lock inversion between ioc lock and bfqd lock
Lockdep complains about lock inversion between ioc->lock and bfqd->lock:
bfqd -> ioc: put_io_context+0x33/0x90 -> ioc->lock grabbed blk_mq_f
blk: Fix lock inversion between ioc lock and bfqd lock
Lockdep complains about lock inversion between ioc->lock and bfqd->lock:
bfqd -> ioc: put_io_context+0x33/0x90 -> ioc->lock grabbed blk_mq_free_request+0x51/0x140 blk_put_request+0xe/0x10 blk_attempt_req_merge+0x1d/0x30 elv_attempt_insert_merge+0x56/0xa0 blk_mq_sched_try_insert_merge+0x4b/0x60 bfq_insert_requests+0x9e/0x18c0 -> bfqd->lock grabbed blk_mq_sched_insert_requests+0xd6/0x2b0 blk_mq_flush_plug_list+0x154/0x280 blk_finish_plug+0x40/0x60 ext4_writepages+0x696/0x1320 do_writepages+0x1c/0x80 __filemap_fdatawrite_range+0xd7/0x120 sync_file_range+0xac/0xf0
ioc->bfqd: bfq_exit_icq+0xa3/0xe0 -> bfqd->lock grabbed put_io_context_active+0x78/0xb0 -> ioc->lock grabbed exit_io_context+0x48/0x50 do_exit+0x7e9/0xdd0 do_group_exit+0x54/0xc0
To avoid this inversion we change blk_mq_sched_try_insert_merge() to not free the merged request but rather leave that upto the caller similarly to blk_mq_sched_try_merge(). And in bfq_insert_requests() we make sure to free all the merged requests after dropping bfqd->lock.
Fixes: aee69d78dec0 ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler") Reviewed-by: Ming Lei <ming.lei@redhat.com> Acked-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210623093634.27879-3-jack@suse.cz Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16 |
|
#
a958937f |
| 11-Feb-2021 |
David Jeffery <djeffery@redhat.com> |
block: recalculate segment count for multi-segment discards correctly
When a stacked block device inserts a request into another block device using blk_insert_cloned_request, the request's nr_phys_s
block: recalculate segment count for multi-segment discards correctly
When a stacked block device inserts a request into another block device using blk_insert_cloned_request, the request's nr_phys_segments field gets recalculated by a call to blk_recalc_rq_segments in blk_cloned_rq_check_limits. But blk_recalc_rq_segments does not know how to handle multi-segment discards. For disk types which can handle multi-segment discards like nvme, this results in discard requests which claim a single segment when it should report several, triggering a warning in nvme and causing nvme to fail the discard from the invalid state.
WARNING: CPU: 5 PID: 191 at drivers/nvme/host/core.c:700 nvme_setup_discard+0x170/0x1e0 [nvme_core] ... nvme_setup_cmd+0x217/0x270 [nvme_core] nvme_loop_queue_rq+0x51/0x1b0 [nvme_loop] __blk_mq_try_issue_directly+0xe7/0x1b0 blk_mq_request_issue_directly+0x41/0x70 ? blk_account_io_start+0x40/0x50 dm_mq_queue_rq+0x200/0x3e0 blk_mq_dispatch_rq_list+0x10a/0x7d0 ? __sbitmap_queue_get+0x25/0x90 ? elv_rb_del+0x1f/0x30 ? deadline_remove_request+0x55/0xb0 ? dd_dispatch_request+0x181/0x210 __blk_mq_do_dispatch_sched+0x144/0x290 ? bio_attempt_discard_merge+0x134/0x1f0 __blk_mq_sched_dispatch_requests+0x129/0x180 blk_mq_sched_dispatch_requests+0x30/0x60 __blk_mq_run_hw_queue+0x47/0xe0 __blk_mq_delay_run_hw_queue+0x15b/0x170 blk_mq_sched_insert_requests+0x68/0xe0 blk_mq_flush_plug_list+0xf0/0x170 blk_finish_plug+0x36/0x50 xlog_cil_committed+0x19f/0x290 [xfs] xlog_cil_process_committed+0x57/0x80 [xfs] xlog_state_do_callback+0x1e0/0x2a0 [xfs] xlog_ioend_work+0x2f/0x80 [xfs] process_one_work+0x1b6/0x350 worker_thread+0x53/0x3e0 ? process_one_work+0x350/0x350 kthread+0x11b/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30
This patch fixes blk_recalc_rq_segments to be aware of devices which can have multi-segment discards. It calculates the correct discard segment count by counting the number of bio as each discard bio is considered its own segment.
Fixes: 1e739730c5b9 ("block: optionally merge discontiguous discard bios into a single request") Signed-off-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Link: https://lore.kernel.org/r/20210211143807.GA115624@redhat Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.10.15, v5.10.14 |
|
#
309dca30 |
| 24-Jan-2021 |
Christoph Hellwig <hch@lst.de> |
block: store a block_device pointer in struct bio
Replace the gendisk pointer in struct bio with a pointer to the newly improved struct block device. From that the gendisk can be trivially accessed
block: store a block_device pointer in struct bio
Replace the gendisk pointer in struct bio with a pointer to the newly improved struct block device. From that the gendisk can be trivially accessed with an extra indirection, but it also allows to directly look up all information related to partition remapping.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.10 |
|
#
cc29e1bf |
| 26-Nov-2020 |
Jeffle Xu <jefflexu@linux.alibaba.com> |
block: disable iopoll for split bio
iopoll is initially for small size, latency sensitive IO. It doesn't work well for big IO, especially when it needs to be split to multiple bios. In this case, th
block: disable iopoll for split bio
iopoll is initially for small size, latency sensitive IO. It doesn't work well for big IO, especially when it needs to be split to multiple bios. In this case, the returned cookie of __submit_bio_noacct_mq() is indeed the cookie of the last split bio. The completion of *this* last split bio done by iopoll doesn't mean the whole original bio has completed. Callers of iopoll still need to wait for completion of other split bios.
Besides bio splitting may cause more trouble for iopoll which isn't supposed to be used in case of big IO.
iopoll for split bio may cause potential race if CPU migration happens during bio submission. Since the returned cookie is that of the last split bio, polling on the corresponding hardware queue doesn't help complete other split bios, if these split bios are enqueued into different hardware queues. Since interrupts are disabled for polling queues, the completion of these other split bios depends on timeout mechanism, thus causing a potential hang.
iopoll for split bio may also cause hang for sync polling. Currently both the blkdev and iomap-based fs (ext4/xfs, etc) support sync polling in direct IO routine. These routines will submit bio without REQ_NOWAIT flag set, and then start sync polling in current process context. The process may hang in blk_mq_get_tag() if the submitted bio has to be split into multiple bios and can rapidly exhaust the queue depth. The process are waiting for the completion of the previously allocated requests, which should be reaped by the following polling, and thus causing a deadlock.
To avoid these subtle trouble described above, just disable iopoll for split bio and return BLK_QC_T_NONE in this case. The side effect is that non-HIPRI IO also returns BLK_QC_T_NONE now. It should be acceptable since the returned cookie is never used for non-HIPRI IO.
Suggested-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
a54895fa |
| 03-Dec-2020 |
Christoph Hellwig <hch@lst.de> |
block: remove the request_queue to argument request based tracepoints
The request_queue can trivially be derived from the request.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien
block: remove the request_queue to argument request based tracepoints
The request_queue can trivially be derived from the request.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
eb6f7f7c |
| 03-Dec-2020 |
Christoph Hellwig <hch@lst.de> |
block: remove the request_queue argument to the block_split tracepoint
The request_queue can trivially be derived from the bio.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le
block: remove the request_queue argument to the block_split tracepoint
The request_queue can trivially be derived from the bio.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
e8a676d6 |
| 03-Dec-2020 |
Christoph Hellwig <hch@lst.de> |
block: simplify and extend the block_bio_merge tracepoint class
The block_bio_merge tracepoint class can be reused for most bio-based tracepoints. For that it just needs to lose the superfluous q a
block: simplify and extend the block_bio_merge tracepoint class
The block_bio_merge tracepoint class can be reused for most bio-based tracepoints. For that it just needs to lose the superfluous q and rq parameters.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
cb8432d6 |
| 26-Nov-2020 |
Christoph Hellwig <hch@lst.de> |
block: allocate struct hd_struct as part of struct bdev_inode
Allocate hd_struct together with struct block_device to pre-load the lifetime rule changes in preparation of merging the two structures.
block: allocate struct hd_struct as part of struct bdev_inode
Allocate hd_struct together with struct block_device to pre-load the lifetime rule changes in preparation of merging the two structures.
Note that part0 was previously embedded into struct gendisk, but is a separate allocation now, and already points to the block_device instead of the hd_struct. The lifetime of struct gendisk is still controlled by the struct device embedded in the part0 hd_struct.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
87aa69aa |
| 28-Jul-2021 |
Ming Lei <ming.lei@redhat.com> |
block: return ELEVATOR_DISCARD_MERGE if possible
[ Upstream commit 866663b7b52d2da267b28e12eed89ee781b8fed1 ]
When merging one bio to request, if they are discard IO and the queue supports multi-ra
block: return ELEVATOR_DISCARD_MERGE if possible
[ Upstream commit 866663b7b52d2da267b28e12eed89ee781b8fed1 ]
When merging one bio to request, if they are discard IO and the queue supports multi-range discard, we need to return ELEVATOR_DISCARD_MERGE because both block core and related drivers(nvme, virtio-blk) doesn't handle mixed discard io merge(traditional IO merge together with discard merge) well.
Fix the issue by returning ELEVATOR_DISCARD_MERGE in this situation, so both blk-mq and drivers just need to handle multi-range discard.
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Fixes: 2705dfb20947 ("block: fix discard request merge") Link: https://lore.kernel.org/r/20210729034226.1591070-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
591f69d7 |
| 01-Aug-2021 |
Chunguang Xu <brookxu@tencent.com> |
blk-throtl: optimize IOPS throttle for large IO scenarios
[ Upstream commit 4f1e9630afe6332de7286820fedd019f19eac057 ]
After patch 54efd50 (block: make generic_make_request handle arbitrarily sized
blk-throtl: optimize IOPS throttle for large IO scenarios
[ Upstream commit 4f1e9630afe6332de7286820fedd019f19eac057 ]
After patch 54efd50 (block: make generic_make_request handle arbitrarily sized bios), the IO through io-throttle may be larger, and these IOs may be further split into more small IOs. However, IOPS throttle does not seem to be aware of this change, which makes the calculation of IOPS of large IOs incomplete, resulting in disk-side IOPS that does not meet expectations. Maybe we should fix this problem.
We can reproduce it by set max_sectors_kb of disk to 128, set blkio.write_iops_throttle to 100, run a dd instance inside blkio and use iostat to watch IOPS:
dd if=/dev/zero of=/dev/sdb bs=1M count=1000 oflag=direct
As a result, without this change the average IOPS is 1995, with this change the IOPS is 98.
Signed-off-by: Chunguang Xu <brookxu@tencent.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/65869aaad05475797d63b4c3fed4f529febe3c26.1627876014.git.brookxu@tencent.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
1208f10b |
| 27-Jun-2021 |
Ming Lei <ming.lei@redhat.com> |
block: fix discard request merge
[ Upstream commit 2705dfb2094777e405e065105e307074af8965c1 ]
ll_new_hw_segment() is reached only in case of single range discard merge, and we don't have max discar
block: fix discard request merge
[ Upstream commit 2705dfb2094777e405e065105e307074af8965c1 ]
ll_new_hw_segment() is reached only in case of single range discard merge, and we don't have max discard segment size limit actually, so it is wrong to run the following check:
if (req->nr_phys_segments + nr_phys_segs > blk_rq_get_max_segments(req))
it may be always false since req->nr_phys_segments is initialized as one, and bio's segment count is still 1, blk_rq_get_max_segments(reg) is 1 too.
Fix the issue by not doing the check and bypassing the calculation of discard request's nr_phys_segments.
Based on analysis from Wang Shanker.
Cc: Christoph Hellwig <hch@lst.de> Reported-by: Wang Shanker <shankerwangmiao@gmail.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210628023312.1903255-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
fc062d21 |
| 11-Feb-2021 |
David Jeffery <djeffery@redhat.com> |
block: recalculate segment count for multi-segment discards correctly
[ Upstream commit a958937ff166fc60d1c3a721036f6ff41bfa2821 ]
When a stacked block device inserts a request into another block d
block: recalculate segment count for multi-segment discards correctly
[ Upstream commit a958937ff166fc60d1c3a721036f6ff41bfa2821 ]
When a stacked block device inserts a request into another block device using blk_insert_cloned_request, the request's nr_phys_segments field gets recalculated by a call to blk_recalc_rq_segments in blk_cloned_rq_check_limits. But blk_recalc_rq_segments does not know how to handle multi-segment discards. For disk types which can handle multi-segment discards like nvme, this results in discard requests which claim a single segment when it should report several, triggering a warning in nvme and causing nvme to fail the discard from the invalid state.
WARNING: CPU: 5 PID: 191 at drivers/nvme/host/core.c:700 nvme_setup_discard+0x170/0x1e0 [nvme_core] ... nvme_setup_cmd+0x217/0x270 [nvme_core] nvme_loop_queue_rq+0x51/0x1b0 [nvme_loop] __blk_mq_try_issue_directly+0xe7/0x1b0 blk_mq_request_issue_directly+0x41/0x70 ? blk_account_io_start+0x40/0x50 dm_mq_queue_rq+0x200/0x3e0 blk_mq_dispatch_rq_list+0x10a/0x7d0 ? __sbitmap_queue_get+0x25/0x90 ? elv_rb_del+0x1f/0x30 ? deadline_remove_request+0x55/0xb0 ? dd_dispatch_request+0x181/0x210 __blk_mq_do_dispatch_sched+0x144/0x290 ? bio_attempt_discard_merge+0x134/0x1f0 __blk_mq_sched_dispatch_requests+0x129/0x180 blk_mq_sched_dispatch_requests+0x30/0x60 __blk_mq_run_hw_queue+0x47/0xe0 __blk_mq_delay_run_hw_queue+0x15b/0x170 blk_mq_sched_insert_requests+0x68/0xe0 blk_mq_flush_plug_list+0xf0/0x170 blk_finish_plug+0x36/0x50 xlog_cil_committed+0x19f/0x290 [xfs] xlog_cil_process_committed+0x57/0x80 [xfs] xlog_state_do_callback+0x1e0/0x2a0 [xfs] xlog_ioend_work+0x2f/0x80 [xfs] process_one_work+0x1b6/0x350 worker_thread+0x53/0x3e0 ? process_one_work+0x350/0x350 kthread+0x11b/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30
This patch fixes blk_recalc_rq_segments to be aware of devices which can have multi-segment discards. It calculates the correct discard segment count by counting the number of bio as each discard bio is considered its own segment.
Fixes: 1e739730c5b9 ("block: optionally merge discontiguous discard bios into a single request") Signed-off-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Link: https://lore.kernel.org/r/20210211143807.GA115624@redhat Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
3ee16db3 |
| 30-Nov-2020 |
Mike Snitzer <snitzer@redhat.com> |
dm: fix IO splitting
Commit 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting") caused a couple regressions: 1) Using lcm_not_zero() when stacking chunk_s
dm: fix IO splitting
Commit 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting") caused a couple regressions: 1) Using lcm_not_zero() when stacking chunk_sectors was a bug because chunk_sectors must reflect the most limited of all devices in the IO stack. 2) DM targets that set max_io_len but that do _not_ provide an .iterate_devices method no longer had there IO split properly.
And commit 5091cdec56fa ("dm: change max_io_len() to use blk_max_size_offset()") also caused a regression where DM no longer supported varied (per target) IO splitting. The implication being the potential for severely reduced performance for IO stacks that use a DM target like dm-cache to hide performance limitations of a slower device (e.g. one that requires 4K IO splitting).
Coming full circle: Fix all these issues by discontinuing stacking chunk_sectors up using ti->max_io_len in dm_calculate_queue_limits(), add optional chunk_sectors override argument to blk_max_size_offset() and update DM's max_io_len() to pass ti->max_io_len to its blk_max_size_offset() call.
Passing in an optional chunk_sectors override to blk_max_size_offset() allows for code reuse of block's centralized calculation for max IO size based on provided offset and split boundary.
Fixes: 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting") Fixes: 5091cdec56fa ("dm: change max_io_len() to use blk_max_size_offset()") Cc: stable@vger.kernel.org Reported-by: John Dorminy <jdorminy@redhat.com> Reported-by: Bruce Johnston <bjohnsto@redhat.com> Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reviewed-by: John Dorminy <jdorminy@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14 |
|
#
eda5cc99 |
| 06-Oct-2020 |
Christoph Hellwig <hch@lst.de> |
block: move blk_mq_sched_try_merge to blk-merge.c
Move blk_mq_sched_try_merge to blk-merge.c, which allows to mark a lot of the merge infrastructure static there.
Signed-off-by: Christoph Hellwig <
block: move blk_mq_sched_try_merge to blk-merge.c
Move blk_mq_sched_try_merge to blk-merge.c, which allows to mark a lot of the merge infrastructure static there.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62 |
|
#
265600b7 |
| 01-Sep-2020 |
Baolin Wang <baolin.wang@linux.alibaba.com> |
block: Remove a duplicative condition
Remove a duplicative condition to remove below cppcheck warnings:
"warning: Redundant condition: sched_allow_merge. '!A || (A && B)' is equivalent to '!A || B'
block: Remove a duplicative condition
Remove a duplicative condition to remove below cppcheck warnings:
"warning: Redundant condition: sched_allow_merge. '!A || (A && B)' is equivalent to '!A || B' [redundantCondition]"
Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
7d7ca7c5 |
| 27-Aug-2020 |
Baolin Wang <baolin.wang@linux.alibaba.com> |
block: Add a new helper to attempt to merge a bio
There are lots of duplicated code when trying to merge a bio from plug list and sw queue, we can introduce a new helper to attempt to merge a bio, w
block: Add a new helper to attempt to merge a bio
There are lots of duplicated code when trying to merge a bio from plug list and sw queue, we can introduce a new helper to attempt to merge a bio, which can simplify the blk_bio_list_merge() and blk_attempt_plug_merge().
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
bdc6a287 |
| 27-Aug-2020 |
Baolin Wang <baolin.wang@linux.alibaba.com> |
block: Move blk_mq_bio_list_merge() into blk-merge.c
Move the blk_mq_bio_list_merge() into blk-merge.c and rename it as a generic name.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ba
block: Move blk_mq_bio_list_merge() into blk-merge.c
Move the blk_mq_bio_list_merge() into blk-merge.c and rename it as a generic name.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
8e756373 |
| 27-Aug-2020 |
Baolin Wang <baolin.wang@linux.alibaba.com> |
block: Move bio merge related functions into blk-merge.c
It's better to move bio merge related functions into blk-merge.c, which contains all merge related functions.
Reviewed-by: Christoph Hellwig
block: Move bio merge related functions into blk-merge.c
It's better to move bio merge related functions into blk-merge.c, which contains all merge related functions.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57 |
|
#
e4b469c6 |
| 06-Aug-2020 |
Keith Busch <kbusch@kernel.org> |
block: fix get_max_io_size()
A previous commit aligning splits to physical block sizes inadvertently modified one return case such that that it now returns 0 length splits when the number of sectors
block: fix get_max_io_size()
A previous commit aligning splits to physical block sizes inadvertently modified one return case such that that it now returns 0 length splits when the number of sectors doesn't exceed the physical offset. This later hits a BUG in bio_split(). Restore the previous working behavior.
Fixes: 9cc5169cd478b ("block: Improve physical block alignment of split bios") Reported-by: Eric Deal <eric.deal@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Cc: Bart Van Assche <bvanassche@acm.org> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
943b40c8 |
| 17-Aug-2020 |
Ming Lei <ming.lei@redhat.com> |
block: respect queue limit of max discard segment
When queue_max_discard_segments(q) is 1, blk_discard_mergable() will return false for discard request, then normal request merge is applied. However
block: respect queue limit of max discard segment
When queue_max_discard_segments(q) is 1, blk_discard_mergable() will return false for discard request, then normal request merge is applied. However, only queue_max_segments() is checked, so max discard segment limit isn't respected.
Check max discard segment limit in the request merge code for fixing the issue.
Discard request failure of virtio_blk is fixed.
Fixes: 69840466086d ("block: fix the DISCARD request merge") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1 |
|
#
3f649ab7 |
| 03-Jun-2020 |
Kees Cook <keescook@chromium.org> |
treewide: Remove uninitialized_var() usage
Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused vari
treewide: Remove uninitialized_var() usage
Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused variable"). If the compiler thinks it is uninitialized, either simply initialize the variable or make compiler changes.
In preparation for removing[2] the[3] macro[4], remove all remaining needless uses with the following script:
git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \ xargs perl -pi -e \ 's/\buninitialized_var\(([^\)]+)\)/\1/g; s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'
drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid pathological white-space.
No outstanding warnings were found building allmodconfig with GCC 9.3.0 for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64, alpha, and m68k.
[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/ [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/ [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/ [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/
Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5 Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs Signed-off-by: Kees Cook <keescook@chromium.org>
show more ...
|