Revision tags: v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47 |
|
#
efef739d |
| 14-Jun-2022 |
Christoph Hellwig <hch@lst.de> |
block: fold blk_max_size_offset into get_max_io_size
Now that blk_max_size_offset has a single caller left, fold it into that and clean up the naming convention for the local variables there.
Signe
block: fold blk_max_size_offset into get_max_io_size
Now that blk_max_size_offset has a single caller left, fold it into that and clean up the naming convention for the local variables there.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Pankaj Raghav <p.raghav@samsung.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220614090934.570632-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
84613bed |
| 14-Jun-2022 |
Christoph Hellwig <hch@lst.de> |
block: cleanup variable naming in get_max_io_size
get_max_io_size has a very odd choice of variables names and initialization patterns. Switch to more descriptive names and more clear initializatio
block: cleanup variable naming in get_max_io_size
get_max_io_size has a very odd choice of variables names and initialization patterns. Switch to more descriptive names and more clear initialization of them.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220614090934.570632-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
c8875190 |
| 14-Jun-2022 |
Christoph Hellwig <hch@lst.de> |
block: open code blk_max_size_offset in blk_rq_get_max_sectors
blk_rq_get_max_sectors always uses q->limits.chunk_sectors as the chunk_sectors argument, and already checks for max_sectors through th
block: open code blk_max_size_offset in blk_rq_get_max_sectors
blk_rq_get_max_sectors always uses q->limits.chunk_sectors as the chunk_sectors argument, and already checks for max_sectors through the call to blk_queue_get_max_sectors. That means much of blk_max_size_offset is not needed and open coding it simplifies the code.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220614090934.570632-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
67927d22 |
| 10-Jun-2022 |
Keith Busch <kbusch@kernel.org> |
block/merge: count bytes instead of sectors
Individual bv_len's may not be a sector size.
Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Joh
block/merge: count bytes instead of sectors
Individual bv_len's may not be a sector size.
Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220610195830.3574005-7-kbusch@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29 |
|
#
6b2b0459 |
| 14-Mar-2022 |
Tejun Heo <tj@kernel.org> |
block: don't merge across cgroup boundaries if blkcg is enabled
blk-iocost and iolatency are cgroup aware rq-qos policies but they didn't disable merges across different cgroups. This obviously can
block: don't merge across cgroup boundaries if blkcg is enabled
blk-iocost and iolatency are cgroup aware rq-qos policies but they didn't disable merges across different cgroups. This obviously can lead to accounting and control errors but more importantly to priority inversions - e.g. an IO which belongs to a higher priority cgroup or IO class may end up getting throttled incorrectly because it gets merged to an IO issued from a low priority cgroup.
Fix it by adding blk_cgroup_mergeable() which is called from merge paths and rejects cross-cgroup and cross-issue_as_root merges.
Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: d70675121546 ("block: introduce blk-iolatency io controller") Cc: stable@vger.kernel.org # v4.19+ Cc: Josef Bacik <jbacik@fb.com> Link: https://lore.kernel.org/r/Yi/eE/6zFNyWJ+qd@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
5b205071 |
| 11-Mar-2022 |
Jens Axboe <axboe@kernel.dk> |
block: ensure plug merging checks the correct queue at least once
Song reports that a RAID rebuild workload runs much slower recently, and it is seeing a lot less merging than it did previously. The
block: ensure plug merging checks the correct queue at least once
Song reports that a RAID rebuild workload runs much slower recently, and it is seeing a lot less merging than it did previously. The reason is that a previous commit reduced the amount of work we do for plug merging. RAID rebuild interleaves requests between disks, so a last-entry check in plug merging always misses a merge opportunity since we always find a different disk than what we are looking for.
Modify the logic such that it's still a one-hit cache, but ensure that we check enough to find the right target before giving up.
Fixes: d38a9c04c0d5 ("block: only check previous entry for plug merge attempt") Reported-and-tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.28, v5.15.27 |
|
#
c75e707f |
| 04-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
block: remove the per-bio/request write hint
With the NVMe support for this gone, there are no consumers of these hints left, so remove them.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: htt
block: remove the per-bio/request write hint
With the NVMe support for this gone, there are no consumers of these hints left, so remove them.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220304175556.407719-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.26, v5.15.25, v5.15.24, v5.15.23 |
|
#
73bd66d9 |
| 09-Feb-2022 |
Christoph Hellwig <hch@lst.de> |
scsi: block: Remove REQ_OP_WRITE_SAME support
No more users of REQ_OP_WRITE_SAME or drivers implementing it are left, so remove the infrastructure.
[mkp: fold in and tweak sysfs reporting fix]
Lin
scsi: block: Remove REQ_OP_WRITE_SAME support
No more users of REQ_OP_WRITE_SAME or drivers implementing it are left, so remove the infrastructure.
[mkp: fold in and tweak sysfs reporting fix]
Link: https://lore.kernel.org/r/20220209082828.2629273-8-hch@lst.de Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
show more ...
|
#
9f5ede3c |
| 15-Feb-2022 |
Ming Lei <ming.lei@redhat.com> |
block: throttle split bio in case of iops limit
Commit 111be8839817 ("block-throttle: avoid double charge") marks bio as BIO_THROTTLED unconditionally if __blk_throtl_bio() is called on this bio, th
block: throttle split bio in case of iops limit
Commit 111be8839817 ("block-throttle: avoid double charge") marks bio as BIO_THROTTLED unconditionally if __blk_throtl_bio() is called on this bio, then this bio won't be called into __blk_throtl_bio() any more. This way is to avoid double charge in case of bio splitting. It is reasonable for read/write throughput limit, but not reasonable for IOPS limit because block layer provides io accounting against split bio.
Chunguang Xu has already observed this issue and fixed it in commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios"). However, that patch only covers bio splitting in __blk_queue_split(), and we have other kind of bio splitting, such as bio_split() & submit_bio_noacct() and other ways.
This patch tries to fix the issue in one generic way by always charging the bio for iops limit in blk_throtl_bio(). This way is reasonable: re-submission & fast-cloned bio is charged if it is submitted to same disk/queue, and BIO_THROTTLED will be cleared if bio->bi_bdev is changed.
This new approach can get much more smooth/stable iops limit compared with commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios") since that commit can't throttle current split bios actually.
Also this way won't cause new double bio iops charge in blk_throtl_dispatch_work_fn() in which blk_throtl_bio() won't be called any more.
Reported-by: Ning Li <lining2020x@163.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Chunguang Xu <brookxu@tencent.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220216044514.2903784-7-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6 |
|
#
79bb1dbd |
| 26-Nov-2021 |
Christoph Hellwig <hch@lst.de> |
block: don't check ->rq_disk in merges
There is a 1:1 relationship between request_queues and gendisks now, so no need for these extra checks.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed
block: don't check ->rq_disk in merges
There is a 1:1 relationship between request_queues and gendisks now, so no need for these extra checks.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20211126121802.2090656-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.5 |
|
#
82d981d4 |
| 23-Nov-2021 |
Christoph Hellwig <hch@lst.de> |
block: don't include <linux/part_stat.h> in blk.h
Not needed, shift it into the source files that need it instead.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/2021
block: don't include <linux/part_stat.h> in blk.h
Not needed, shift it into the source files that need it instead.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211123185312.1432157-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
2aa7745b |
| 23-Nov-2021 |
Christoph Hellwig <hch@lst.de> |
block: don't include blk-mq-sched.h in blk.h
No needed, shift it into the source files that need it instead.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/2021112318
block: don't include blk-mq-sched.h in blk.h
No needed, shift it into the source files that need it instead.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211123185312.1432157-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
0c5bcc92 |
| 23-Nov-2021 |
Christoph Hellwig <hch@lst.de> |
blk-mq: simplify the plug handling in blk_mq_submit_bio
blk_mq_submit_bio has two different plug cases, one that uses full plugging and a limited plugging one.
The limited plugging case is only use
blk-mq: simplify the plug handling in blk_mq_submit_bio
blk_mq_submit_bio has two different plug cases, one that uses full plugging and a limited plugging one.
The limited plugging case is only used for a corner case that does not matter in real life:
- no ->commit_rqs (so not NVMe) - no shared tags (so not SCSI) - not rotational (so no old disk or floppy driver) - must have multiple queues (so no eMMC)
Remove the limited merging case and all the related junk to simplify blk_mq_submit_bio and the functions called from it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211123160443.1315598-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.4, v5.15.3, v5.15.2, v5.15.1 |
|
#
a1cb6537 |
| 02-Nov-2021 |
Ming Lei <ming.lei@redhat.com> |
blk-mq: only try to run plug merge if request has same queue with incoming bio
It is obvious that io merge can't be done between two different queues, so just try to run io merge in case of same que
blk-mq: only try to run plug merge if request has same queue with incoming bio
It is obvious that io merge can't be done between two different queues, so just try to run io merge in case of same queue.
Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20211102133502.3619184-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15, v5.14.14 |
|
#
859897c3 |
| 19-Oct-2021 |
Pavel Begunkov <asml.silence@gmail.com> |
block: convert leftovers to bdev_get_queue
Convert bdev->bd_disk->queue to bdev_get_queue(), which is faster. Apparently, there are a few such spots in block that got lost during rebases.
Reviewed-
block: convert leftovers to bdev_get_queue
Convert bdev->bd_disk->queue to bdev_get_queue(), which is faster. Apparently, there are a few such spots in block that got lost during rebases.
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
bc490f81 |
| 18-Oct-2021 |
Jens Axboe <axboe@kernel.dk> |
block: change plugging to use a singly linked list
Use a singly linked list for the blk_plug. This saves 8 bytes in the blk_plug struct, and makes for faster list manipulations than doubly linked li
block: change plugging to use a singly linked list
Use a singly linked list for the blk_plug. This saves 8 bytes in the blk_plug struct, and makes for faster list manipulations than doubly linked lists. As we don't use the doubly linked lists for anything, singly linked is just fine.
This yields a bump in default (merging enabled) performance from 7.0 to 7.1M IOPS, and ~7.5M IOPS with merging disabled.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
87c037d1 |
| 18-Oct-2021 |
Jens Axboe <axboe@kernel.dk> |
block: return whether or not to unplug through boolean
Instead of returning the same queue request through a request pointer, use a boolean to accomplish the same.
Reviewed-by: Christoph Hellwig <h
block: return whether or not to unplug through boolean
Instead of returning the same queue request through a request pointer, use a boolean to accomplish the same.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.14.13 |
|
#
abd45c15 |
| 13-Oct-2021 |
Jens Axboe <axboe@kernel.dk> |
block: handle fast path of bio splitting inline
The fast path is no splitting needed. Separate the handling into a check part we can inline, and an out-of-line handling path if we do need to split.
block: handle fast path of bio splitting inline
The fast path is no splitting needed. Separate the handling into a check part we can inline, and an out-of-line handling path if we do need to split.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.14.12 |
|
#
6ce913fe |
| 12-Oct-2021 |
Christoph Hellwig <hch@lst.de> |
block: rename REQ_HIPRI to REQ_POLLED
Unlike the RWF_HIPRI userspace ABI which is intentionally kept vague, the bio flag is specific to the polling implementation, so rename and document it properly
block: rename REQ_HIPRI to REQ_POLLED
Unlike the RWF_HIPRI userspace ABI which is intentionally kept vague, the bio flag is specific to the polling implementation, so rename and document it properly.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Tested-by: Mark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
d38a9c04 |
| 14-Oct-2021 |
Jens Axboe <axboe@kernel.dk> |
block: only check previous entry for plug merge attempt
Currently we scan the entire plug list, which is potentially very expensive. In an IOPS bound workload, we can drive about 5.6M IOPS with merg
block: only check previous entry for plug merge attempt
Currently we scan the entire plug list, which is potentially very expensive. In an IOPS bound workload, we can drive about 5.6M IOPS with merging enabled, and profiling shows that the plug merge check is the (by far) most expensive thing we're doing:
Overhead Command Shared Object Symbol + 20.89% io_uring [kernel.vmlinux] [k] blk_attempt_plug_merge + 4.98% io_uring [kernel.vmlinux] [k] io_submit_sqes + 4.78% io_uring [kernel.vmlinux] [k] blkdev_direct_IO + 4.61% io_uring [kernel.vmlinux] [k] blk_mq_submit_bio
Instead of browsing the whole list, just check the previously inserted entry. That is enough for a naive merge check and will catch most cases, and for devices that need full merging, the IO scheduler attached to such devices will do that anyway. The plug merge is meant to be an inexpensive check to avoid getting a request, but if we repeatedly scan the list for every single insert, it is very much not a cheap check.
With this patch, the workload instead runs at ~7.0M IOPS, providing a 25% improvement. Disabling merging entirely yields another 5% improvement.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
ff18d77b |
| 12-Oct-2021 |
Christoph Hellwig <hch@lst.de> |
block: move bio_get_{first,last}_bvec out of bio.h
bio_get_first_bvec and bio_get_last_bvec are only used in blk-merge.c, so move them there.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: htt
block: move bio_get_{first,last}_bvec out of bio.h
bio_get_first_bvec and bio_get_last_bvec are only used in blk-merge.c, so move them there.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211012161804.991559-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.14.11, v5.14.10 |
|
#
a7b36ee6 |
| 05-Oct-2021 |
Jens Axboe <axboe@kernel.dk> |
block: move blk-throtl fast path inline
Even if no policies are defined, we spend ~2% of the total IO time checking. Move the fast path inline.
Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Je
block: move blk-throtl fast path inline
Even if no policies are defined, we spend ~2% of the total IO time checking. Move the fast path inline.
Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.14.9, v5.14.8, v5.14.7 |
|
#
fe45e630 |
| 20-Sep-2021 |
Christoph Hellwig <hch@lst.de> |
block: move integrity handling out of <linux/blkdev.h>
Split the integrity/metadata handling definitions out into a new header.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes T
block: move integrity handling out of <linux/blkdev.h>
Split the integrity/metadata handling definitions out into a new header.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-17-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
badf7f64 |
| 20-Sep-2021 |
Christoph Hellwig <hch@lst.de> |
block: move a few merge helpers out of <linux/blkdev.h>
These are block-layer internal helpers, so move them to block/blk.h and block/blk-merge.c. Also update a comment a bit to use better grammar.
block: move a few merge helpers out of <linux/blkdev.h>
These are block-layer internal helpers, so move them to block/blk.h and block/blk-merge.c. Also update a comment a bit to use better grammar.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-16-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
2e76c69c |
| 14-Mar-2022 |
Tejun Heo <tj@kernel.org> |
block: don't merge across cgroup boundaries if blkcg is enabled
commit 6b2b04590b51aa4cf395fcd185ce439cab5961dc upstream.
blk-iocost and iolatency are cgroup aware rq-qos policies but they didn't d
block: don't merge across cgroup boundaries if blkcg is enabled
commit 6b2b04590b51aa4cf395fcd185ce439cab5961dc upstream.
blk-iocost and iolatency are cgroup aware rq-qos policies but they didn't disable merges across different cgroups. This obviously can lead to accounting and control errors but more importantly to priority inversions - e.g. an IO which belongs to a higher priority cgroup or IO class may end up getting throttled incorrectly because it gets merged to an IO issued from a low priority cgroup.
Fix it by adding blk_cgroup_mergeable() which is called from merge paths and rejects cross-cgroup and cross-issue_as_root merges.
Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: d70675121546 ("block: introduce blk-iolatency io controller") Cc: stable@vger.kernel.org # v4.19+ Cc: Josef Bacik <jbacik@fb.com> Link: https://lore.kernel.org/r/Yi/eE/6zFNyWJ+qd@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|