#
e9cd19c0 |
| 06-Jun-2019 |
Christoph Hellwig <hch@lst.de> |
block: simplify blk_recalc_rq_segments
Return the segments and let the callers assign them, which makes the code a little more obvious. Also pass the request instead of q plus bio chain, allowing for the use of rq_for_each_bvec.
Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
14ccb66b |
| 06-Jun-2019 |
Christoph Hellwig <hch@lst.de> |
block: remove the bi_phys_segments field in struct bio
We only need the number of segments in the blk-mq submission path. Remove the field from struct bio, and return it from a variant of blk_queue_split instead, so that it can be passed as an argument to those functions that need the value.
This also means we stop recounting segments except for cloning and partial segments.
To keep the number of arguments in this hot path down, remove pointless struct request_queue arguments from any of the functions that had it and grew a nr_segs argument.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.1.7, v5.1.6, v5.1.5, v5.1.4 |
|
#
6869875f |
| 21-May-2019 |
Christoph Hellwig <hch@lst.de> |
block: remove the bi_seg_{front,back}_size fields in struct bio
At this point these fields aren't used for anything, so we can remove them.
Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
200a9aff |
| 21-May-2019 |
Christoph Hellwig <hch@lst.de> |
block: remove the segment size check in bio_will_gap
We fundamentally do not have a maximum segment size for devices with a virt boundary. So don't bother checking it, especially given that the existing checks didn't properly work to start with, as we never fully update the front/back segment size and miss the bi_seg_front_size that would have been required for some cases.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
eded341c |
| 21-May-2019 |
Christoph Hellwig <hch@lst.de> |
block: don't decrement nr_phys_segments for physically contigous segments
Currently ll_merge_requests_fn, unlike all other merge functions, reduces nr_phys_segments by one if the last segment of the previous, and the first segment of the next segment are contiguous. While this seems like a nice solution to avoid building smaller than possible requests, it causes a mismatch between the segments actually present in the request and those iterated over by the bvec iterators, including __rq_for_each_bio. This can for example mistrigger the single segment optimization in the nvme-pci driver, and might lead to mismatching nr_phys_segments number when recalculating the number of request when inserting a cloned request.
We could possibly work around this by making the bvec iterators take the front and back segment size into account, but that would require moving them from the bio to the bio_iter and spreading this mess over all users of bvecs. Or we could simply remove this optimization under the assumption that most users already build good enough bvecs, and that the bio merge patch never cared about this optimization either. The latter is what this patch does.
Fixes: dff824b2aadb ("nvme-pci: optimize mapping of small single segment requests") Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.1.3, v5.1.2, v5.1.1, v5.0.14, v5.1, v5.0.13, v5.0.12, v5.0.11, v5.0.10, v5.0.9 |
|
#
f9f76879 |
| 19-Apr-2019 |
Christoph Hellwig <hch@lst.de> |
block: avoid scatterlist offsets > PAGE_SIZE
While we generally allow scatterlists to have offsets larger than page size for an entry, and other subsystems like the crypto code make use of that, the block layer isn't quite ready for that. Flip the switch back to avoid them for now, and revisit that decision early in a merge window once the known offenders are fixed.
Fixes: 8a96a0e40810 ("block: rewrite blk_bvec_map_sg to avoid a nth_page call") Reviewed-by: Ming Lei <ming.lei@redhat.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.0.8 |
|
#
8a96a0e4 |
| 11-Apr-2019 |
Christoph Hellwig <hch@lst.de> |
block: rewrite blk_bvec_map_sg to avoid a nth_page call
The offset in scatterlists is allowed to be larger than the page size, so don't go to great lengths to avoid that case and simplify the arithmetic.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.0.7, v5.0.6 |
|
#
b21e11c5 |
| 01-Apr-2019 |
Ming Lei <ming.lei@redhat.com> |
block: fix build warning in merging bvecs
Commit f6970f83ef79 ("block: don't check if adjacent bvecs in one bio can be mergeable") changes bvec merge by only considering two bvecs from different bios. However, if the former bio doesn't include any io bvec, then the following warning may be triggered:
warning: ‘bvec.bv_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]
In practice, it shouldn't be triggered.
Fix it by adding a check on the former bio; the check shouldn't add any cost given 'bio->bi_iter' can be hit in cache.
Reported-by: Jens Axboe <axboe@kernel.dk> Fixes: f6970f83ef79 ("block: don't check if adjacent bvecs in one bio can be mergeable") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.0.5, v5.0.4, v5.0.3 |
|
#
f6970f83 |
| 17-Mar-2019 |
Ming Lei <ming.lei@redhat.com> |
block: don't check if adjacent bvecs in one bio can be mergeable
Now that both passthrough and FS IO support multi-page bvecs, and bvec merging is already handled when adding a page to a bio, adjacent bvecs are no longer mergeable if they belong to the same bio.
So only try to merge bvecs if they are from different bios.
Cc: Omar Sandoval <osandov@fb.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
16e3e418 |
| 17-Mar-2019 |
Ming Lei <ming.lei@redhat.com> |
block: reuse __blk_bvec_map_sg() for mapping page sized bvec
Inside __blk_segment_map_sg(), page sized bvec mapping is optimized a bit with one standalone branch.
So reuse __blk_bvec_map_sg() to do that.
Cc: Omar Sandoval <osandov@fb.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
cae6c2e5 |
| 17-Mar-2019 |
Ming Lei <ming.lei@redhat.com> |
block: remove argument of 'request_queue' from __blk_bvec_map_sg
The argument of 'request_queue' isn't used by __blk_bvec_map_sg(), so remove it.
Cc: Omar Sandoval <osandov@fb.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
fd7d8d42 |
| 17-Mar-2019 |
Ming Lei <ming.lei@redhat.com> |
block: don't merge adjacent bvecs to one segment in bio blk_queue_split
For normal filesystem IO, each page is added via bio_add_page(), in which bvec (page) merging has already been handled, so it is basically not possible to merge two adjacent bvecs in one bio.
So don't try to merge two adjacent bvecs in blk_queue_split().
Cc: Omar Sandoval <osandov@fb.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27, v5.0 |
|
#
05b700ba |
| 03-Mar-2019 |
Ming Lei <ming.lei@redhat.com> |
block: fix segment calculation for passthrough IO
blk_recount_segments() can be called in bio_add_pc_page() to calculate how many segments this bio will have after one page is added to it. If the resulting segment number is beyond the queue limit, the added page will be removed.
The try-and-fix policy requires blk_recount_segments(__blk_recalc_rq_segments) to not consider the segment number limit. Unfortunately bvec_split_segs() does check this limit, which causes a too-small segment number to be returned to bio_add_pc_page(), so the page may still be added to the bio even though the segment number limit is exceeded.
Fix this issue by not considering the segment number limit when calculating the bio's segment number.
Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count") Cc: Christoph Hellwig <hch@lst.de> Cc: Omar Sandoval <osandov@fb.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
aaeee62c |
| 02-Mar-2019 |
Ming Lei <ming.lei@redhat.com> |
block: fix updating bio's front segment size
When the current bvec can be merged to the 1st segment, the bio's front segment size has to be updated.
However, dcebd755926b doesn't consider that case, so the bio's front segment size may not be correct.
This patch fixes this issue.
Cc: Christoph Hellwig <hch@lst.de> Cc: Omar Sandoval <osandov@fb.com> Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
bbcbbd56 |
| 27-Feb-2019 |
Ming Lei <ming.lei@redhat.com> |
block: optimize blk_bio_segment_split for single-page bvec
Introduce a fast path for single-page bvec IO, so that we can avoid calling bvec_split_segs() unnecessarily.
Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
48d7727c |
| 27-Feb-2019 |
Ming Lei <ming.lei@redhat.com> |
block: optimize __blk_segment_map_sg() for single-page bvec
Introduce a fast path for single-page bvec IO, so that blk_bvec_map_sg() can be avoided.
Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
4d633062 |
| 27-Feb-2019 |
Ming Lei <ming.lei@redhat.com> |
block: introduce bvec_nth_page()
Single-page bvecs are often seen in small BS workloads, so introduce bvec_nth_page() to avoid calling nth_page() unnecessarily, which does not look cheap.
Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v4.19.26, v4.19.25, v4.19.24 |
|
#
49b1f22b |
| 18-Feb-2019 |
Ming Lei <ming.lei@redhat.com> |
block: avoid to READ fields of null bio
rq->bio can be NULL sometimes, such as flush request, so don't read bio->bi_seg_front_size until this 'bio' is checked as valid.
Cc: Bart Van Assche <bvanassche@acm.org> Reported-by: Bart Van Assche <bvanassche@acm.org> Fixes: dcebd755926b0f39dd1e ("block: use bio_for_each_bvec() to compute multi-page bvec count") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
2705c937 |
| 15-Feb-2019 |
Ming Lei <ming.lei@redhat.com> |
block: kill QUEUE_FLAG_NO_SG_MERGE
Since bdced438acd83ad83a6c ("block: setup bi_phys_segments after splitting"), physical segment number is mainly figured out in blk_queue_split() for fast path, and the flag of BIO_SEG_VALID is set there too.
Now only blk_recount_segments() and blk_recalc_rq_segments() use this flag.
Basically blk_recount_segments() is bypassed in fast path given BIO_SEG_VALID is set in blk_queue_split().
For another user of blk_recalc_rq_segments():
- run in partial completion branch of blk_update_request, which is an unusual case
- run in blk_cloned_rq_check_limits(), still not a big problem if the flag is killed since dm-rq is the only user.
Multi-page bvec is enabled now, not doing S/G merging is rather pointless with the current setup of the I/O path, as it isn't going to save you a significant amount of cycles.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
862e5a5e |
| 15-Feb-2019 |
Ming Lei <ming.lei@redhat.com> |
block: use bio_for_each_bvec() to map sg
It is more efficient to use bio_for_each_bvec() to map sg; meanwhile we have to consider splitting multi-page bvecs as done in blk_bio_segment_split().
Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
dcebd755 |
| 15-Feb-2019 |
Ming Lei <ming.lei@redhat.com> |
block: use bio_for_each_bvec() to compute multi-page bvec count
First it is more efficient to use bio_for_each_bvec() in both blk_bio_segment_split() and __blk_recalc_rq_segments() to compute how many multi-page bvecs there are in the bio.
Secondly, once bio_for_each_bvec() is used, the bvec may need to be split because its length can be much larger than the max segment size, so we have to split the big bvec into several segments.
Thirdly, when splitting a multi-page bvec into segments, the max segment limit may be reached, so the bio split needs to be considered in this situation too.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
1a67356e |
| 15-Feb-2019 |
Ming Lei <ming.lei@redhat.com> |
block: don't use bio->bi_vcnt to figure out segment number
It is wrong to use bio->bi_vcnt to figure out how many segments there are in the bio even though the CLONED flag isn't set on this bio, because this bio may have been split or advanced.
So always use bio_segments() in blk_recount_segments(), and it shouldn't cause any performance loss now because the physical segment number is figured out in blk_queue_split() and BIO_SEG_VALID is set meantime since bdced438acd83ad83a6c ("block: setup bi_phys_segments after splitting").
Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Fixes: 76d8137a3113 ("blk-merge: recaculate segment if it isn't less than max segments") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v4.19.23, v4.19.22, v4.19.21, v4.19.20, v4.19.19 |
|
#
947b7ac1 |
| 27-Jan-2019 |
Jens Axboe <axboe@kernel.dk> |
Revert "block: cover another queue enter recursion via BIO_QUEUE_ENTERED"
We can't touch a bio after ->make_request_fn(), for all we know it could already have been completed by the time this function returns.
This reverts commit 698cef173983b086977e633e46476e0f925ca01e.
Reported-by: syzbot+4df6ca820108fd248943@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v4.19.18, v4.19.17 |
|
#
698cef17 |
| 22-Jan-2019 |
Ming Lei <ming.lei@redhat.com> |
block: cover another queue enter recursion via BIO_QUEUE_ENTERED
Besides blk_queue_split(), bio_split() is used for splitting bios too, and the remaining bio is often resubmitted to the queue via generic_make_request(). So the same queue enter recursion exists in this case too. Unfortunately commit cd4a4ae4683dc2 doesn't help this case.
This patch covers the above case by setting BIO_QUEUE_ENTERED before calling q->make_request_fn.
In theory the per-bio flag is used to simulate one stack variable, it is just fine to clear it after q->make_request_fn is returned. Especially the same bio can't be submitted from another context.
Fixes: cd4a4ae4683dc2 ("block: don't use blocking queue entered for recursive bio submits") Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: NeilBrown <neilb@suse.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v4.19.16, v4.19.15, v4.19.14, v4.19.13, v4.19.12, v4.19.11, v4.19.10 |
|
#
38417468 |
| 13-Dec-2018 |
Christoph Hellwig <hch@lst.de> |
scsi: block: remove the cluster flag
Now that the SCSI layer replaced the use of the cluster flag with segment size limits and the DMA boundary we can remove the cluster flag from the block layer.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|