# ed00aabd | 01-Jul-2020 | Christoph Hellwig <hch@lst.de>

block: rename generic_make_request to submit_bio_noacct
generic_make_request has always been very confusingly misnamed, so rename it to submit_bio_noacct to make it clear that it is submit_bio minus accounting and a few checks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
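
A minimal sketch of the relationship the new name conveys (illustrative only, written from the description above rather than from the tree):

    /* Illustrative sketch, not the actual block layer source. */
    blk_qc_t submit_bio(struct bio *bio)
    {
        /* ... accounting and a few checks happen here ... */
        return submit_bio_noacct(bio);  /* the old generic_make_request() */
    }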

# f695ca38 | 01-Jul-2020 | Christoph Hellwig <hch@lst.de>

block: remove the request_queue argument from blk_queue_split
The queue can be trivially derived from the bio, so pass one less argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
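
A hedged before/after sketch of the interface change (the body is an assumption; in kernels of this era the queue is reachable through the bio's disk):

    /* Before: void blk_queue_split(struct request_queue *q, struct bio **bio); */
    void blk_queue_split(struct bio **bio)
    {
        /* Derive the queue from the bio instead of taking it as an argument. */
        struct request_queue *q = (*bio)->bi_disk->queue;

        /* ... split (*bio) against q's limits exactly as before ... */
    }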

# f3bdc62f | 17-Jun-2020 | Jan Kara <jack@suse.cz>

blktrace: Provide event for request merging
Currently blk-mq does not report any event when two requests get merged in the elevator. This results in a difficult-to-understand sequence of events like:
    ...
    8,0   34  1579  0.608765271  2718  I  WS 215023504 + 40 [dbench]
    8,0   34  1584  0.609184613  2719  A  WS 215023544 + 56 <- (8,4) 2160568
    8,0   34  1585  0.609184850  2719  Q  WS 215023544 + 56 [dbench]
    8,0   34  1586  0.609188524  2719  G  WS 215023544 + 56 [dbench]
    8,0    3   602  0.609684162   773  D  WS 215023504 + 96 [kworker/3:1H]
    8,0   34  1591  0.609843593     0  C  WS 215023504 + 96 [0]
and you can only guess (after quite some headscratching, since the above excerpt is intermixed with a lot of other IO) that request 215023544+56 got merged into request 215023504+40. Provide a proper event for request merging, like we used to do in the legacy block layer.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Revision tags: v5.4.44, v5.7, v5.4.43

# 524f9ffd | 27-May-2020 | Christoph Hellwig <hch@lst.de>

block: reduce part_stat_lock() scope
We only need the stats lock (aka preempt_disable()) for updating the stats, not for looking up or dropping the hd_struct reference.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
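
A hedged sketch of the narrowed locking (part_stat_lock()/part_stat_unlock() are the real helper names; the surrounding lookup is simplified and illustrative):

    const int sgrp = op_stat_group(req_op(rq));

    /* Look up and reference the partition outside the stats lock ... */
    struct hd_struct *part = disk_map_sector_rcu(rq->rq_disk, blk_rq_pos(rq));

    /* ... and only disable preemption around the per-cpu stat updates. */
    part_stat_lock();
    part_stat_inc(part, ios[sgrp]);
    part_stat_unlock();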

# b9c54f56 | 27-May-2020 | Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

block: account merge of two requests
Also rename blk_account_io_merge() into blk_account_io_merge_request() to distinguish it from the merging of a request and a bio.
[hch: rebased]

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Revision tags: v5.4.42, v5.4.41

# 76268f3a | 13-May-2020 | Christoph Hellwig <hch@lst.de>

block: don't call part_{inc,dec}_in_flight for blk-mq devices
part_inc_in_flight and part_dec_in_flight are no-ops for blk-mq queues, so remove the calls in purely blk-mq callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

# a892c8d5 | 13-May-2020 | Satya Tangirala <satyat@google.com>

block: Inline encryption support for blk-mq
We must have some way of letting a storage device driver know what encryption context it should use for en/decrypting a request. However, it's the upper layers (like the filesystem/fscrypt) that know about and manage encryption contexts. As such, when the upper layer submits a bio to the block layer, and this bio eventually reaches a device driver with support for inline encryption, the device driver will need to have been told the encryption context for that bio.
We want to communicate the encryption context from the upper layer to the storage device along with the bio, when the bio is submitted to the block layer. To do this, we add a struct bio_crypt_ctx to struct bio, which can represent an encryption context (note that we can't use the bi_private field in struct bio to do this because that field does not function to pass information across layers in the storage stack). We also introduce various functions to manipulate the bio_crypt_ctx and make the bio/request merging logic aware of the bio_crypt_ctx.
We also make changes to blk-mq to make it handle bios with encryption contexts. blk-mq can merge many bios into the same request. These bios need to have contiguous data unit numbers (the necessary changes to blk-merge are also made to ensure this) - as such, it suffices to keep the data unit number of just the first bio, since that's all a storage driver needs to infer the data unit number to use for each data block in each bio in a request. blk-mq keeps track of the encryption context to be used for all the bios in a request with the request's rq_crypt_ctx. When the first bio is added to an empty request, blk-mq will program the encryption context of that bio into the request_queue's keyslot manager, and store the returned keyslot in the request's rq_crypt_ctx. All the functions to operate on encryption contexts are in blk-crypto.c.
Upper layers only need to call bio_crypt_set_ctx with the encryption key, algorithm and data_unit_num; they don't have to worry about getting a keyslot for each encryption context, as blk-mq/blk-crypto handles that. Blk-crypto also makes it possible for request-based layered devices like dm-rq to make use of inline encryption hardware by cloning the rq_crypt_ctx and programming a keyslot in the new request_queue when necessary.
Note that any user of the block layer can submit bios with an encryption context, such as filesystems, device-mapper targets, etc.

Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
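
A hedged sketch of the upper-layer call described above (bio_crypt_set_ctx is named in the commit; the key, DUN value, and GFP flag here are illustrative assumptions):

    /*
     * Illustrative: attach an encryption context, then submit as usual.
     * 'crypto_key' is a previously prepared struct blk_crypto_key and
     * 'first_dun' a hypothetical starting data unit number.
     */
    u64 dun[BLK_CRYPTO_DUN_ARRAY_SIZE] = { first_dun };

    bio_crypt_set_ctx(bio, &crypto_key, dun, GFP_NOIO);
    submit_bio(bio);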

Revision tags: v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36

# 0376e9ef | 28-Apr-2020 | Christoph Hellwig <hch@lst.de>

block: replace BIO_QUEUE_ENTERED with BIO_CGROUP_ACCT
BIO_QUEUE_ENTERED is only used for cgroup accounting now, so rename the flag and move setting it into the cgroup code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Revision tags: v5.4.35, v5.4.34, v5.4.33

# bdf8710d | 14-Apr-2020 | Christoph Hellwig <hch@lst.de>

block: move dma_pad handling from blk_rq_map_sg into the callers
There are only two callers of blk_rq_map_sg/__blk_rq_map_sg that set the dma_pad value in the queue. Move the handling into those callers instead of burdening the common code, and move the ->extra_len field from struct request to struct scsi_cmnd.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

# cc97923a | 14-Apr-2020 | Christoph Hellwig <hch@lst.de>

block: move dma drain handling to scsi
Don't burden the common block code with the specifics of the libata DMA draining mechanism. Instead move most of the code to the scsi midlayer.
That also means the nr_phys_segments adjustments in the blk-mq fast path can go away entirely, given that SCSI never looks at nr_phys_segments after mapping the request to a scatterlist.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

# 89de1504 | 14-Apr-2020 | Christoph Hellwig <hch@lst.de>

block: provide a blk_rq_map_sg variant that returns the last element
To be able to move some of the special purpose hacks in blk_rq_map_sg into the callers we need a variant that returns the last mapped S/G list element to the caller. Add that variant as __blk_rq_map_sg and make blk_rq_map_sg a trivial inline wrapper around it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
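
A hedged sketch of the wrapper arrangement the commit describes (written from the description, "a trivial inline wrapper"; the exact argument order is an assumption):

    static inline int blk_rq_map_sg(struct request_queue *q, struct request *rq,
                                    struct scatterlist *sglist)
    {
        struct scatterlist *last_sg = NULL;

        /* __blk_rq_map_sg() additionally reports the last mapped element. */
        return __blk_rq_map_sg(q, rq, sglist, &last_sg);
    }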

# e64a0e16 | 14-Apr-2020 | Christoph Hellwig <hch@lst.de>

block: remove RQF_COPY_USER
RQF_COPY_USER is set for bios where the passthrough request mapping helpers decided that bounce buffering is required. It is then used to pad the scatterlist for drivers that require it. But given that non-passthrough requests are by definition aligned, and directly mapped pass-through requests must be aligned, it is not actually required at all.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Revision tags: v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11

# 4a2f704e | 11-Jan-2020 | Ming Lei <ming.lei@redhat.com>

block: fix get_max_segment_size() overflow on 32bit arch
Commit 429120f3df2d started taking the segment's start DMA address into account when computing the max segment size, using the 'unsigned long' data type for the arithmetic. However, the segment boundary mask may be 0xffffffff, so the computed segment size can overflow in case of a zero physical address on a 32-bit arch.
Fix the issue by returning queue_max_segment_size() directly when that happens.

Fixes: 429120f3df2d ("block: fix splitting segments on boundary masks")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Christoph Hellwig <hch@lst.de>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
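
A hedged illustration of the overflow and the described fallback (min_not_zero() and queue_max_segment_size() are real kernel helpers; the surrounding shape is an assumption):

    unsigned long mask = 0xffffffffUL;        /* 32-bit segment boundary mask */
    unsigned long offset = mask & phys_addr;  /* 0 for a zero physical address */

    /*
     * mask - offset + 1 == 0x100000000 here, which wraps to 0 in a 32-bit
     * unsigned long; min_not_zero() then falls back to the queue's max
     * segment size instead of returning a bogus zero-sized limit.
     */
    unsigned long max_seg = min_not_zero(mask - offset + 1,
                    (unsigned long)queue_max_segment_size(q));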

Revision tags: v5.4.10, v5.4.9, v5.4.8, v5.4.7

# 429120f3 | 28-Dec-2019 | Ming Lei <ming.lei@redhat.com>

block: fix splitting segments on boundary masks
We ran into a problem with an mpt3sas based controller, where we would see random (and hard to reproduce) file corruption. The issue seemed specific to this controller, but wasn't specific to the file system. After a lot of debugging, we found out that it's caused by segments spanning a 4G memory boundary. This shouldn't happen, as the default setting for segment boundary masks is 4G.
Turns out there are two issues in get_max_segment_size():
1) The default segment boundary mask is bypassed
2) The segment start address isn't taken into account when checking segment boundary limit
Fix these two issues by removing the bypass of the segment boundary check even if the mask is set to the default value, and taking into account the actual start address of the request when checking if a segment needs splitting.

Cc: stable@vger.kernel.org # v5.1+
Reviewed-by: Chris Mason <clm@fb.com>
Tested-by: Chris Mason <clm@fb.com>
Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Dropped const on the page pointer; ppc page_to_phys() doesn't mark the page as const...
Signed-off-by: Jens Axboe <axboe@kernel.dk>
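
A hedged sketch of a check covering both issues above (illustrative; the point is that measuring from the segment's actual physical start catches a 4G-crossing segment even with the default mask):

    unsigned long mask = queue_segment_boundary(q);    /* default: 4G - 1 */
    unsigned long offset = mask & (page_to_phys(page) + bv_offset);

    /* Never let a segment extend past the boundary following its start. */
    unsigned long max_len = mask - offset + 1;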

Revision tags: v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13

# 1e279153 | 21-Nov-2019 | Jens Axboe <axboe@kernel.dk>

Revert "block: split bio if the only bvec's length is > SZ_4K"
We really don't need this, as the slow path will do the right thing anyway.
This reverts commit 6952a7f8446ee85ea9d10ab87b64797a031eaa
Revert "block: split bio if the only bvec's length is > SZ_4K"
We really don't need this, as the slow path will do the right thing anyway.
This reverts commit 6952a7f8446ee85ea9d10ab87b64797a031eaae3.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Revision tags: v5.3.12, v5.3.11, v5.3.10

# 6952a7f8 | 08-Nov-2019 | Ming Lei <ming.lei@redhat.com>

block: split bio if the only bvec's length is > SZ_4K
A 64K PAGE_SIZE is popular on ARM64 and other architectures, and 64K is probably big enough to break some devices, so change the logic to split the bio if the only bvec's length is > SZ_4K instead of PAGE_SIZE.

Fixes: fa5322872187 (block: avoid blk_bio_segment_split for small I/O operations)
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

# 59db8ba2 | 08-Nov-2019 | Ming Lei <ming.lei@redhat.com>

block: still try to split bio if the bvec crosses pages
Some devices may set the segment boundary to PAGE_SIZE - 1. If a bvec crosses pages while its length is <= PAGE_SIZE, we still need to split the bvec into 2 segments.
Fix this issue by still splitting the bio if the single bvec crosses pages.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: fa5322872187 (block: avoid blk_bio_segment_split for small I/O operations)
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
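
A hedged helper sketch of the condition involved (the helper name is hypothetical; since bv_offset < PAGE_SIZE for a page-relative offset, an end past PAGE_SIZE means the bvec spills into the next page):

    /* Hypothetical helper: does a single bvec cross a page boundary? */
    static inline bool bvec_crosses_page(const struct bio_vec *bv)
    {
        return bv->bv_offset + bv->bv_len > PAGE_SIZE;
    }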

Revision tags: v5.3.9

# fa532287 | 04-Nov-2019 | Christoph Hellwig <hch@lst.de>

block: avoid blk_bio_segment_split for small I/O operations
__blk_queue_split() adds significant overhead for small I/O operations. Add a shortcut to avoid it for cases where we know we never need to split.
Based on a patch from Ming Lei.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
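
A hedged sketch of the kind of shortcut described, as it might appear inside __blk_queue_split() (the exact condition is an assumption, consistent with the two follow-up fixes later in this log):

    /*
     * Illustrative fast path: a bio whose single bvec lies entirely within
     * one page can never need splitting, so skip blk_bio_segment_split().
     */
    if (bio->bi_vcnt == 1 &&
        bio->bi_io_vec[0].bv_offset + bio->bi_io_vec[0].bv_len <= PAGE_SIZE) {
        *nr_segs = 1;
        return;
    }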

Revision tags: v5.3.8, v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3, v5.3.2, v5.3.1, v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12, v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6

# 9cc5169c | 01-Aug-2019 | Bart Van Assche <bvanassche@acm.org>

block: Improve physical block alignment of split bios
Consider the following example:
* The logical block size is 4 KB.
* The physical block size is 8 KB.
* max_sectors equals (16 KB >> 9) sectors.
* A non-aligned 4 KB and an aligned 64 KB bio are merged into a single non-aligned 68 KB bio.
The current behavior is to split such a bio into (16 KB + 16 KB + 16 KB + 16 KB + 4 KB). None of these five bios starts on a physical block boundary.
This patch ensures that such a bio is split into four aligned and one non-aligned bio instead of being split into five non-aligned bios. This improves performance because most block devices can handle aligned requests faster than non-aligned requests.
Since the physical block size is larger than or equal to the logical block size, this patch preserves the guarantee that the returned value is a multiple of the logical block size.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
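
A hedged worked example derived from the description above (a 68 KB bio starting 4 KB into an 8 KB physical block, with a 16 KB max_sectors cap):

    old split: 16 + 16 + 16 + 16 + 4 KB   (all five fragments start misaligned)
    new split: 12 + 16 + 16 + 16 + 8 KB   (fragments 2-5 start on 8 KB boundaries)

Shortening the first fragment to 12 KB moves every subsequent fragment onto a physical block boundary, which is the "four aligned and one non-aligned" outcome the commit describes.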

# 708b25b3 | 01-Aug-2019 | Bart Van Assche <bvanassche@acm.org>

block: Simplify blk_bio_segment_split()
Move the max_sectors check into bvec_split_segs() such that a single call to that function can do all the necessary checks. This patch optimizes the fast path further, namely if a bvec fits in a page.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

# ff9811b3 | 01-Aug-2019 | Bart Van Assche <bvanassche@acm.org>

block: Simplify bvec_split_segs()
Simplify this function by removing two if-tests. Other than requiring that the @sectors pointer is not NULL, this patch does not change the behavior of bvec_split_segs().

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

# dad77584 | 01-Aug-2019 | Bart Van Assche <bvanassche@acm.org>

block: Document the bio splitting functions
Since what the bio splitting functions do is nontrivial, document these functions.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

# af2c68fe | 01-Aug-2019 | Bart Van Assche <bvanassche@acm.org>

block: Declare several function pointer arguments 'const'
Make it clear to the compiler and also to humans that the functions that query request queue properties do not modify any member of the request_queue data structure.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
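
A hedged sketch of the pattern (the specific helper is an illustrative choice, not necessarily one the patch touched):

    /* 'const' tells both the compiler and the reader the query is read-only. */
    static inline unsigned long queue_segment_boundary(const struct request_queue *q)
    {
        return q->limits.seg_boundary_mask;
    }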

Revision tags: v5.2.5, v5.2.4, v5.2.3, v5.2.2, v5.2.1, v5.2

# d665e12a | 03-Jul-2019 | Christoph Hellwig <hch@lst.de>

block: nr_phys_segments needs to be zero for REQ_OP_WRITE_ZEROES
Fix a regression introduced when removing bi_phys_segments for Write Zeroes requests, which need to have a segment count of zero, as they don't have a payload.

Fixes: 14ccb66b3f58 ("block: remove the bi_phys_segments field in struct bio")
Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Revision tags: v5.1.16, v5.1.15, v5.1.14, v5.1.13, v5.1.12, v5.1.11, v5.1.10, v5.1.9, v5.1.8

# d627065d | 06-Jun-2019 | Christoph Hellwig <hch@lst.de>

block: untangle the end of blk_bio_segment_split
Now that we don't need to assign the front/back segment sizes, we can duplicate the segs assignment for the split vs no-split case and remove a whole chunk of boilerplate code.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>