#
d2250f10 |
| 14-Dec-2016 |
Song Liu <songliubraving@fb.com> |
md/r5cache: assign conf->log before r5l_load_log() r5l_load_log() calls functions that requires a proper conf->log, for example, r5c_is_writeback(). Therefore, we should set conf->lo
md/r5cache: assign conf->log before r5l_load_log() r5l_load_log() calls functions that requires a proper conf->log, for example, r5c_is_writeback(). Therefore, we should set conf->log before calling r5l_load_log(). If r5l_load_log() fails, conf->log is set back to NULL. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
3c66abba |
| 14-Dec-2016 |
Song Liu <songliubraving@fb.com> |
md/r5cache: simplify handling of sh->log_start in recovery We only need to update sh->log_start at the end of recovery, which is r5c_recovery_rewrite_data_only_stripes(), so it is not
md/r5cache: simplify handling of sh->log_start in recovery We only need to update sh->log_start at the end of recovery, which is r5c_recovery_rewrite_data_only_stripes(), so it is not necessary to set it before that. In this patch, log_start is removed from r5c_recovery_alloc_stripe(). After updating all sh->log_start, rewrite_data_only_stripes() also updates log->next_checkpoints to the last sh->log_start. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
28ca833e |
| 12-Dec-2016 |
JackieLiu <liuyun01@kylinos.cn> |
md/raid5-cache: removes unnecessary write-through mode judgments The write-through mode has been returned in front of the function, do not need to do it again. Signed-off-by: Ja
md/raid5-cache: removes unnecessary write-through mode judgments The write-through mode has been returned in front of the function, do not need to do it again. Signed-off-by: JackieLiu <liuyun01@kylinos.cn> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
20737738 |
| 13-Dec-2016 |
Shaohua Li <shli@fb.com> |
Merge branch 'md-next' into md-linus
|
#
36869cb9 |
| 13-Dec-2016 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge branch 'for-4.10/block' of git://git.kernel.dk/linux-block Pull block layer updates from Jens Axboe: "This is the main block pull request this series. Contrary to previous r
Merge branch 'for-4.10/block' of git://git.kernel.dk/linux-block Pull block layer updates from Jens Axboe: "This is the main block pull request this series. Contrary to previous release, I've kept the core and driver changes in the same branch. We always ended up having dependencies between the two for obvious reasons, so makes more sense to keep them together. That said, I'll probably try and keep more topical branches going forward, especially for cycles that end up being as busy as this one. The major parts of this pull request is: - Improved support for O_DIRECT on block devices, with a small private implementation instead of using the pig that is fs/direct-io.c. From Christoph. - Request completion tracking in a scalable fashion. This is utilized by two components in this pull, the new hybrid polling and the writeback queue throttling code. - Improved support for polling with O_DIRECT, adding a hybrid mode that combines pure polling with an initial sleep. From me. - Support for automatic throttling of writeback queues on the block side. This uses feedback from the device completion latencies to scale the queue on the block side up or down. From me. - Support from SMR drives in the block layer and for SD. From Hannes and Shaun. - Multi-connection support for nbd. From Josef. - Cleanup of request and bio flags, so we have a clear split between which are bio (or rq) private, and which ones are shared. From Christoph. - A set of patches from Bart, that improve how we handle queue stopping and starting in blk-mq. - Support for WRITE_ZEROES from Chaitanya. - Lightnvm updates from Javier/Matias. - Supoort for FC for the nvme-over-fabrics code. From James Smart. - A bunch of fixes from a whole slew of people, too many to name here" * 'for-4.10/block' of git://git.kernel.dk/linux-block: (182 commits) blk-stat: fix a few cases of missing batch flushing blk-flush: run the queue when inserting blk-mq flush elevator: make the rqhash helpers exported blk-mq: abstract out blk_mq_dispatch_rq_list() helper blk-mq: add blk_mq_start_stopped_hw_queue() block: improve handling of the magic discard payload blk-wbt: don't throttle discard or write zeroes nbd: use dev_err_ratelimited in io path nbd: reset the setup task for NBD_CLEAR_SOCK nvme-fabrics: Add FC LLDD loopback driver to test FC-NVME nvme-fabrics: Add target support for FC transport nvme-fabrics: Add host support for FC transport nvme-fabrics: Add FC transport LLDD api definitions nvme-fabrics: Add FC transport FC-NVME definitions nvme-fabrics: Add FC transport error codes to nvme.h Add type 0x28 NVME type code to scsi fc headers nvme-fabrics: patch target code in prep for FC transport support nvme-fabrics: set sqe.command_id in core not transports parser: add u64 number parser nvme-rdma: align to generic ib_event logging helper ...
show more ...
|
Revision tags: v4.9 |
|
#
2953079c |
| 08-Dec-2016 |
Shaohua Li <shli@fb.com> |
md: separate flags for superblock changes The mddev->flags are used for different purposes. There are a lot of places we check/change the flags without masking unrelated flags, we co
md: separate flags for superblock changes The mddev->flags are used for different purposes. There are a lot of places we check/change the flags without masking unrelated flags, we could check/change unrelated flags. These usage are most for superblock write, so spearate superblock related flags. This should make the code clearer and also fix real bugs. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
3c6edc66 |
| 07-Dec-2016 |
Song Liu <songliubraving@fb.com> |
md/r5cache: after recovery, increase journal seq by 10000 Currently, we increase journal entry seq by 10 after recovery. However, this is not sufficient in the following case. A
md/r5cache: after recovery, increase journal seq by 10000 Currently, we increase journal entry seq by 10 after recovery. However, this is not sufficient in the following case. After crash the journal looks like | seq+0 | +1 | +2 | +3 | +4 | +5 | +6 | +7 | ... | +11 | +12 | If +1 is not valid, we dropped all entries from +1 to +12; and write seq+10: | seq+0 | +10 | +2 | +3 | +4 | +5 | +6 | +7 | ... | +11 | +12 | However, if we write a big journal entry with seq+11, it will connect with some stale journal entry: | seq+0 | +10 | +11 | +12 | To reduce the risk of this issue, we increase seq by 10000 instead. Shaohua: use 10000 instead of 1000. The risk should be very unlikely. The total stripe cache size is less than 2k typically, and several stripes can fit into one meta data block. So the total inflight meta data blocks would be quite small, which means the the total sequence number used should be quite small. The 10000 sequence number increase should be far more than safe. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
5c88f403 |
| 07-Dec-2016 |
Song Liu <songliubraving@fb.com> |
md/raid5-cache: fix crc in rewrite_data_only_stripes() r5l_recovery_create_empty_meta_block() creates crc for the empty metablock. After the metablock is updated, we need clear the c
md/raid5-cache: fix crc in rewrite_data_only_stripes() r5l_recovery_create_empty_meta_block() creates crc for the empty metablock. After the metablock is updated, we need clear the checksum before recalculate it. Shaohua: moved checksum calculation out of r5l_recovery_create_empty_meta_block. We should calculate it after all fields are updated. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
d30dfeb9 |
| 07-Dec-2016 |
JackieLiu <liuyun01@kylinos.cn> |
md/raid5-cache: no recovery is required when create super-block When create the super-block information, We do not need to do this recovery stage, only need to initialize some variables.
md/raid5-cache: no recovery is required when create super-block When create the super-block information, We do not need to do this recovery stage, only need to initialize some variables. Signed-off-by: JackieLiu <liuyun01@kylinos.cn> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
3d7e7e1d |
| 04-Dec-2016 |
Zhengyuan Liu <liuzhengyuan@kylinos.cn> |
md/r5cache: do r5c_update_log_state after log recovery We should update log state after we did a log recovery, current completion may get wrong log state since log->log_start wasn't init
md/r5cache: do r5c_update_log_state after log recovery We should update log state after we did a log recovery, current completion may get wrong log state since log->log_start wasn't initalized until we called r5l_recovery_log. At log recovery stage, no lock needed as there is no race conditon. next_checkpoint field will be initialized in r5l_recovery_log too. Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
43b96748 |
| 04-Dec-2016 |
JackieLiu <liuyun01@kylinos.cn> |
md/raid5-cache: adjust the write position of the empty block if no data blocks When recovery is complete, we write an empty block and record his position first, then make the data-only s
md/raid5-cache: adjust the write position of the empty block if no data blocks When recovery is complete, we write an empty block and record his position first, then make the data-only stripes rewritten done, the location of the empty block as the last checkpoint position to write into the super block. And we should update last_checkpoint to this empty block position. ------------------------------------------------------------------ | old log | empty block | data only stripes | invalid log | ------------------------------------------------------------------ ^ ^ ^ | |- log->last_checkpoint |- log->log_start | |- log->last_cp_seq |- log->next_checkpoint |- log->seq=n |- log->seq=10+n At the same time, if there is no data-only stripes, this scene may appear, | meta1 | meta2 | meta3 | meta 1 is valid, meta 2 is invalid. meta 3 could be valid. so we should The solution is we create a new meta in meta2 with its seq == meta1's seq + 10 and let superblock points to meta2. Signed-off-by: JackieLiu <liuyun01@kylinos.cn> Reviewed-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
f687a33e |
| 30-Nov-2016 |
Song Liu <songliubraving@fb.com> |
md/r5cache: run_no_space_stripes() when R5C_LOG_CRITICAL == 0 With writeback cache, we define log space critical as free_space < 2 * reclaim_required_space So the deasse
md/r5cache: run_no_space_stripes() when R5C_LOG_CRITICAL == 0 With writeback cache, we define log space critical as free_space < 2 * reclaim_required_space So the deassert of R5C_LOG_CRITICAL could happen when 1. free_space increases 2. reclaim_required_space decreases Currently, run_no_space_stripes() is called when 1 happens, but not (always) when 2 happens. With this patch, run_no_space_stripes() is call when R5C_LOG_CRITICAL is cleared. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
1a0ec5c3 |
| 28-Nov-2016 |
JackieLiu <liuyun01@kylinos.cn> |
md/raid5-cache: do not need to set STRIPE_PREREAD_ACTIVE repeatedly R5c_make_stripe_write_out has set this flag, do not need to set again. Signed-off-by: JackieLiu <liuyun01@kylinos
md/raid5-cache: do not need to set STRIPE_PREREAD_ACTIVE repeatedly R5c_make_stripe_write_out has set this flag, do not need to set again. Signed-off-by: JackieLiu <liuyun01@kylinos.cn> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
dbd22c8d |
| 28-Nov-2016 |
JackieLiu <liuyun01@kylinos.cn> |
md/raid5-cache: remove the unnecessary next_cp_seq field from the r5l_log The next_cp_seq field is useless, remove it. Signed-off-by: JackieLiu <liuyun01@kylinos.cn> Signed-off-
md/raid5-cache: remove the unnecessary next_cp_seq field from the r5l_log The next_cp_seq field is useless, remove it. Signed-off-by: JackieLiu <liuyun01@kylinos.cn> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
bc8f167f |
| 28-Nov-2016 |
JackieLiu <liuyun01@kylinos.cn> |
md/raid5-cache: release the stripe_head at the appropriate location If we released the 'stripe_head' in r5c_recovery_flush_log, ctx->cached_list will both release the data-parity stripes
md/raid5-cache: release the stripe_head at the appropriate location If we released the 'stripe_head' in r5c_recovery_flush_log, ctx->cached_list will both release the data-parity stripes and data-only stripes, which will become empty. And we also need to use the data-only stripes in r5c_recovery_rewrite_data_only_stripes, so we should wait util rewrite data-only stripes is done before releasing them. Reviewed-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: JackieLiu <liuyun01@kylinos.cn> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
fc833c2a |
| 28-Nov-2016 |
JackieLiu <liuyun01@kylinos.cn> |
md/raid5-cache: use ring add to prevent overflow 'write_pos' must be protected with 'r5l_ring_add', or it may overflow Signed-off-by: JackieLiu <liuyun01@kylinos.cn> Reviewed-by
md/raid5-cache: use ring add to prevent overflow 'write_pos' must be protected with 'r5l_ring_add', or it may overflow Signed-off-by: JackieLiu <liuyun01@kylinos.cn> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
9b69173e |
| 28-Nov-2016 |
JackieLiu <liuyun01@kylinos.cn> |
md/raid5-cache: remove unnecessary function parameters The function parameter 'recovery_list' is not used in body, we can delete it Signed-off-by: JackieLiu <liuyun01@kylinos.cn
md/raid5-cache: remove unnecessary function parameters The function parameter 'recovery_list' is not used in body, we can delete it Signed-off-by: JackieLiu <liuyun01@kylinos.cn> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
462eb7d8 |
| 25-Nov-2016 |
Zhengyuan Liu <liuzhengyuan@kylinos.cn> |
raid5-cache: don't set STRIPE_R5C_PARTIAL_STRIPE flag while load stripe into cache r5c_recovery_load_one_stripe should not set STRIPE_R5C_PARTIAL_STRIPE flag,as the data-only stripe may
raid5-cache: don't set STRIPE_R5C_PARTIAL_STRIPE flag while load stripe into cache r5c_recovery_load_one_stripe should not set STRIPE_R5C_PARTIAL_STRIPE flag,as the data-only stripe may be STRIPE_R5C_FULL_STRIPE stripe. The state machine would release the stripe later and add it into neither r5c_cached_full_stripes list or r5c_cached_partial_stripes list and set correct flag. Reviewed-by: JackieLiu <liuyun01@kylinos.cn> Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
f7b7bee7 |
| 25-Nov-2016 |
Zhengyuan Liu <liuzhengyuan@kylinos.cn> |
raid5-cache: add another check conditon before replaying one stripe New stripe that was just allocated has no STRIPE_R5C_CACHING state too, add this check condition could avoid unnecessa
raid5-cache: add another check conditon before replaying one stripe New stripe that was just allocated has no STRIPE_R5C_CACHING state too, add this check condition could avoid unnecessary replaying for empty stripe. r5l_recovery_replay_one_stripe would reset stripe for any case, delete it to make code more clean. Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
d3014e21 |
| 24-Nov-2016 |
Dan Carpenter <dan.carpenter@oracle.com> |
md/r5cache: enable IRQs on error path We need to re-enable the IRQs here before returning. Fixes: a39f7afde358 ("md/r5cache: write-out phase and reclaim support") Signed-off-by:
md/r5cache: enable IRQs on error path We need to re-enable the IRQs here before returning. Fixes: a39f7afde358 ("md/r5cache: write-out phase and reclaim support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
d7bd398e |
| 24-Nov-2016 |
Song Liu <songliubraving@fb.com> |
md/r5cache: handle alloc_page failure RMW of r5c write back cache uses an extra page to store old data for prexor. handle_stripe_dirtying() allocates this page by calling alloc_page(
md/r5cache: handle alloc_page failure RMW of r5c write back cache uses an extra page to store old data for prexor. handle_stripe_dirtying() allocates this page by calling alloc_page(). However, alloc_page() may fail. To handle alloc_page() failures, this patch adds an extra page to disk_info. When alloc_page fails, handle_stripe() trys to use these pages. When these pages are used by other stripe (R5C_EXTRA_PAGE_IN_USE), the stripe is added to delayed_list. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
#
ce1ccd07 |
| 21-Nov-2016 |
Shaohua Li <shli@fb.com> |
raid5-cache: suspend reclaim thread instead of shutdown There is mechanism to suspend a kernel thread. Use it instead of playing create/destroy game. Signed-off-by: Shaohua Li <
raid5-cache: suspend reclaim thread instead of shutdown There is mechanism to suspend a kernel thread. Use it instead of playing create/destroy game. Signed-off-by: Shaohua Li <shli@fb.com> Reviewed-by: NeilBrown <neilb@suse.de> Cc: Song Liu <songliubraving@fb.com>
show more ...
|
#
3a83f467 |
| 22-Nov-2016 |
Ming Lei <tom.leiming@gmail.com> |
block: bio: pass bvec table to bio_init() Some drivers often use external bvec table, so introduce this helper for this case. It is always safe to access the bio->bi_io_vec in this w
block: bio: pass bvec table to bio_init() Some drivers often use external bvec table, so introduce this helper for this case. It is always safe to access the bio->bi_io_vec in this way for this case. After converting to this usage, it will becomes a bit easier to evaluate the remaining direct access to bio->bi_io_vec, so it can help to prepare for the following multipage bvec support. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Fixed up the new O_DIRECT cases. Signed-off-by: Jens Axboe <axboe@fb.com>
show more ...
|
Revision tags: openbmc-4.4-20161121-1 |
|
#
3bddb7f8 |
| 18-Nov-2016 |
Song Liu <songliubraving@fb.com> |
md/r5cache: handle FLUSH and FUA With raid5 cache, we committing data from journal device. When there is flush request, we need to flush journal device's cache. This was not needed i
md/r5cache: handle FLUSH and FUA With raid5 cache, we committing data from journal device. When there is flush request, we need to flush journal device's cache. This was not needed in raid5 journal, because we will flush the journal before committing data to raid disks. This is similar to FUA, except that we also need flush journal for FUA. Otherwise, corruptions in earlier meta data will stop recovery from reaching FUA data. slightly changed the code by Shaohua Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|
Revision tags: v4.4.33 |
|
#
5aabf7c4 |
| 17-Nov-2016 |
Song Liu <songliubraving@fb.com> |
md/r5cache: r5cache recovery: part 2 1. In previous patch, we: - add new data to r5l_recovery_ctx - add new functions to recovery write-back cache The new function
md/r5cache: r5cache recovery: part 2 1. In previous patch, we: - add new data to r5l_recovery_ctx - add new functions to recovery write-back cache The new functions are not used in this patch, so this patch does not change the behavior of recovery. 2. In this patchpatch, we: - modify main recovery procedure r5l_recovery_log() to call new functions - remove old functions Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
show more ...
|