Revision tags: v6.6.25, v6.6.24, v6.6.23 |
|
#
dc9702ac |
| 05-Mar-2024 |
Yu Kuai <yukuai3@huawei.com> |
dm-raid: fix lockdep waring in "pers->hot_add_disk"
[ Upstream commit 95009ae904b1e9dca8db6f649f2d7c18a6e42c75 ]
The lockdep assert is added by commit a448af25becf ("md/raid10: remove rcu protectio
dm-raid: fix lockdep waring in "pers->hot_add_disk"
[ Upstream commit 95009ae904b1e9dca8db6f649f2d7c18a6e42c75 ]
The lockdep assert is added by commit a448af25becf ("md/raid10: remove rcu protection to access rdev from conf") in print_conf(). And I didn't notice that dm-raid is calling "pers->hot_add_disk" without holding 'reconfig_mutex'.
"pers->hot_add_disk" read and write many fields that is protected by 'reconfig_mutex', and raid_resume() already grab the lock in other contex. Hence fix this problem by protecting "pers->host_add_disk" with the lock.
Fixes: 9092c02d9435 ("DM RAID: Add ability to restore transiently failed devices on resume") Fixes: a448af25becf ("md/raid10: remove rcu protection to access rdev from conf") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-10-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.25, v6.6.24, v6.6.23 |
|
#
dc9702ac |
| 05-Mar-2024 |
Yu Kuai <yukuai3@huawei.com> |
dm-raid: fix lockdep waring in "pers->hot_add_disk"
[ Upstream commit 95009ae904b1e9dca8db6f649f2d7c18a6e42c75 ]
The lockdep assert is added by commit a448af25becf ("md/raid10: remove rcu protectio
dm-raid: fix lockdep waring in "pers->hot_add_disk"
[ Upstream commit 95009ae904b1e9dca8db6f649f2d7c18a6e42c75 ]
The lockdep assert is added by commit a448af25becf ("md/raid10: remove rcu protection to access rdev from conf") in print_conf(). And I didn't notice that dm-raid is calling "pers->hot_add_disk" without holding 'reconfig_mutex'.
"pers->hot_add_disk" read and write many fields that is protected by 'reconfig_mutex', and raid_resume() already grab the lock in other contex. Hence fix this problem by protecting "pers->host_add_disk" with the lock.
Fixes: 9092c02d9435 ("DM RAID: Add ability to restore transiently failed devices on resume") Fixes: a448af25becf ("md/raid10: remove rcu protection to access rdev from conf") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-10-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
9f926ba2 |
| 11-Mar-2024 |
Ming Lei <ming.lei@redhat.com> |
dm raid: fix false positive for requeue needed during reshape
[ Upstream commit b25b8f4b8ecef0f48c05f0c3572daeabefe16526 ]
An empty flush doesn't have a payload, so it should never be looked at whe
dm raid: fix false positive for requeue needed during reshape
[ Upstream commit b25b8f4b8ecef0f48c05f0c3572daeabefe16526 ]
An empty flush doesn't have a payload, so it should never be looked at when considering to possibly requeue a bio for the case when a reshape is in progress.
Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Reported-by: Patrick Plenefisch <simonpatp@gmail.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31 |
|
#
a865b96c |
| 29-May-2023 |
Yu Kuai <yukuai3@huawei.com> |
Revert "md: unlock mddev before reap sync_thread in action_store"
This reverts commit 9dfbdafda3b34e262e43e786077bab8e476a89d1.
Because it will introduce a defect that sync_thread can be running wh
Revert "md: unlock mddev before reap sync_thread in action_store"
This reverts commit 9dfbdafda3b34e262e43e786077bab8e476a89d1.
Because it will introduce a defect that sync_thread can be running while MD_RECOVERY_RUNNING is cleared, which will cause some unexpected problems, for example:
list_add corruption. prev->next should be next (ffff0001ac1daba0), but was ffff0000ce1a02a0. (prev=ffff0000ce1a02a0). Call trace: __list_add_valid+0xfc/0x140 insert_work+0x78/0x1a0 __queue_work+0x500/0xcf4 queue_work_on+0xe8/0x12c md_check_recovery+0xa34/0xf30 raid10d+0xb8/0x900 [raid10] md_thread+0x16c/0x2cc kthread+0x1a4/0x1ec ret_from_fork+0x10/0x18
This is because work is requeued while it's still inside workqueue:
t1: t2: action_store mddev_lock if (mddev->sync_thread) mddev_unlock md_unregister_thread // first sync_thread is done md_check_recovery mddev_try_lock /* * once MD_RECOVERY_DONE is set, new sync_thread * can start. */ set_bit(MD_RECOVERY_RUNNING, &mddev->recovery) INIT_WORK(&mddev->del_work, md_start_sync) queue_work(md_misc_wq, &mddev->del_work) test_and_set_bit(WORK_STRUCT_PENDING_BIT, ...) // set pending bit insert_work list_add_tail mddev_unlock mddev_lock_nointr md_reap_sync_thread // MD_RECOVERY_RUNNING is cleared mddev_unlock
t3:
// before queued work started from t2 md_check_recovery // MD_RECOVERY_RUNNING is not set, a new sync_thread can be started INIT_WORK(&mddev->del_work, md_start_sync) work->data = 0 // work pending bit is cleared queue_work(md_misc_wq, &mddev->del_work) insert_work list_add_tail // list is corrupted
The above commit is reverted to fix the problem, the deadlock this commit tries to fix will be fixed in following patches.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230529132037.2124527-2-yukuai1@huaweicloud.com
show more ...
|
#
7d5fff89 |
| 08-Jul-2023 |
Yu Kuai <yukuai3@huawei.com> |
dm raid: protect md_stop() with 'reconfig_mutex'
__md_stop_writes() and __md_stop() will modify many fields that are protected by 'reconfig_mutex', and all the callers will grab 'reconfig_mutex' exc
dm raid: protect md_stop() with 'reconfig_mutex'
__md_stop_writes() and __md_stop() will modify many fields that are protected by 'reconfig_mutex', and all the callers will grab 'reconfig_mutex' except for md_stop().
Also, update md_stop() to make certain 'reconfig_mutex' is held using lockdep_assert_held().
Fixes: 9d09e663d550 ("dm: raid456 basic support") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
show more ...
|
#
e74c874e |
| 08-Jul-2023 |
Yu Kuai <yukuai3@huawei.com> |
dm raid: clean up four equivalent goto tags in raid_ctr()
There are four equivalent goto tags in raid_ctr(), clean them up to use just one.
There is no functional change and this is preparation to
dm raid: clean up four equivalent goto tags in raid_ctr()
There are four equivalent goto tags in raid_ctr(), clean them up to use just one.
There is no functional change and this is preparation to fix raid_ctr()'s unprotected md_stop().
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
show more ...
|
#
bae30287 |
| 08-Jul-2023 |
Yu Kuai <yukuai3@huawei.com> |
dm raid: fix missing reconfig_mutex unlock in raid_ctr() error paths
In the error paths 'bad_stripe_cache' and 'bad_check_reshape', 'reconfig_mutex' is still held after raid_ctr() returns.
Fixes: 9
dm raid: fix missing reconfig_mutex unlock in raid_ctr() error paths
In the error paths 'bad_stripe_cache' and 'bad_check_reshape', 'reconfig_mutex' is still held after raid_ctr() returns.
Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
show more ...
|
Revision tags: v6.1.30 |
|
#
955a257d |
| 22-May-2023 |
Yu Kuai <yukuai3@huawei.com> |
dm-raid: remove useless checking in raid_message()
md_wakeup_thread() handle the case that pass in md_thread is NULL, there is no need to check this.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Sig
dm-raid: remove useless checking in raid_message()
md_wakeup_thread() handle the case that pass in md_thread is NULL, there is no need to check this.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230523021017.3048783-3-yukuai1@huaweicloud.com
show more ...
|
Revision tags: v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24 |
|
#
3664ff82 |
| 09-Apr-2023 |
Yangtao Li <frank.li@vivo.com> |
dm: add helper macro for simple DM target module init and exit
Eliminate duplicate boilerplate code for simple modules that contain a single DM target driver without any additional setup code.
Add
dm: add helper macro for simple DM target module init and exit
Eliminate duplicate boilerplate code for simple modules that contain a single DM target driver without any additional setup code.
Add a new module_dm() macro, which replaces the module_init() and module_exit() with template functions that call dm_register_target() and dm_unregister_target() respectively.
Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
show more ...
|
Revision tags: v6.1.23 |
|
#
306fbc2e |
| 30-Mar-2023 |
Tom Rix <trix@redhat.com> |
dm raid: remove unused d variable
clang with W=1 reports drivers/md/dm-raid.c:2212:15: error: variable 'd' set but not used [-Werror,-Wunused-but-set-variable] unsigned int d;
dm raid: remove unused d variable
clang with W=1 reports drivers/md/dm-raid.c:2212:15: error: variable 'd' set but not used [-Werror,-Wunused-but-set-variable] unsigned int d; ^ This variable is not used so remove it.
Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
show more ...
|
Revision tags: v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11 |
|
#
23fda2ef |
| 07-Feb-2023 |
Heinz Mauelshagen <heinzm@redhat.com> |
dm: fix suspect indent whitespace
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
b30f1607 |
| 07-Feb-2023 |
Heinz Mauelshagen <heinzm@redhat.com> |
dm: add missing blank line after declarations/fix those
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
Revision tags: v6.1.10, v6.1.9 |
|
#
a4a82ce3 |
| 26-Jan-2023 |
Heinz Mauelshagen <heinzm@redhat.com> |
dm: correct block comments format.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
255e2646 |
| 25-Jan-2023 |
Heinz Mauelshagen <heinzm@redhat.com> |
dm: address indent/space issues
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
2f06cd12 |
| 30-Jan-2023 |
Heinz Mauelshagen <heinzm@redhat.com> |
dm: avoid initializing static variables
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
86a3238c |
| 25-Jan-2023 |
Heinz Mauelshagen <heinzm@redhat.com> |
dm: change "unsigned" to "unsigned int"
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
3bd94003 |
| 25-Jan-2023 |
Heinz Mauelshagen <heinzm@redhat.com> |
dm: add missing SPDX-License-Indentifiers
'GPL-2.0-only' is used instead of 'GPL-2.0' because SPDX has deprecated its use.
Suggested-by: John Wiele <jwiele@redhat.com> Signed-off-by: Heinz Mauelsha
dm: add missing SPDX-License-Indentifiers
'GPL-2.0-only' is used instead of 'GPL-2.0' because SPDX has deprecated its use.
Suggested-by: John Wiele <jwiele@redhat.com> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
show more ...
|
#
efdd3c33 |
| 05-Feb-2023 |
Yu Zhe <yuzhe@nfschina.com> |
dm raid: fix some spelling mistakes in comments
Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
Revision tags: v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65 |
|
#
96fccdce |
| 04-Sep-2022 |
Jiangshan Yi <yijiangshan@kylinos.cn> |
dm raid: fix typo in analyse_superblocks code comment
Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
Revision tags: v5.15.64 |
|
#
cea44663 |
| 30-Aug-2022 |
Jilin Yuan <yuanjilin@cdjrlc.com> |
dm raid: delete the redundant word 'that' in comment
Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
Revision tags: v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49 |
|
#
9dfbdafd |
| 20-Jun-2022 |
Guoqing Jiang <guoqing.jiang@linux.dev> |
md: unlock mddev before reap sync_thread in action_store
Since the bug which commit 8b48ec23cc51a ("md: don't unregister sync_thread with reconfig_mutex held") fixed is related with action_store pat
md: unlock mddev before reap sync_thread in action_store
Since the bug which commit 8b48ec23cc51a ("md: don't unregister sync_thread with reconfig_mutex held") fixed is related with action_store path, other callers which reap sync_thread didn't need to be changed.
Let's pull md_unregister_thread from md_reap_sync_thread, then fix previous bug with belows.
1. unlock mddev before md_reap_sync_thread in action_store. 2. save reshape_position before unlock, then restore it to ensure position not changed accidentally by others.
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
9dd1cd32 |
| 20-Jul-2022 |
Mike Snitzer <snitzer@kernel.org> |
dm: fix dm-raid crash if md_handle_request() splits bio
Commit ca522482e3eaf ("dm: pass NULL bdev to bio_alloc_clone") introduced the optimization to _not_ perform bio_associate_blkg()'s relatively
dm: fix dm-raid crash if md_handle_request() splits bio
Commit ca522482e3eaf ("dm: pass NULL bdev to bio_alloc_clone") introduced the optimization to _not_ perform bio_associate_blkg()'s relatively costly work when DM core clones its bio. But in doing so it exposed the possibility for DM's cloned bio to alter DM target behavior (e.g. crash) if a target were to issue IO without first calling bio_set_dev().
The DM raid target can trigger an MD crash due to its need to split the DM bio that is passed to md_handle_request(). The split will recurse to submit_bio_noacct() using a bio with an uninitialized ->bi_blkg. This NULL bio->bi_blkg causes blk_throtl_bio() to dereference a NULL blkg_to_tg(bio->bi_blkg).
Fix this in DM core by adding a new 'needs_bio_set_dev' target flag that will make alloc_tio() call bio_set_dev() on behalf of the target. dm-raid is the only target that requires this flag. bio_set_dev() initializes the DM cloned bio's ->bi_blkg, using bio_associate_blkg, before passing the bio to md_handle_request().
Long-term fix would be to audit and refactor MD code to rely on DM to split its bio, using dm_accept_partial_bio(), but there are MD raid personalities (e.g. raid1 and raid10) whose implementation are tightly coupled to handling the bio splitting inline.
Fixes: ca522482e3eaf ("dm: pass NULL bdev to bio_alloc_clone") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@kernel.org>
show more ...
|
#
7dad24db |
| 24-Jul-2022 |
Mikulas Patocka <mpatocka@redhat.com> |
dm raid: fix address sanitizer warning in raid_resume
There is a KASAN warning in raid_resume when running the lvm test lvconvert-raid.sh. The reason for the warning is that mddev->raid_disks is gre
dm raid: fix address sanitizer warning in raid_resume
There is a KASAN warning in raid_resume when running the lvm test lvconvert-raid.sh. The reason for the warning is that mddev->raid_disks is greater than rs->raid_disks, so the loop touches one entry beyond the allocated length.
Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
show more ...
|
#
1fbeea21 |
| 24-Jul-2022 |
Mikulas Patocka <mpatocka@redhat.com> |
dm raid: fix address sanitizer warning in raid_status
There is this warning when using a kernel with the address sanitizer and running this testsuite: https://gitlab.com/cki-project/kernel-tests/-/t
dm raid: fix address sanitizer warning in raid_status
There is this warning when using a kernel with the address sanitizer and running this testsuite: https://gitlab.com/cki-project/kernel-tests/-/tree/main/storage/swraid/scsi_raid
================================================================== BUG: KASAN: slab-out-of-bounds in raid_status+0x1747/0x2820 [dm_raid] Read of size 4 at addr ffff888079d2c7e8 by task lvcreate/13319 CPU: 0 PID: 13319 Comm: lvcreate Not tainted 5.18.0-0.rc3.<snip> #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0x6a/0x9c print_address_description.constprop.0+0x1f/0x1e0 print_report.cold+0x55/0x244 kasan_report+0xc9/0x100 raid_status+0x1747/0x2820 [dm_raid] dm_ima_measure_on_table_load+0x4b8/0xca0 [dm_mod] table_load+0x35c/0x630 [dm_mod] ctl_ioctl+0x411/0x630 [dm_mod] dm_ctl_ioctl+0xa/0x10 [dm_mod] __x64_sys_ioctl+0x12a/0x1a0 do_syscall_64+0x5b/0x80
The warning is caused by reading conf->max_nr_stripes in raid_status. The code in raid_status reads mddev->private, casts it to struct r5conf and reads the entry max_nr_stripes.
However, if we have different raid type than 4/5/6, mddev->private doesn't point to struct r5conf; it may point to struct r0conf, struct r1conf, struct r10conf or struct mpconf. If we cast a pointer to one of these structs to struct r5conf, we will be reading invalid memory and KASAN warns about it.
Fix this bug by reading struct r5conf only if raid type is 4, 5 or 6.
Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
show more ...
|
#
4ce4c73f |
| 14-Jul-2022 |
Bart Van Assche <bvanassche@acm.org> |
md/core: Combine two sync_page_io() arguments
Improve uniformity in the kernel of handling of request operation and flags by passing these as a single argument.
Cc: Song Liu <song@kernel.org> Signe
md/core: Combine two sync_page_io() arguments
Improve uniformity in the kernel of handling of request operation and flags by passing these as a single argument.
Cc: Song Liu <song@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220714180729.1065367-32-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|