#
a0e286b6 |
| 30-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
loop: remove lo_refcount and avoid lo_mutex in ->open / ->release
lo_refcount counts how many openers a loop device has, but that count is already provided by the block layer in the bd_openers field
loop: remove lo_refcount and avoid lo_mutex in ->open / ->release
lo_refcount counts how many openers a loop device has, but that count is already provided by the block layer in the bd_openers field of the whole-disk block_device. Remove lo_refcount and allow opens to succeed even on devices beeing deleted - now that ->free_disk is implemented we can handle that race gracefull and all I/O on it will just fail. Similarly there is a small race window now where loop_control_remove does not synchronize the delete vs the remove due do bd_openers not being under lo_mutex protection, but we can handle that just as gracefully.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
158eaeba |
| 30-Mar-2022 |
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> |
loop: avoid loop_validate_mutex/lo_mutex in ->release
Since ->release is called with disk->open_mutex held, and __loop_clr_fd() from lo_release() is called via ->release when disk_openers() == 0, w
loop: avoid loop_validate_mutex/lo_mutex in ->release
Since ->release is called with disk->open_mutex held, and __loop_clr_fd() from lo_release() is called via ->release when disk_openers() == 0, we are guaranteed that "struct file" which will be passed to loop_validate_file() via fget() cannot be the loop device __loop_clr_fd(lo, true) will clear. Thus, there is no need to hold loop_validate_mutex from __loop_clr_fd() if release == true.
When I made commit 3ce6e1f662a91097 ("loop: reintroduce global lock for safe loop_validate_file() traversal"), I wrote "It is acceptable for loop_validate_file() to succeed, for actual clear operation has not started yet.". But now I came to feel why it is acceptable to succeed.
It seems that the loop driver was added in Linux 1.3.68, and
if (lo->lo_refcnt > 1) return -EBUSY;
check in loop_clr_fd() was there from the beginning. The intent of this check was unclear. But now I think that current
disk_openers(lo->lo_disk) > 1
form is there for three reasons.
(1) Avoid I/O errors when some process which opens and reads from this loop device in response to uevent notification (e.g. systemd-udevd), as described in commit a1ecac3b0656a682 ("loop: Make explicit loop device destruction lazy"). This opener is short-lived because it is likely that the file descriptor used by that process is closed soon.
(2) Avoid I/O errors caused by underlying layer of stacked loop devices (i.e. ioctl(some_loop_fd, LOOP_SET_FD, other_loop_fd)) being suddenly disappeared. This opener is long-lived because this reference is associated with not a file descriptor but lo->lo_backing_file.
(3) Avoid I/O errors caused by underlying layer of mounted loop device (i.e. mount(some_loop_device, some_mount_point)) being suddenly disappeared. This opener is long-lived because this reference is associated with not a file descriptor but mount.
While race in (1) might be acceptable, (2) and (3) should be checked racelessly. That is, make sure that __loop_clr_fd() will not run if loop_validate_file() succeeds, by doing refcount check with global lock held when explicit loop device destruction is requested.
As a result of no longer waiting for lo->lo_mutex after setting Lo_rundown, we can remove pointless BUG_ON(lo->lo_state != Lo_rundown) check.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220330052917.2566582-14-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
498ef5c7 |
| 30-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
loop: suppress uevents while reconfiguring the device
Currently, udev change event is generated for a loop device before the device is ready for IO. Due to serialization on lo->lo_mutex in lo_open()
loop: suppress uevents while reconfiguring the device
Currently, udev change event is generated for a loop device before the device is ready for IO. Due to serialization on lo->lo_mutex in lo_open() this does not matter because anybody is able to open the device and do IO only after the configuration is finished. However this synchronization in lo_open() is going away so make sure userspace reacting to the change event will see the new device state by generating the event only when the device is setup.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-13-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
d2c7f56f |
| 30-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
loop: implement ->free_disk
Ensure that the lo_device which is stored in the gendisk private data is valid until the gendisk is freed. Currently the loop driver uses a lot of effort to make sure a
loop: implement ->free_disk
Ensure that the lo_device which is stored in the gendisk private data is valid until the gendisk is freed. Currently the loop driver uses a lot of effort to make sure a device is not freed when it is still in use, but to to fix a potential deadlock this will be relaxed a bit soon.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
1fe0b1ac |
| 30-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
loop: only freeze the queue in __loop_clr_fd when needed
->release is only called after all outstanding I/O has completed, so only freeze the queue when clearing the backing file of a live loop devi
loop: only freeze the queue in __loop_clr_fd when needed
->release is only called after all outstanding I/O has completed, so only freeze the queue when clearing the backing file of a live loop device.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Tested-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220330052917.2566582-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
46dc9674 |
| 30-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
loop: don't freeze the queue in lo_release
By the time the final ->release is called there can't be outstanding I/O. For non-final ->release there is no need for driver action at all.
Thus remove t
loop: don't freeze the queue in lo_release
By the time the final ->release is called there can't be outstanding I/O. For non-final ->release there is no need for driver action at all.
Thus remove the useless queue freeze.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Tested-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220330052917.2566582-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
98ded54a |
| 30-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
loop: remove the racy bd_inode->i_mapping->nrpages asserts
Nothing prevents a file system or userspace opener of the block device from redirtying the page right afte sync_blockdev returned. Fortuna
loop: remove the racy bd_inode->i_mapping->nrpages asserts
Nothing prevents a file system or userspace opener of the block device from redirtying the page right afte sync_blockdev returned. Fortunately data in the page cache during a block device change is mostly harmless anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Tested-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220330052917.2566582-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
b15ed546 |
| 30-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
loop: initialize the worker tracking fields once
There is no need to reinitialize idle_worker_list, worker_tree and timer every time a loop device is configured. Just initialize them once at alloca
loop: initialize the worker tracking fields once
There is no need to reinitialize idle_worker_list, worker_tree and timer every time a loop device is configured. Just initialize them once at allocation time.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220330052917.2566582-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
2cf429b5 |
| 30-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
loop: de-duplicate the idle worker freeing code
Use a common helper for both timer based and uncoditional freeing of idle workers.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kar
loop: de-duplicate the idle worker freeing code
Use a common helper for both timer based and uncoditional freeing of idle workers.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Tested-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220330052917.2566582-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
7b47ef52 |
| 14-Apr-2022 |
Christoph Hellwig <hch@lst.de> |
block: add a bdev_discard_granularity helper
Abstract away implementation details from file systems by providing a block_device based helper to retrieve the discard granularity.
Signed-off-by: Chri
block: add a bdev_discard_granularity helper
Abstract away implementation details from file systems by providing a block_device based helper to retrieve the discard granularity.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Acked-by: David Sterba <dsterba@suse.com> [btrfs] Link: https://lore.kernel.org/r/20220415045258.199825-26-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
70200574 |
| 14-Apr-2022 |
Christoph Hellwig <hch@lst.de> |
block: remove QUEUE_FLAG_DISCARD
Just use a non-zero max_discard_sectors as an indicator for discard support, similar to what is done for write zeroes.
The only places where needs special attention
block: remove QUEUE_FLAG_DISCARD
Just use a non-zero max_discard_sectors as an indicator for discard support, similar to what is done for write zeroes.
The only places where needs special attention is the RAID5 driver, which must clear discard support for security reasons by default, even if the default stacking rules would allow for it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
10f0d2a5 |
| 14-Apr-2022 |
Christoph Hellwig <hch@lst.de> |
block: add a bdev_nonrot helper
Add a helper to check the nonrot flag based on the block_device instead of having to poke into the block layer internal request_queue.
Signed-off-by: Christoph Hellw
block: add a bdev_nonrot helper
Add a helper to check the nonrot flag based on the block_device instead of having to poke into the block layer internal request_queue.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
f941c51e |
| 29-Mar-2022 |
Carlos Llamas <cmllamas@google.com> |
loop: fix ioctl calls using compat_loop_info
Support for cryptoloop was deleted in commit 47e9624616c8 ("block: remove support for cryptoloop and the xor transfer"), making the usage of loop_info->l
loop: fix ioctl calls using compat_loop_info
Support for cryptoloop was deleted in commit 47e9624616c8 ("block: remove support for cryptoloop and the xor transfer"), making the usage of loop_info->lo_encrypt_type obsolete. However, this member was also removed from the compat_loop_info definition and this breaks userspace ioctl calls for 32-bit binaries and CONFIG_COMPAT=y.
This patch restores the compat_loop_info->lo_encrypt_type member and marks it obsolete as well as in the uapi header definitions.
Fixes: 47e9624616c8 ("block: remove support for cryptoloop and the xor transfer") Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220329201815.1347500-1-cmllamas@google.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29 |
|
#
ce8d7861 |
| 15-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
nvme: warn about shared namespaces without CONFIG_NVME_MULTIPATH
Start warning about exposing a namespace as multiple block devices, and set a fixed deprecation release.
Signed-off-by: Christoph He
nvme: warn about shared namespaces without CONFIG_NVME_MULTIPATH
Start warning about exposing a namespace as multiple block devices, and set a fixed deprecation release.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org>
show more ...
|
Revision tags: v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24 |
|
#
ef44c508 |
| 15-Feb-2022 |
Chaitanya Kulkarni <kch@nvidia.com> |
loop: allow user to set the queue depth
Instead of hardcoding queue depth allow user to set the hw queue depth using module parameter. Set default value to 128 to retain the existing behavior.
Sign
loop: allow user to set the queue depth
Instead of hardcoding queue depth allow user to set the hw queue depth using module parameter. Set default value to 128 to retain the existing behavior.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20220215213310.7264-5-kch@nvidia.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
9c64e38c |
| 15-Feb-2022 |
Chaitanya Kulkarni <kch@nvidia.com> |
loop: remove extra variable in lo_req_flush
The local variable file is used to pass it to the vfs_fsync(). We can get away with using lo->lo_backing_file instead of storing in a local variable which
loop: remove extra variable in lo_req_flush
The local variable file is used to pass it to the vfs_fsync(). We can get away with using lo->lo_backing_file instead of storing in a local variable which is not used anywhere else.
No functional change in this patch.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20220215213310.7264-4-kch@nvidia.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
0aab29b8 |
| 15-Feb-2022 |
Chaitanya Kulkarni <kch@nvidia.com> |
loop: remove extra variable in lo_fallocate()
The local variable q is used to pass it to the blk_queue_discard(). We can get away with using lo->lo_queue instead of storing in a local variable which
loop: remove extra variable in lo_fallocate()
The local variable q is used to pass it to the blk_queue_discard(). We can get away with using lo->lo_queue instead of storing in a local variable which is not used anywhere else.
No functional change in this patch.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20220215213310.7264-3-kch@nvidia.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
b27824d3 |
| 15-Feb-2022 |
Chaitanya Kulkarni <kch@nvidia.com> |
loop: use sysfs_emit() in the sysfs xxx show()
sprintf does not know the PAGE_SIZE maximum of the temporary buffer used for outputting sysfs content and it's possible to overrun the PAGE_SIZE buffer
loop: use sysfs_emit() in the sysfs xxx show()
sprintf does not know the PAGE_SIZE maximum of the temporary buffer used for outputting sysfs content and it's possible to overrun the PAGE_SIZE buffer length.
Use a generic sysfs_emit function that knows the size of the temporary buffer and ensures that no overrun is done for offset attribute in loop_attr_[offset|sizelimit|autoclear|partscan|dio]_show() callbacks.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20220215213310.7264-2-kch@nvidia.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.23, v5.15.22 |
|
#
d9a74051 |
| 08-Feb-2022 |
Colin Ian King <colin.i.king@gmail.com> |
loop: clean up grammar in warning message
The phrase "has still" should be "still has" to clean up the grammar.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/
loop: clean up grammar in warning message
The phrase "has still" should be "still has" to clean up the grammar.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20220208114656.61629-1-colin.i.king@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17 |
|
#
06582bc8 |
| 25-Jan-2022 |
Ming Lei <ming.lei@redhat.com> |
block: loop:use kstatfs.f_bsize of backing file to set discard granularity
If backing file's filesystem has implemented ->fallocate(), we think the loop device can support discard, then pass sb->s_b
block: loop:use kstatfs.f_bsize of backing file to set discard granularity
If backing file's filesystem has implemented ->fallocate(), we think the loop device can support discard, then pass sb->s_blocksize as discard_granularity. However, some underlying FS, such as overlayfs, doesn't set sb->s_blocksize, and causes discard_granularity to be set as zero, then the warning in __blkdev_issue_discard() is triggered.
Christoph suggested to pass kstatfs.f_bsize as discard granularity, and this way is fine because kstatfs.f_bsize means 'Optimal transfer block size', which still matches with definition of discard granularity.
So fix the issue by setting discard_granularity as kstatfs.f_bsize if it is available, otherwise claims discard isn't supported.
Cc: Christoph Hellwig <hch@lst.de> Cc: Vivek Goyal <vgoyal@redhat.com> Reported-by: Pei Zhang <pezhang@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220126035830.296465-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
bf23747e |
| 11-Feb-2022 |
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> |
loop: revert "make autoclear operation asynchronous"
The kernel test robot is reporting that xfstest which does
umount ext2 on xfs umount xfs
sequence started failing, for commit 322c4293ecc58
loop: revert "make autoclear operation asynchronous"
The kernel test robot is reporting that xfstest which does
umount ext2 on xfs umount xfs
sequence started failing, for commit 322c4293ecc58110 ("loop: make autoclear operation asynchronous") removed a guarantee that fput() of backing file is processed before lo_release() from close() returns to user mode.
And syzbot is reporting that deferring destroy_workqueue() from __loop_clr_fd() to a WQ context did not help [1]. Revert that commit.
Link: https://syzkaller.appspot.com/bug?extid=831661966588c802aae9 [1] Reported-by: kernel test robot <oliver.sang@intel.com> Acked-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Reported-by: syzbot <syzbot+831661966588c802aae9@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: https://lore.kernel.org/r/20220211071554.3424-1-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.4.173, v5.15.16, v5.15.15 |
|
#
413ec805 |
| 12-Jan-2022 |
Colin Ian King <colin.i.king@gmail.com> |
loop: remove redundant initialization of pointer node
The pointer node is being initialized with a value that is never read, it is being re-assigned the same value a little futher on. Remove the red
loop: remove redundant initialization of pointer node
The pointer node is being initialized with a value that is never read, it is being re-assigned the same value a little futher on. Remove the redundant initialization. Cleans up clang scan warning:
drivers/block/loop.c:823:19: warning: Value stored to 'node' during its initialization is never read [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20220113001432.1331871-1-colin.i.king@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.16, v5.15.10, v5.15.9, v5.15.8 |
|
#
322c4293 |
| 13-Dec-2021 |
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> |
loop: make autoclear operation asynchronous
syzbot is reporting circular locking problem at __loop_clr_fd() [1], for commit 87579e9b7d8dc36e ("loop: use worker per cgroup instead of kworker") is cal
loop: make autoclear operation asynchronous
syzbot is reporting circular locking problem at __loop_clr_fd() [1], for commit 87579e9b7d8dc36e ("loop: use worker per cgroup instead of kworker") is calling destroy_workqueue() with disk->open_mutex held.
This circular dependency cannot be broken unless we call __loop_clr_fd() without holding disk->open_mutex. Therefore, defer __loop_clr_fd() from lo_release() to a WQ context.
Link: https://syzkaller.appspot.com/bug?extid=643e4ce4b6ad1347d372 [1] Reported-by: syzbot <syzbot+643e4ce4b6ad1347d372@syzkaller.appspotmail.com> Suggested-by: Christoph Hellwig <hch@infradead.org> Cc: Jan Kara <jack@suse.cz> Tested-by: syzbot+643e4ce4b6ad1347d372@syzkaller.appspotmail.com Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/1ed7df28-ebd6-71fb-70e5-1c2972e05ddb@i-love.sakura.ne.jp Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.7, v5.15.6 |
|
#
e3f9387a |
| 29-Nov-2021 |
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> |
loop: Use pr_warn_once() for loop_control_remove() warning
kernel test robot reported that RCU stall via printk() flooding is possible [1] when stress testing.
Link: https://lkml.kernel.org/r/20211
loop: Use pr_warn_once() for loop_control_remove() warning
kernel test robot reported that RCU stall via printk() flooding is possible [1] when stress testing.
Link: https://lkml.kernel.org/r/20211129073709.GA18483@xsang-OptiPlex-9020 [1] Reported-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.15.5 |
|
#
6050fa4c |
| 24-Nov-2021 |
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> |
loop: don't hold lo_mutex during __loop_clr_fd()
syzbot is reporting circular locking problem at __loop_clr_fd() [1], for commit 87579e9b7d8dc36e ("loop: use worker per cgroup instead of kworker") i
loop: don't hold lo_mutex during __loop_clr_fd()
syzbot is reporting circular locking problem at __loop_clr_fd() [1], for commit 87579e9b7d8dc36e ("loop: use worker per cgroup instead of kworker") is calling destroy_workqueue() with lo->lo_mutex held.
Since all functions where lo->lo_state matters are already checking lo->lo_state with lo->lo_mutex held (in order to avoid racing with e.g. ioctl(LOOP_CTL_REMOVE)), and __loop_clr_fd() can be called from either ioctl(LOOP_CLR_FD) xor close(), lo->lo_state == Lo_rundown is considered as an exclusive lock for __loop_clr_fd(). Therefore, hold lo->lo_mutex inside __loop_clr_fd() only when asserting/updating lo->lo_state.
Since ioctl(LOOP_CLR_FD) depends on lo->lo_state == Lo_bound, a valid lo->lo_backing_file must have been assigned by ioctl(LOOP_SET_FD) or ioctl(LOOP_CONFIGURE). Thus, we can remove lo->lo_backing_file test, and convert __loop_clr_fd() into a void function.
Link: https://syzkaller.appspot.com/bug?extid=63614029dfb79abd4383 [1] Reported-by: syzbot <syzbot+63614029dfb79abd4383@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/8ebe3b2e-8975-7f26-0620-7144a3b8b8cd@i-love.sakura.ne.jp Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|