Revision tags: v6.1.28 |
|
#
00d873c1 |
| 04-May-2023 |
Jan Kara <jack@suse.cz> |
ext4: avoid deadlock in fs reclaim with page writeback
Ext4 has a filesystem wide lock protecting ext4_writepages() calls to avoid races with switching of journalled data flag or inode format. This
ext4: avoid deadlock in fs reclaim with page writeback
Ext4 has a filesystem wide lock protecting ext4_writepages() calls to avoid races with switching of journalled data flag or inode format. This lock can however cause a deadlock like:
CPU0 CPU1
ext4_writepages() percpu_down_read(sbi->s_writepages_rwsem); ext4_change_inode_journal_flag() percpu_down_write(sbi->s_writepages_rwsem); - blocks, all readers block from now on ext4_do_writepages() ext4_init_io_end() kmem_cache_zalloc(io_end_cachep, GFP_KERNEL) fs_reclaim frees dentry... dentry_unlink_inode() iput() - last ref => iput_final() - inode dirty => write_inode_now()... ext4_writepages() tries to acquire sbi->s_writepages_rwsem and blocks forever
Make sure we cannot recurse into filesystem reclaim from writeback code to avoid the deadlock.
Reported-by: syzbot+6898da502aef574c5f8a@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/0000000000004c66b405fa108e27@google.com Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal mode and ext4_writepages") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230504124723.20205-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
Revision tags: v6.1.27 |
|
#
5354b2af |
| 28-Apr-2023 |
Theodore Ts'o <tytso@mit.edu> |
ext4: allow ext4_get_group_info() to fail
Previously, ext4_get_group_info() would treat an invalid group number as BUG(), since in theory it should never happen. However, if a malicious attaker (or
ext4: allow ext4_get_group_info() to fail
Previously, ext4_get_group_info() would treat an invalid group number as BUG(), since in theory it should never happen. However, if a malicious attaker (or fuzzer) modifies the superblock via the block device while it is the file system is mounted, it is possible for s_first_data_block to get set to a very large number. In that case, when calculating the block group of some block number (such as the starting block of a preallocation region), could result in an underflow and very large block group number. Then the BUG_ON check in ext4_get_group_info() would fire, resutling in a denial of service attack that can be triggered by root or someone with write access to the block device.
For a quality of implementation perspective, it's best that even if the system administrator does something that they shouldn't, that it will not trigger a BUG. So instead of BUG'ing, ext4_get_group_info() will call ext4_error and return NULL. We also add fallback code in all of the callers of ext4_get_group_info() that it might NULL.
Also, since ext4_get_group_info() was already borderline to be an inline function, un-inline it. The results in a next reduction of the compiled text size of ext4 by roughly 2k.
Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230430154311.579720-2-tytso@mit.edu Reported-by: syzbot+e2efa3efc15a1c9e95c3@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=69b28112e098b070f639efb356393af3ffec4220 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
show more ...
|
Revision tags: v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23 |
|
#
519fe1ba |
| 01-Apr-2023 |
Josh Triplett <josh@joshtriplett.org> |
ext4: Add a uapi header for ext4 userspace APIs
Create a uapi header include/uapi/linux/ext4.h, move the ioctls and associated data structures to the uapi header, and include it from fs/ext4/ext4.h.
ext4: Add a uapi header for ext4 userspace APIs
Create a uapi header include/uapi/linux/ext4.h, move the ioctls and associated data structures to the uapi header, and include it from fs/ext4/ext4.h.
Signed-off-by: Josh Triplett <josh@joshtriplett.org> Link: https://lore.kernel.org/r/680175260970d977d16b5cc7e7606483ec99eb63.1680402881.git.josh@joshtriplett.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
Revision tags: v6.1.22 |
|
#
951cafa6 |
| 29-Mar-2023 |
Jan Kara <jack@suse.cz> |
ext4: Simplify handling of journalled data in ext4_bmap()
Now that ext4_writepages() gets journalled data into its final location we just use filemap_write_and_wait() instead of special handling of
ext4: Simplify handling of journalled data in ext4_bmap()
Now that ext4_writepages() gets journalled data into its final location we just use filemap_write_and_wait() instead of special handling of journalled data in ext4_bmap(). We can also drop EXT4_STATE_JDATA flag as it is not used anymore.
Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230329154950.19720-11-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
#
c0be8e6f |
| 24-Mar-2023 |
Matthew Wilcox <willy@infradead.org> |
ext4: Convert ext4_mpage_readpages() to work on folios
This definitely doesn't include support for large folios; there are all kinds of assumptions about the number of buffers attached to a folio.
ext4: Convert ext4_mpage_readpages() to work on folios
This definitely doesn't include support for large folios; there are all kinds of assumptions about the number of buffers attached to a folio. But it does remove several calls to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20230324180129.1220691-24-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
#
3edde93e |
| 24-Mar-2023 |
Matthew Wilcox <willy@infradead.org> |
ext4: Convert ext4_readpage_inline() to take a folio
Use the folio API in this function, saves a few calls to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-
ext4: Convert ext4_readpage_inline() to take a folio
Use the folio API in this function, saves a few calls to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230324180129.1220691-10-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
#
e8d6062c |
| 24-Mar-2023 |
Matthew Wilcox <willy@infradead.org> |
ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio()
The only caller now has a folio so pass it in directly and avoid the call to page_folio() at the beginning.
Signed-off-by: Matthew Wilc
ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio()
The only caller now has a folio so pass it in directly and avoid the call to page_folio() at the beginning.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230324180129.1220691-9-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
#
361eb69f |
| 25-Mar-2023 |
Ojaswin Mujoo <ojaswin@linux.ibm.com> |
ext4: Remove the logic to trim inode PAs
Earlier, inode PAs were stored in a linked list. This caused a need to periodically trim the list down inorder to avoid growing it to a very large size, as t
ext4: Remove the logic to trim inode PAs
Earlier, inode PAs were stored in a linked list. This caused a need to periodically trim the list down inorder to avoid growing it to a very large size, as this would severly affect performance during list iteration.
Recent patches changed this list to an rbtree, and since the tree scales up much better, we no longer need to have the trim functionality, hence remove it.
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/c409addceaa3ade4b40328e28e3b54b2f259689e.1679731817.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
#
38727786 |
| 25-Mar-2023 |
Ojaswin Mujoo <ojaswin@linux.ibm.com> |
ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list
Currently, the kernel uses i_prealloc_list to hold all the inode preallocations. This is known to cause degradation in performance in
ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list
Currently, the kernel uses i_prealloc_list to hold all the inode preallocations. This is known to cause degradation in performance in workloads which perform large number of sparse writes on a single file. This is mainly because functions like ext4_mb_normalize_request() and ext4_mb_use_preallocated() iterate over this complete list, resulting in slowdowns when large number of PAs are present.
Patch 27bc446e2 partially fixed this by enforcing a limit of 512 for the inode preallocation list and adding logic to continually trim the list if it grows above the threshold, however our testing revealed that a hardcoded value is not suitable for all kinds of workloads.
To optimize this, add an rbtree to the inode and hold the inode preallocations in this rbtree. This will make iterating over inode PAs faster and scale much better than a linked list. Additionally, we also had to remove the LRU logic that was added during trimming of the list (in ext4_mb_release_context()) as it will add extra overhead in rbtree. The discards now happen in the lowest-logical-offset-first order.
** Locking notes **
With the introduction of rbtree to maintain inode PAs, we can't use RCU to walk the tree for searching since it can result in partial traversals which might miss some nodes(or entire subtrees) while discards happen in parallel (which happens under a lock). Hence this patch converts the ei->i_prealloc_lock spin_lock to rw_lock.
Almost all the codepaths that read/modify the PA rbtrees are protected by the higher level inode->i_data_sem (except ext4_mb_discard_group_preallocations() and ext4_clear_inode()) IIUC, the only place we need lock protection is when one thread is reading "searching" the PA rbtree (earlier protected under rcu_read_lock()) and another is "deleting" the PAs in ext4_mb_discard_group_preallocations() function (which iterates all the PAs using the grp->bb_prealloc_list and deletes PAs from the tree without taking any inode lock (i_data_sem)).
So, this patch converts all rcu_read_lock/unlock() paths for inode list PA to use read_lock() and all places where we were using ei->i_prealloc_lock spinlock will now be using write_lock().
Note that this makes the fast path (searching of the right PA e.g. ext4_mb_use_preallocated() or ext4_mb_normalize_request()), now use read_lock() instead of rcu_read_lock/unlock(). Ths also will now block due to slow discard path (ext4_mb_discard_group_preallocations()) which uses write_lock().
But this is not as bad as it looks. This is because -
1. The slow path only occurs when the normal allocation failed and we can say that we are low on disk space. One can argue this scenario won't be much frequent.
2. ext4_mb_discard_group_preallocations(), locks and unlocks the rwlock for deleting every individual PA. This gives enough opportunity for the fast path to acquire the read_lock for searching the PA inode list.
Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/4137bce8f6948fedd8bae134dabae24acfe699c6.1679731817.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
Revision tags: v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13 |
|
#
1df9bde4 |
| 21-Feb-2023 |
Kemeng Shi <shikemeng@huaweicloud.com> |
ext4: remove unused group parameter in ext4_block_bitmap_csum_set
Remove unused group parameter in ext4_block_bitmap_csum_set. After this, group parameter in ext4_set_bitmap_checksums is also not us
ext4: remove unused group parameter in ext4_block_bitmap_csum_set
Remove unused group parameter in ext4_block_bitmap_csum_set. After this, group parameter in ext4_set_bitmap_checksums is also not used, just remove it too.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230221203027.2359920-5-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
#
82483dfe |
| 21-Feb-2023 |
Kemeng Shi <shikemeng@huaweicloud.com> |
ext4: remove unused group parameter in ext4_block_bitmap_csum_verify
Remove unused group parameter in ext4_block_bitmap_csum_verify.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-b
ext4: remove unused group parameter in ext4_block_bitmap_csum_verify
Remove unused group parameter in ext4_block_bitmap_csum_verify.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230221203027.2359920-4-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
#
4fd873c8 |
| 21-Feb-2023 |
Kemeng Shi <shikemeng@huaweicloud.com> |
ext4: remove unused group parameter in ext4_inode_bitmap_csum_set
Remove unused group parameter in ext4_inode_bitmap_csum_set.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan
ext4: remove unused group parameter in ext4_inode_bitmap_csum_set
Remove unused group parameter in ext4_inode_bitmap_csum_set.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230221203027.2359920-3-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
#
b83acc77 |
| 21-Feb-2023 |
Kemeng Shi <shikemeng@huaweicloud.com> |
ext4: remove unused group parameter in ext4_inode_bitmap_csum_verify
Remove unused group parameter in ext4_inode_bitmap_csum_verify.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-b
ext4: remove unused group parameter in ext4_inode_bitmap_csum_verify
Remove unused group parameter in ext4_inode_bitmap_csum_verify.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230221203027.2359920-2-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
#
7fc1f5c2 |
| 01-Mar-2023 |
Tudor Ambarus <tudor.ambarus@linaro.org> |
ext4: Fix comment about the 64BIT feature
64BIT is part of the incompatible feature set, update the comment accordingly.
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Reviewed-by: Darrick
ext4: Fix comment about the 64BIT feature
64BIT is part of the incompatible feature set, update the comment accordingly.
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20230301133842.671821-1-tudor.ambarus@linaro.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
Revision tags: v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9 |
|
#
e3645d72 |
| 28-Jan-2023 |
Zhang Yi <yi.zhang@huawei.com> |
ext4: fix incorrect options show of original mount_opt and extend mount_opt2
Current _ext4_show_options() do not distinguish MOPT_2 flag, so it mixed extend sbi->s_mount_opt2 options with sbi->s_mou
ext4: fix incorrect options show of original mount_opt and extend mount_opt2
Current _ext4_show_options() do not distinguish MOPT_2 flag, so it mixed extend sbi->s_mount_opt2 options with sbi->s_mount_opt, it could lead to show incorrect options, e.g. show fc_debug_force if we mount with errors=continue mode and miss it if we set.
$ mkfs.ext4 /dev/pmem0 $ mount -o errors=remount-ro /dev/pmem0 /mnt $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force #empty $ mount -o remount,errors=continue /mnt $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force fc_debug_force $ mount -o remount,errors=remount-ro,fc_debug_force /mnt $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force #empty
Fixes: 995a3ed67fc8 ("ext4: add fast_commit feature and handling for extended mount options") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230129034939.3702550-1-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
Revision tags: v6.1.8, v6.1.7, v6.1.6 |
|
#
f2d40141 |
| 13-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port inode_init_owner() to mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is ju
fs: port inode_init_owner() to mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
show more ...
|
#
8782a9ae |
| 13-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port ->fileattr_set() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is
fs: port ->fileattr_set() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
show more ...
|
#
b74d24f7 |
| 13-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port ->getattr() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just
fs: port ->getattr() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
show more ...
|
#
c1632a0f |
| 13-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port ->setattr() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just
fs: port ->setattr() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
show more ...
|
Revision tags: v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12 |
|
#
59205c8d |
| 07-Dec-2022 |
Jan Kara <jack@suse.cz> |
ext4: switch to using ext4_do_writepages() for ordered data writeout
Use the standard writepages method (ext4_do_writepages()) to perform writeout of ordered data during journal commit.
Reviewed-by
ext4: switch to using ext4_do_writepages() for ordered data writeout
Use the standard writepages method (ext4_do_writepages()) to perform writeout of ordered data during journal commit.
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20221207112722.22220-8-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
#
dff4ac75 |
| 07-Dec-2022 |
Jan Kara <jack@suse.cz> |
ext4: move keep_towrite handling to ext4_bio_write_page()
When we are writing back page but we cannot for some reason write all its buffers (e.g. because we cannot allocate blocks in current context
ext4: move keep_towrite handling to ext4_bio_write_page()
When we are writing back page but we cannot for some reason write all its buffers (e.g. because we cannot allocate blocks in current context) we have to keep TOWRITE tag set in the mapping as otherwise racing WB_SYNC_ALL writeback that could write these buffers can skip the page and result in data loss. We will need this logic for writeback during transaction commit so move the logic from ext4_writepage() to ext4_bio_write_page().
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20221207112722.22220-2-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
Revision tags: v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78 |
|
#
4c0d5778 |
| 06-Nov-2022 |
Eric Biggers <ebiggers@google.com> |
ext4: don't set up encryption key during jbd2 transaction
Commit a80f7fcf1867 ("ext4: fixup ext4_fc_track_* functions' signature") extended the scope of the transaction in ext4_unlink() too far, mak
ext4: don't set up encryption key during jbd2 transaction
Commit a80f7fcf1867 ("ext4: fixup ext4_fc_track_* functions' signature") extended the scope of the transaction in ext4_unlink() too far, making it include the call to ext4_find_entry(). However, ext4_find_entry() can deadlock when called from within a transaction because it may need to set up the directory's encryption key.
Fix this by restoring the transaction to its original scope.
Reported-by: syzbot+1a748d0007eeac3ab079@syzkaller.appspotmail.com Fixes: a80f7fcf1867 ("ext4: fixup ext4_fc_track_* functions' signature") Cc: <stable@vger.kernel.org> # v5.10+ Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20221106224841.279231-3-ebiggers@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
Revision tags: v6.0.7, v5.15.77 |
|
#
3bf678a0 |
| 31-Oct-2022 |
Gaosheng Cui <cuigaosheng1@huawei.com> |
ext4: fix undefined behavior in bit shift for ext4_check_flag_values
Shifting signed 32-bit value by 31 bits is undefined, so changing significant bit to unsigned. The UBSAN warning calltrace like b
ext4: fix undefined behavior in bit shift for ext4_check_flag_values
Shifting signed 32-bit value by 31 bits is undefined, so changing significant bit to unsigned. The UBSAN warning calltrace like below:
UBSAN: shift-out-of-bounds in fs/ext4/ext4.h:591:2 left shift of 1 by 31 places cannot be represented in type 'int' Call Trace: <TASK> dump_stack_lvl+0x7d/0xa5 dump_stack+0x15/0x1b ubsan_epilogue+0xe/0x4e __ubsan_handle_shift_out_of_bounds+0x1e7/0x20c ext4_init_fs+0x5a/0x277 do_one_initcall+0x76/0x430 kernel_init_freeable+0x3b3/0x422 kernel_init+0x24/0x1e0 ret_from_fork+0x1f/0x30 </TASK>
Fixes: 9a4c80194713 ("ext4: ensure Inode flags consistency are checked at build time") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Link: https://lore.kernel.org/r/20221031055833.3966222-1-cuigaosheng1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
show more ...
|
Revision tags: v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4 |
|
#
63b1e9bc |
| 25-Oct-2022 |
Baokun Li <libaokun1@huawei.com> |
ext4: add EXT4_IGET_BAD flag to prevent unexpected bad inode
There are many places that will get unhappy (and crash) when ext4_iget() returns a bad inode. However, if iget the boot loader inode, all
ext4: add EXT4_IGET_BAD flag to prevent unexpected bad inode
There are many places that will get unhappy (and crash) when ext4_iget() returns a bad inode. However, if iget the boot loader inode, allows a bad inode to be returned, because the inode may not be initialized. This mechanism can be used to bypass some checks and cause panic. To solve this problem, we add a special iget flag EXT4_IGET_BAD. Only with this flag we'd be returning bad inode from ext4_iget(), otherwise we always return the error code if the inode is bad inode.(suggested by Jan Kara)
Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20221026042310.3839669-4-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
show more ...
|
Revision tags: v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71 |
|
#
7ff5fdda |
| 23-Sep-2022 |
Ye Bin <yebin10@huawei.com> |
ext4: factor out ext4_free_ext_path()
Factor out ext4_free_ext_path() to free extent path. As after previous patch 'ext4_ext_drop_refs()' is only used in 'extents.c', so make it static.
Signed-off-
ext4: factor out ext4_free_ext_path()
Factor out ext4_free_ext_path() to free extent path. As after previous patch 'ext4_ext_drop_refs()' is only used in 'extents.c', so make it static.
Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220924021211.3831551-3-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|