#
fbcd51e0 |
| 24-Jun-2024 |
Chao Yu <chao@kernel.org> |
f2fs: fix to update user block counts in block_operations()
[ Upstream commit f06c0f82e38bbda7264d6ef3c90045ad2810e0f3 ]
Commit 59c9081bc86e ("f2fs: allow write page cache when writting cp") allows write() to write data to the page cache during checkpoint, so block count fields such as .total_valid_block_count, .alloc_valid_block_count and .rf_node_block_count may hit the race condition below:

CP                                      Thread A
- write_checkpoint
 - block_operations
  - f2fs_down_write(&sbi->node_change)
  - __prepare_cp_block
  : ckpt->valid_block_count = .total_valid_block_count
  - f2fs_up_write(&sbi->node_change)
                                        - write
                                         - f2fs_preallocate_blocks
                                          - f2fs_map_blocks(,F2FS_GET_BLOCK_PRE_AIO)
                                           - f2fs_map_lock
                                            - f2fs_down_read(&sbi->node_change)
                                           - f2fs_reserve_new_blocks
                                            - inc_valid_block_count
                                            : percpu_counter_add(&sbi->alloc_valid_block_count, count)
                                            : sbi->total_valid_block_count += count
                                            - f2fs_up_read(&sbi->node_change)
 - do_checkpoint
 : sbi->last_valid_block_count = sbi->total_valid_block_count
 : percpu_counter_set(&sbi->alloc_valid_block_count, 0)
 : percpu_counter_set(&sbi->rf_node_block_count, 0)
                                        - fsync
                                         - need_do_checkpoint
                                          - f2fs_space_for_roll_forward
                                          : alloc_valid_block_count was reset to zero,
                                            so it may miss the last data during checkpoint

Let's update .total_valid_block_count, .alloc_valid_block_count and .rf_node_block_count in block_operations() instead, so that access to them is protected by the .node_change and .cp_rwsem locks, which avoids the race condition above.
Fixes: 59c9081bc86e ("f2fs: allow write page cache when writting cp") Cc: Yunlei He <heyunlei@oppo.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
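The ordering problem described above can be replayed deterministically in userspace. The following is a minimal single-threaded sketch, not kernel code: struct model_counters, model_reserve_blocks(), model_checkpoint_old() and model_checkpoint_new() are hypothetical names, and the racing write() is simply called at the point where the diagram places it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the counters named in the commit message. */
struct model_counters {
	uint64_t total_valid;  /* models .total_valid_block_count */
	uint64_t alloc_valid;  /* models .alloc_valid_block_count */
	uint64_t last_valid;   /* models .last_valid_block_count */
	uint64_t ckpt_valid;   /* models ckpt->valid_block_count */
};

/* Models a write() reserving blocks (inc_valid_block_count in the diagram). */
static void model_reserve_blocks(struct model_counters *c, uint64_t count)
{
	assert(c->total_valid + count >= c->total_valid); /* no wrap-around */
	c->alloc_valid += count;
	c->total_valid += count;
}

/* Old ordering: snapshot first, reset the counters only later. */
static void model_checkpoint_old(struct model_counters *c, uint64_t racing)
{
	c->ckpt_valid = c->total_valid;   /* snapshot taken under the lock */
	model_reserve_blocks(c, racing);  /* writer slips in after the unlock */
	c->last_valid = c->total_valid;   /* later phase sees the new total */
	c->alloc_valid = 0;               /* and forgets the racing reservation */
}

/* New ordering: snapshot and reset together, before any writer can run. */
static void model_checkpoint_new(struct model_counters *c, uint64_t racing)
{
	c->ckpt_valid = c->total_valid;
	c->last_valid = c->total_valid;
	c->alloc_valid = 0;
	model_reserve_blocks(c, racing);  /* writer runs after the whole update */
}
```

In the old ordering the reservation made by the racing writer is wiped out although it was never part of the checkpoint snapshot; in the new ordering it is still visible in alloc_valid afterwards.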
|
#
f0248ba6 |
| 06-Feb-2024 |
Jaegeuk Kim <jaegeuk@kernel.org> |
f2fs: use BLKS_PER_SEG, BLKS_PER_SEC, and SEGS_PER_SEC
[ Upstream commit a60108f7dfb5867da1ad9c777d2fbbe47e4dbdd7 ]
No functional change.
Reviewed-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: aa4074e8fec4 ("f2fs: fix block migration when section is not aligned to pow2") Signed-off-by: Sasha Levin <sashal@kernel.org>
|
#
c92f2927 |
| 07-Mar-2024 |
Chao Yu <chao@kernel.org> |
f2fs: fix to truncate meta inode pages forcely
[ Upstream commit 9f0c4a46be1fe9b97dbe66d49204c1371e3ece65 ]
The race below can cause data corruption:

Thread A                                GC thread
                                        - gc_data_segment
                                         - ra_data_block
                                          - locked meta_inode page
- f2fs_inplace_write_data
 - invalidate_mapping_pages
 : fail to invalidate meta_inode page
   due to lock failure or dirty|writeback
   status
 - f2fs_submit_page_bio
 : write last dirty data to old blkaddr
                                         - move_data_block
                                          - load old data from meta_inode page
                                          - f2fs_submit_page_write
                                          : write old data to new blkaddr

invalidate_mapping_pages() skips any page whose status is unclear (locked, dirty, under writeback and so on), so we need to use truncate_inode_pages_range() instead of invalidate_mapping_pages() to make sure the meta_inode page is dropped.
Fixes: 6aa58d8ad20a ("f2fs: readahead encrypted block during GC") Fixes: e3b49ea36802 ("f2fs: invalidate META_MAPPING before IPU/DIO write") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
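The difference between the two calls can be illustrated with a toy page model. This is a sketch under stated assumptions: struct model_page, model_invalidate() and model_truncate() are hypothetical userspace stand-ins that only mimic the "skip busy pages" versus "drop everything" behaviour described above, not the kernel functions themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a cached page; not the kernel's struct page. */
struct model_page {
	bool cached;
	bool locked;
	bool dirty;
	bool writeback;
};

/* Best-effort drop: busy pages are skipped. Returns the number dropped. */
static size_t model_invalidate(struct model_page *pages, size_t n)
{
	size_t dropped = 0;

	assert(pages != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct model_page *p = &pages[i];

		if (!p->cached || p->locked || p->dirty || p->writeback)
			continue; /* unclear status: leave the page in the cache */
		p->cached = false;
		dropped++;
	}
	return dropped;
}

/* Forced drop: every cached page goes away, whatever its state was. */
static size_t model_truncate(struct model_page *pages, size_t n)
{
	size_t dropped = 0;

	assert(pages != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct model_page *p = &pages[i];

		if (!p->cached)
			continue;
		/* The model just clears the state instead of waiting on it. */
		p->locked = p->dirty = p->writeback = false;
		p->cached = false;
		dropped++;
	}
	return dropped;
}
```

With one locked and one clean page in the cache, the best-effort variant leaves the locked page behind, which is the stale copy the GC thread later reads in the diagram.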
|
#
eb61c2cc |
| 07-Aug-2023 |
Chao Yu <chao@kernel.org> |
f2fs: fix to account cp stats correctly
cp_foreground_calls sysfs entry shows total CP call count rather than foreground CP call count, fix it.
Fixes: fc7100ea2a52 ("f2fs: Add f2fs stats to sysfs") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
b62e71be |
| 23-Apr-2023 |
Chao Yu <chao@kernel.org> |
f2fs: support errors=remount-ro|continue|panic mountoption
This patch adds support for the errors=remount-ro|continue|panic mount option to f2fs.

f2fs behaves as below in the three different modes:

mode                    continue    remount-ro    panic
access ops              normal      normal        N/A
syscall errors          -EIO        -EROFS        N/A
mount option            rw          ro            N/A
pending dir write       keep        keep          N/A
pending non-dir write   drop        keep          N/A
pending node write      drop        keep          N/A
pending meta write      keep        keep          N/A
By default it uses "continue" mode.
[Yangtao helps to clean up function's name] Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
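The behaviour table above can be read as a small lookup. A minimal sketch, assuming hypothetical names (enum model_errors_mode, model_syscall_error(), model_keeps_node_write()); the values come straight from the table, and the panic column is represented as 0/false because the table lists it as N/A.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical enum mirroring the three values of the errors= option. */
enum model_errors_mode {
	MODEL_ERRORS_CONTINUE,
	MODEL_ERRORS_REMOUNT_RO,
	MODEL_ERRORS_PANIC,
};

/* Error returned to syscalls after a fatal error; 0 stands for N/A. */
static int model_syscall_error(enum model_errors_mode mode)
{
	switch (mode) {
	case MODEL_ERRORS_CONTINUE:
		return -EIO;
	case MODEL_ERRORS_REMOUNT_RO:
		return -EROFS;
	case MODEL_ERRORS_PANIC:
		return 0;
	}
	assert(0 && "unknown errors= mode");
	return 0;
}

/* Whether pending node writes are kept (true) or dropped (false). */
static bool model_keeps_node_write(enum model_errors_mode mode)
{
	return mode == MODEL_ERRORS_REMOUNT_RO;
}
```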
|
#
bd90c5cd |
| 06-Apr-2023 |
Jaegeuk Kim <jaegeuk@kernel.org> |
f2fs: relax sanity check if checkpoint is corrupted
1. extent_cache - let's drop the largest extent_cache
2. invalidate_block - don't show the warnings
Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
d09bd853 |
| 27-Mar-2023 |
Yohan Joung <jyh429@gmail.com> |
f2fs: add radix_tree_preload_end in error case
To prevent excessive increase in preemption count add radix_tree_preload_end in retry
Signed-off-by: Yohan Joung <yohan.joung@sk.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
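Why a missing radix_tree_preload_end() on the retry path inflates the preemption count can be shown with a counter. This is a userspace sketch only: model_preload(), model_preload_end() and model_insert_with_retry() are hypothetical names, and the counter merely stands in for the preemption count mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the preemption count that a preload raises. */
static int model_preempt_count;

static void model_preload(void)
{
	model_preempt_count++;
}

static void model_preload_end(void)
{
	assert(model_preempt_count > 0); /* must pair with a preload */
	model_preempt_count--;
}

/*
 * Retries an insert that fails `failures` times before succeeding.
 * Returns the count left over afterwards; 0 means balanced.
 */
static int model_insert_with_retry(int failures, bool end_on_retry)
{
	for (;;) {
		model_preload();
		if (failures > 0) {
			failures--;
			if (end_on_retry)
				model_preload_end(); /* the call the patch adds */
			continue;                    /* retry */
		}
		model_preload_end();
		return model_preempt_count;
	}
}
```

Without the extra call each retry leaks one level, so three failed attempts leave the counter at 3; with it the counter returns to 0.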
|
#
e1bb7d3d |
| 02-Apr-2023 |
Chao Yu <chao@kernel.org> |
f2fs: fix to recover quota data correctly
With the -O quota mkfs option, xfstests generic/417 fails because fsck detects data corruption on quota inodes.

[ASSERT] (fsck_chk_quota_files:2051) --> Quota file is missing or invalid quota file content found.

The root cause is a hole during which f2fs doesn't hold the quota inodes, so all recovered quota data is dropped because the SBI_POR_DOING flag is set.

- f2fs_fill_super
 - f2fs_recover_orphan_inodes
  - f2fs_enable_quota_files
  - f2fs_quota_off_umount
<--- quota inodes were dropped --->
 - f2fs_recover_fsync_data
  - f2fs_enable_quota_files
  - f2fs_quota_off_umount

This patch tries to eliminate the hole by holding the quota inodes during the entire recovery flow, as below:

- f2fs_fill_super
 - f2fs_recover_quota_begin
 - f2fs_recover_orphan_inodes
 - f2fs_recover_fsync_data
 - f2fs_recover_quota_end

Then the recovered quota data can be persisted after SBI_POR_DOING is cleared.
Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
c9b3649a |
| 09-Apr-2023 |
Chao Yu <chao@kernel.org> |
f2fs: fix to drop all dirty pages during umount() if cp_error is set
xfstest generic/361 reports a bug as below:
f2fs_bug_on(sbi, sbi->fsync_node_num);
kernel BUG at fs/f2fs/super.c:1627!
RIP: 0010:f2fs_put_super+0x3a8/0x3b0
Call Trace:
 generic_shutdown_super+0x8c/0x1b0
 kill_block_super+0x2b/0x60
 kill_f2fs_super+0x87/0x110
 deactivate_locked_super+0x39/0x80
 deactivate_super+0x46/0x50
 cleanup_mnt+0x109/0x170
 __cleanup_mnt+0x16/0x20
 task_work_run+0x65/0xa0
 exit_to_user_mode_prepare+0x175/0x190
 syscall_exit_to_user_mode+0x25/0x50
 do_syscall_64+0x4c/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

During umount(), if cp_error is set, f2fs_wait_on_all_pages() should keep waiting until all F2FS_WB_CP_DATA pages have been written back; otherwise fsync_node_num can be non-zero after f2fs_wait_on_all_pages() returns, which triggers this bug.

In this case, to avoid an endless loop in f2fs_wait_on_all_pages(), all dirty pages need to be dropped rather than redirtied.
Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
68f0453d |
| 09-Apr-2023 |
Chao Yu <chao@kernel.org> |
f2fs: use f2fs_hw_is_readonly() instead of bdev_read_only()
f2fs supports the multi-device feature, so to check the devices' read/write status it should use f2fs_hw_is_readonly() rather than bdev_read_only(); fix it.

Meanwhile, this removes the f2fs_hw_is_readonly() check in:
- f2fs_write_checkpoint()
- f2fs_convert_inline_inode()
These paths have already checked f2fs_readonly(), and if f2fs' devices are read-only, f2fs_readonly() must be true.
Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
447286eb |
| 16-Feb-2023 |
Yangtao Li <frank.li@vivo.com> |
f2fs: convert to use bitmap API
Let's use BIT() and GENMASK() instead of open-coding the shifts and masks.
Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
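For readers unfamiliar with the two helpers, here is a userspace sketch of what they compute. MODEL_BIT() and MODEL_GENMASK() are stand-ins written for this example (fixed at 64 bits); the kernel's own definitions live in its headers and differ in detail. model_field_get() is a hypothetical helper showing a typical use.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins, assumed 64-bit wide; not the kernel's macros. */
#define MODEL_BIT(n)        (UINT64_C(1) << (n))
#define MODEL_GENMASK(h, l) \
	((~UINT64_C(0) << (l)) & (~UINT64_C(0) >> (63 - (h))))

/* Extracts bits h..l (inclusive) of word, shifted down to bit 0. */
static uint64_t model_field_get(uint64_t word, unsigned int h, unsigned int l)
{
	assert(l <= h && h < 64);
	return (word & MODEL_GENMASK(h, l)) >> l;
}
```

For example, MODEL_GENMASK(7, 4) replaces an open-coded 0xF0, and MODEL_BIT(3) replaces (1 << 3).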
|
#
60630375 |
| 22-Feb-2023 |
Li Zetao <lizetao1@huawei.com> |
f2fs: make f2fs_sync_inode_meta() static
After commit 26b5a079197c ("f2fs: cleanup dirty pages if recover failed"), f2fs_sync_inode_meta() is only used in checkpoint.c, so f2fs_sync_inode_meta() should only be visible inside. Delete the declaration in the header file and change f2fs_sync_inode_meta() to static.
Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
146949de |
| 06-Feb-2023 |
Jinyoung CHOI <j-young.choi@samsung.com> |
f2fs: fix typos in comments
This patch is to fix typos in f2fs files.
Signed-off-by: Jinyoung Choi <j-young.choi@samsung.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
580e7a49 |
| 04-Jan-2023 |
Vishal Moola (Oracle) <vishal.moola@gmail.com> |
f2fs: convert f2fs_sync_meta_pages() to use filemap_get_folios_tag()
Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). This change removes 5 calls to compound_head().
Initially the function was checking if the previous page index is truly the previous page i.e. 1 index behind the current page. To convert to folios and maintain this check we need to make the check folio->index != prev + folio_nr_pages(previous folio) since we don't know how many pages are in a folio.
At index i == 0 the check is guaranteed to succeed, so to workaround indexing bounds we can simply ignore the check for that specific index. This makes the initial assignment of prev trivial, so I removed that as well.
Also modify a comment in commit_checkpoint for consistency.
Link: https://lkml.kernel.org/r/20230104211448.4804-17-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
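The contiguity check described above (folio->index != prev + folio_nr_pages(previous folio), skipped for i == 0) can be sketched in userspace. struct model_folio and model_contiguous_run() are hypothetical; the struct only carries the two values the check needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: start index and size in pages of one folio. */
struct model_folio {
	unsigned long index;
	unsigned long nr_pages;
};

/* Returns how many leading folios form one contiguous run of pages. */
static size_t model_contiguous_run(const struct model_folio *folios, size_t n)
{
	unsigned long prev = 0, prev_nr = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* No previous folio exists at i == 0, so skip the check. */
		if (i != 0 && folios[i].index != prev + prev_nr)
			break;
		assert(folios[i].nr_pages > 0);
		prev = folios[i].index;
		prev_nr = folios[i].nr_pages;
	}
	return i;
}
```

Because a folio may span several pages, comparing against prev + 1 would wrongly split a run such as index 1 (4 pages) followed by index 5.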
|
#
2eae077e |
| 02-Feb-2023 |
Chao Yu <chao@kernel.org> |
f2fs: reduce stack memory cost by using bitfield in struct f2fs_io_info
This patch tries to use bitfield in struct f2fs_io_info to improve memory usage.
struct f2fs_io_info {
	...
	unsigned int need_lock:8;	/* indicate we need to lock cp_rwsem */
	unsigned int version:8;		/* version of the node */
	unsigned int submitted:1;	/* indicate IO submission */
	unsigned int in_list:1;		/* indicate fio is in io_list */
	unsigned int is_por:1;		/* indicate IO is from recovery or not */
	unsigned int retry:1;		/* need to reallocate block address */
	unsigned int encrypted:1;	/* indicate file is encrypted */
	unsigned int post_read:1;	/* require post read */
	...
};
After this patch, size of struct f2fs_io_info reduces from 136 to 120.
[Nathan: fix a compile warning (single-bit-bitfield-constant-conversion)] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
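The space saving from bitfields can be demonstrated in isolation. This is a sketch, not the real struct: model_fio_plain is an assumed "before" shape with one full unsigned int per flag (the source does not show the old field types), and model_fio_packed copies only the bitfield members quoted above. Exact sizes depend on the ABI, so the example makes no claim about the 136 and 120 byte figures.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed "before" shape for illustration: a full word per flag. */
struct model_fio_plain {
	unsigned int need_lock, version, submitted, in_list;
	unsigned int is_por, retry, encrypted, post_read;
};

/* The bitfield members from the commit message, packed together. */
struct model_fio_packed {
	unsigned int need_lock:8;
	unsigned int version:8;
	unsigned int submitted:1;
	unsigned int in_list:1;
	unsigned int is_por:1;
	unsigned int retry:1;
	unsigned int encrypted:1;
	unsigned int post_read:1;
};

/* Builds a packed record; version must fit into its 8-bit field. */
static struct model_fio_packed model_fio_init(unsigned int version)
{
	struct model_fio_packed fio = { 0 };

	assert(version < 256);
	fio.version = version;
	fio.submitted = 1;
	return fio;
}
```

Using unsigned int for the single-bit members keeps assignments such as fio.submitted = 1 in range, which a signed one-bit field could not represent.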
|
#
5a4fed7c |
| 19-Jan-2023 |
Christoph Hellwig <hch@lst.de> |
f2fs: simplify do_checkpoint
For each loop, add a local curseg_info pointer instead of looking it up for each of the three fields.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
c40e15a9 |
| 20-Dec-2022 |
Yangtao Li <frank.li@vivo.com> |
f2fs: merge f2fs_show_injection_info() into time_to_inject()
There is no need to additionally use f2fs_show_injection_info() to output information. Concatenate time_to_inject() and __time_to_inject() via a macro. In the new __time_to_inject() function, pass in the caller function name and parent function.
In this way, we no longer need the f2fs_show_injection_info() function, and let's remove it.
Suggested-by: Chao Yu <chao@kernel.org> Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
146dbcbf |
| 27-Oct-2022 |
Yangtao Li <frank.li@vivo.com> |
f2fs: fix return val in f2fs_start_ckpt_thread()
Return PTR_ERR(cprc->f2fs_issue_ckpt) instead of -ENOMEM;
Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
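The point of returning PTR_ERR() rather than a fixed -ENOMEM is that the real failure reason survives. The following userspace sketch re-implements the error-pointer idea under assumed names (model_err_ptr(), model_is_err(), model_ptr_err(), model_kthread_run(), model_start_thread()); it is not the kernel's err.h, and the thread starter is a hypothetical stub that fails with whatever error is injected.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encodes a negative errno value in a pointer. */
static void *model_err_ptr(long err)
{
	assert(err < 0 && err >= -MODEL_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* True if the pointer lies in the top MODEL_MAX_ERRNO addresses. */
static bool model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Decodes the errno value stored by model_err_ptr(). */
static long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static int model_dummy_task;

/* Hypothetical thread starter: fails with inject_err if it is non-zero. */
static void *model_kthread_run(long inject_err)
{
	return inject_err ? model_err_ptr(inject_err) : (void *)&model_dummy_task;
}

/* Propagates the real error instead of hard-coding -ENOMEM. */
static int model_start_thread(long inject_err)
{
	void *task = model_kthread_run(inject_err);

	if (model_is_err(task))
		return (int)model_ptr_err(task);
	return 0;
}
```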
|
#
18792e64 |
| 06-Oct-2022 |
Chao Yu <chao@kernel.org> |
f2fs: support fault injection for f2fs_is_valid_blkaddr()
This patch adds support for injecting faults into f2fs_is_valid_blkaddr() to simulate a caller accessing inconsistent data/meta block addresses.

Usage:
a) echo 262144 > /sys/fs/f2fs/<dev>/inject_type
or
b) mount -o fault_type=262144 <dev> <mountpoint>
Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
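The value 262144 in the usage lines is a single-bit mask: 262144 = 2^18 = 1 << 18. A small sketch that recovers the bit index from such a mask; model_single_bit_index() is a hypothetical helper, and reading bit 18 as the position of the new fault type is an inference from the usage lines rather than something the message states.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the index of the only set bit in mask. */
static int model_single_bit_index(uint32_t mask)
{
	int idx = 0;

	assert(mask != 0 && (mask & (mask - 1)) == 0); /* exactly one bit */
	while ((mask >>= 1) != 0)
		idx++;
	return idx;
}
```

Several fault types could presumably be combined by OR-ing their bits into one mask, which is the usual reason for encoding each type as its own bit.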
|
#
a9cfee0e |
| 28-Sep-2022 |
Chao Yu <chao@kernel.org> |
f2fs: support recording stop_checkpoint reason into super_block
This patch adds support for recording the stop_checkpoint error in f2fs_super_block.s_stop_reason[].
Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
d80afefb |
| 14-Sep-2022 |
Chao Yu <chao@kernel.org> |
f2fs: fix to account FS_CP_DATA_IO correctly
f2fs_inode_info.cp_task was introduced for FS_CP_DATA_IO accounting since commit b0af6d491a6b ("f2fs: add app/fs io stat").
However, cp_task usage coverage has since increased due to the commits below:
- commit 040d2bb318d1 ("f2fs: fix to avoid deadloop if data_flush is on")
- commit 186857c5a14a ("f2fs: fix potential recursive call when enabling data_flush")

As a result, if the data_flush mount option is on and a data flush is triggered from the background, the IO from the data flush is incorrectly accounted as checkpoint IO.

To fix this issue, this patch splits cp_task into two:
a) cp_task: used for IO accounting
b) wb_task: used to avoid deadlock
Fixes: 040d2bb318d1 ("f2fs: fix to avoid deadloop if data_flush is on") Fixes: 186857c5a14a ("f2fs: fix potential recursive call when enabling data_flush") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
0ef4ca04 |
| 12-Sep-2022 |
Chao Yu <chao@kernel.org> |
f2fs: fix to do sanity check on destination blkaddr during recovery
As Wenqing Liu reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=216456
loop5: detected capacity change from 0 to 131072
F2FS-fs (loop5): recover_inode: ino = 6, name = hln, inline = 1
F2FS-fs (loop5): recover_data: ino = 6 (i_size: recover) err = 0
F2FS-fs (loop5): recover_inode: ino = 6, name = hln, inline = 1
F2FS-fs (loop5): recover_data: ino = 6 (i_size: recover) err = 0
F2FS-fs (loop5): recover_inode: ino = 6, name = hln, inline = 1
F2FS-fs (loop5): recover_data: ino = 6 (i_size: recover) err = 0
F2FS-fs (loop5): Bitmap was wrongly set, blk:5634
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1013 at fs/f2fs/segment.c:2198
RIP: 0010:update_sit_entry+0xa55/0x10b0 [f2fs]
Call Trace:
 <TASK>
 f2fs_do_replace_block+0xa98/0x1890 [f2fs]
 f2fs_replace_block+0xeb/0x180 [f2fs]
 recover_data+0x1a69/0x6ae0 [f2fs]
 f2fs_recover_fsync_data+0x120d/0x1fc0 [f2fs]
 f2fs_fill_super+0x4665/0x61e0 [f2fs]
 mount_bdev+0x2cf/0x3b0
 legacy_get_tree+0xed/0x1d0
 vfs_get_tree+0x81/0x2b0
 path_mount+0x47e/0x19d0
 do_mount+0xce/0xf0
 __x64_sys_mount+0x12c/0x1a0
 do_syscall_64+0x38/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

With CONFIG_F2FS_CHECK_FS enabled, this triggers a kernel panic instead of a warning.

The root cause is that, in the fuzzed image, the SIT table is inconsistent with the inode mapping table, which triggers this warning during the SIT table update.

This patch introduces a new flag, DATA_GENERIC_ENHANCE_UPDATE. With this flag, the data block recovery flow can check the validity of the destination blkaddr in the SIT table and skip f2fs_replace_block() to avoid an inconsistent state.
Cc: stable@vger.kernel.org Reported-by: Wenqing Liu <wenqingliu0120@gmail.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
9b7eadd9 |
| 30-Aug-2022 |
Shuqi Zhang <zhangshuqi3@huawei.com> |
f2fs: fix wrong dirty page count when race between mmap and fallocate.
This is a BUG_ON issue as follows when running xfstest-generic-503:

WARNING: CPU: 21 PID: 1385 at fs/f2fs/inode.c:762 f2fs_evict_inode+0x847/0xaa0
Modules linked in:
CPU: 21 PID: 1385 Comm: umount Not tainted 5.19.0-rc5+ #73
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014

Call Trace:
 evict+0x129/0x2d0
 dispose_list+0x4f/0xb0
 evict_inodes+0x204/0x230
 generic_shutdown_super+0x5b/0x1e0
 kill_block_super+0x29/0x80
 kill_f2fs_super+0xe6/0x140
 deactivate_locked_super+0x44/0xc0
 deactivate_super+0x79/0x90
 cleanup_mnt+0x114/0x1a0
 __cleanup_mnt+0x16/0x20
 task_work_run+0x98/0x100
 exit_to_user_mode_prepare+0x3d0/0x3e0
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x42/0x80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Function flow analysis when BUG occurs:

f2fs_fallocate                          mmap
                                        do_page_fault
                                          pte_spinlock  // ---lock_pte
                                          do_wp_page
                                            wp_page_shared
                                              pte_unmap_unlock  // unlock_pte
                                                do_page_mkwrite
                                                f2fs_vm_page_mkwrite
                                                  down_read(invalidate_lock)
                                                  lock_page
                                                  if (PageMappedToDisk(page))
                                                    goto out;
                                                  // set_page_dirty  --NOT RUN
                                                  out: up_read(invalidate_lock);
                                              finish_mkwrite_fault // unlock_pte
f2fs_collapse_range
  down_write(i_mmap_sem)
  truncate_pagecache
    unmap_mapping_pages
      i_mmap_lock_write // down_write(i_mmap_rwsem)
        ......
        zap_pte_range
          pte_offset_map_lock // ---lock_pte
          set_page_dirty
            f2fs_dirty_data_folio
              if (!folio_test_dirty(folio)) {
                                              fault_dirty_shared_page
                                                set_page_dirty
                                                  f2fs_dirty_data_folio
                                                    if (!folio_test_dirty(folio)) {
                                                      filemap_dirty_folio
                                                      f2fs_update_dirty_folio // ++
                                                    }
                                                  unlock_page
                filemap_dirty_folio
                f2fs_update_dirty_folio // page count++
              }
          pte_unmap_unlock  // --unlock_pte
      i_mmap_unlock_write // up_write(i_mmap_rwsem)
  truncate_inode_pages
  up_write(i_mmap_sem)

When the race happens between mmap-do_page_fault-wp_page_shared and fallocate-truncate_pagecache-zap_pte_range, zap_pte_range calls set_page_dirty without the page lock. Besides, although truncate_pagecache holds the i_mmap and pte locks, wp_page_shared calls fault_dirty_shared_page without either of them. In this case, two threads race in f2fs_dirty_data_folio. The page is set dirty only ONCE, but the count is incremented TWICE because both threads call filemap_dirty_folio. Thus the dirty page count no longer matches the real number of dirty pages.

The solution handles the case where the race happens without any lock: since folio_test_set_dirty in filemap_dirty_folio is atomic, deciding based on its return value is not at risk of the race.
Signed-off-by: Shuqi Zhang <zhangshuqi3@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
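The idea of counting only when an atomic test-and-set reports the clean-to-dirty transition can be sketched with C11 atomics. This is a userspace analogy, not the kernel code: struct model_dfolio, model_dirty_folio() and model_mark_dirty() are hypothetical, and atomic_exchange() stands in for the atomic test-and-set mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical folio model: just the dirty flag. */
struct model_dfolio {
	atomic_bool dirty;
};

/* Stand-in for the per-filesystem dirty page counter. */
static atomic_long model_dirty_pages;

/* Returns true only for the caller that flipped clean -> dirty. */
static bool model_dirty_folio(struct model_dfolio *folio)
{
	return !atomic_exchange(&folio->dirty, true);
}

/* Counts the folio once, however many callers try to dirty it. */
static void model_mark_dirty(struct model_dfolio *folio)
{
	assert(folio != NULL);
	if (model_dirty_folio(folio))
		atomic_fetch_add(&model_dirty_pages, 1);
}
```

A separate "test dirty, then set and count" sequence lets two callers both pass the test before either sets the flag; folding test and set into one atomic step means exactly one caller sees the transition and increments the counter.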
|
#
c7b58576 |
| 19-Aug-2022 |
Jaegeuk Kim <jaegeuk@kernel.org> |
f2fs: flush pending checkpoints when freezing super
This avoids -EINVAL when trying to freeze f2fs.
Cc: stable@vger.kernel.org Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
34a23525 |
| 19-Aug-2022 |
Chao Yu <chao.yu@oppo.com> |
f2fs: iostat: support accounting compressed IO
Previously, only FS_CDATA_READ_IO type IO was accounted. This patch adds accounting for more IO types for compressed files:
- APP_BUFFERED_CDATA_IO
- APP_MAPPED_CDATA_IO
- FS_CDATA_IO
- APP_BUFFERED_CDATA_READ_IO
- APP_MAPPED_CDATA_READ_IO
Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|