Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32 |
|
#
2b2553f1 |
| 31-May-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: stop setting PageError in the data I/O path
PageError is not used by the VFS/MM and deprecated because it uses up a page bit and has no coherent rules. Instead read errors are usually propag
btrfs: stop setting PageError in the data I/O path
PageError is not used by the VFS/MM and deprecated because it uses up a page bit and has no coherent rules. Instead read errors are usually propagated by not setting or clearing the uptodate bit, and write errors are propagated through the address_space. Btrfs now only sets the flag and never clears it for data pages, so just remove all places setting it, and the subpage error bit.
Note that the error propagation for superblock writes that work on the block device mapping still uses PageError for now, but that will be addressed in a separate series.
Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
Revision tags: v6.1.31 |
|
#
75258f20 |
| 26-May-2023 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: dump extra subpage bitmaps for debug
There is a bug report that assert_eb_page_uptodate() gets triggered for free space tree metadata.
Without proper dump for the subpage bitmaps it
btrfs: subpage: dump extra subpage bitmaps for debug
There is a bug report that assert_eb_page_uptodate() gets triggered for free space tree metadata.
Without proper dump for the subpage bitmaps it's much harder to debug.
Thus this patch would dump all the subpage bitmaps (split them into their own bitmaps) for a easier debugging.
The output would look like this: (Dumped after a tree block got read from disk)
page:000000006e34bf49 refcount:4 mapcount:0 mapping:0000000067661ac4 index:0x1d1 pfn:0x110e9 memcg:ffff0000d7d62000 aops:btree_aops [btrfs] ino:1 flags: 0x8000000000002002(referenced|private|zone=2) page_type: 0xffffffff() raw: 8000000000002002 0000000000000000 dead000000000122 ffff00000188bed0 raw: 00000000000001d1 ffff0000c7992700 00000004ffffffff ffff0000d7d62000 page dumped because: btrfs subpage dump BTRFS warning (device dm-1): start=30490624 len=16384 page=30474240 bitmaps: uptodate=4-7 error= dirty= writeback= ordered= checked=
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
Revision tags: v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26 |
|
#
b5345d6c |
| 25-Apr-2023 |
Naohiro Aota <naota@elisp.net> |
btrfs: export bitmap_test_range_all_{set,zero}
bitmap_test_range_all_{set,zero} defined in subpage.c are useful for other components. Move them to misc.h and use them in zoned.c. Also, as find_next{
btrfs: export bitmap_test_range_all_{set,zero}
bitmap_test_range_all_{set,zero} defined in subpage.c are useful for other components. Move them to misc.h and use them in zoned.c. Also, as find_next{,_zero}_bit take/return "unsigned long" instead of "unsigned int", convert the type to "unsigned long".
While at it, also rewrite the "if (...) return true; else return false;" pattern and add const to the input bitmap.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
Revision tags: v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3 |
|
#
9b569ea0 |
| 19-Oct-2022 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: move the printk helpers out of ctree.h
We have a bunch of printk helpers that are in ctree.h. These have nothing to do with ctree.c, so move them into their own header. Subsequent patches wi
btrfs: move the printk helpers out of ctree.h
We have a bunch of printk helpers that are in ctree.h. These have nothing to do with ctree.c, so move them into their own header. Subsequent patches will cleanup the printk helpers.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
Revision tags: v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63 |
|
#
47d55419 |
| 23-Aug-2022 |
Vishal Moola (Oracle) <vishal.moola@gmail.com> |
btrfs: convert process_page_range() to use filemap_get_folios_contig()
Converted function to use folios throughout. This is in preparation for the removal of find_get_pages_contig(). Now also supp
btrfs: convert process_page_range() to use filemap_get_folios_contig()
Converted function to use folios throughout. This is in preparation for the removal of find_get_pages_contig(). Now also supports large folios.
Since we may receive more than nr_pages pages, nr_pages may underflow. Since nr_pages > 0 is equivalent to index <= end_index, we replaced it with this check instead.
Also minor comment renaming for consistency in subpage.
Link: https://lkml.kernel.org/r/20220824004023.77310-5-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: David Sterba <dsterb@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: David Sterba <dsterba@suse.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
Revision tags: v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49 |
|
#
f3e90c1c |
| 21-Jun-2022 |
Christoph Hellwig <hch@lst.de> |
btrfs: remove extent writepage address space operation
Same as in commit 21b4ee7029c9 ("xfs: drop ->writepage completely"): we can remove the callback as it's only used in one place - single page wr
btrfs: remove extent writepage address space operation
Same as in commit 21b4ee7029c9 ("xfs: drop ->writepage completely"): we can remove the callback as it's only used in one place - single page writeback from memory reclaim and is not called for cgroup writeback at all.
We only allow such writeback from kswapd, not from direct memory reclaim, and so it is rarely used. When it comes from kswapd, it is effectively random dirty page shoot-down, which is horrible for IO patterns. We can rely on background writeback to clean all dirty pages in an efficient way and not let it be interrupted by kswapd.
Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
Revision tags: v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44 |
|
#
143823cf |
| 25-May-2022 |
David Sterba <dsterba@suse.com> |
btrfs: fix typos in comments
Codespell has found a few typos.
Signed-off-by: David Sterba <dsterba@suse.com>
|
Revision tags: v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33 |
|
#
0d031dc4 |
| 31-Mar-2022 |
Yu Zhe <yuzhe@nfschina.com> |
btrfs: remove unnecessary type casts
Explicit type casts are not necessary when it's void* to another pointer type.
Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Reviewed-by: David Sterba <dsterba@sus
btrfs: remove unnecessary type casts
Explicit type casts are not necessary when it's void* to another pointer type.
Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
Revision tags: v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15 |
|
#
fbca46eb |
| 12-Jan-2022 |
Qu Wenruo <wqu@suse.com> |
btrfs: make nodesize >= PAGE_SIZE case to reuse the non-subpage routine
The reason why we only support 64K page size for subpage is, for 64K page size we can ensure no matter what the nodesize is, w
btrfs: make nodesize >= PAGE_SIZE case to reuse the non-subpage routine
The reason why we only support 64K page size for subpage is, for 64K page size we can ensure no matter what the nodesize is, we can fit it into one page.
When other page size come, especially like 16K, the limitation is a bit limiting.
To remove such limitation, we allow nodesize >= PAGE_SIZE case to go the non-subpage routine. By this, we can allow 4K sectorsize on 16K page size.
Although this introduces another smaller limitation, the metadata can not cross page boundary, which is already met by most recent mkfs.
Another small improvement is, we can avoid the overhead for metadata if nodesize >= PAGE_SIZE. For 4K sector size and 64K page size/node size, or 4K sector size and 16K page size/node size, we don't need to allocate extra memory for the metadata pages.
Please note that, this patch will not yet enable other page size support yet.
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
#
c992fa1f |
| 17-Feb-2022 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: fix a wrong check on subpage->writers
[BUG] When looping btrfs/074 with 64K page size and 4K sectorsize, there is a low chance (1/50~1/100) to crash with the following ASSERT() trigg
btrfs: subpage: fix a wrong check on subpage->writers
[BUG] When looping btrfs/074 with 64K page size and 4K sectorsize, there is a low chance (1/50~1/100) to crash with the following ASSERT() triggered in btrfs_subpage_start_writer():
ret = atomic_add_return(nbits, &subpage->writers); ASSERT(ret == nbits); <<< This one <<<
[CAUSE] With more debugging output on the parameters of btrfs_subpage_start_writer(), it shows a very concerning error:
ret=29 nbits=13 start=393216 len=53248
For @nbits it's correct, but @ret which is the returned value from atomic_add_return(), it's not only larger than nbits, but also larger than max sectors per page value (for 64K page size and 4K sector size, it's 16).
This indicates that some call sites are not properly decreasing the value.
And that's exactly the case, in btrfs_page_unlock_writer(), due to the fact that we can have page locked either by lock_page() or process_one_page(), we have to check if the subpage has any writer.
If no writers, it's locked by lock_page() and we only need to unlock it.
But unfortunately the check for the writers are completely opposite:
if (atomic_read(&subpage->writers)) /* No writers, locked by plain lock_page() */ return unlock_page(page);
We directly unlock the page if it has writers, which is the completely opposite what we want.
Thankfully the affected call site is only limited to extent_write_locked_range(), so it's mostly affecting compressed write.
[FIX] Just fix the wrong check condition to fix the bug.
Fixes: e55a0de18572 ("btrfs: rework page locking in __extent_writepage()") CC: stable@vger.kernel.org # 5.16 Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
Revision tags: v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9 |
|
#
164674a7 |
| 27-Sep-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: handle page locking in btrfs_page_end_writer_lock with no writers
There are several call sites of extent_clear_unlock_delalloc() which get @locked_page = NULL. So that extent_clear_unlock_del
btrfs: handle page locking in btrfs_page_end_writer_lock with no writers
There are several call sites of extent_clear_unlock_delalloc() which get @locked_page = NULL. So that extent_clear_unlock_delalloc() will try to call process_one_page() to unlock every page even the first page is not locked by btrfs_page_start_writer_lock().
This will trigger an ASSERT() in btrfs_subpage_end_and_test_writer() as previously we require every page passed to btrfs_subpage_end_and_test_writer() to be locked by btrfs_page_start_writer_lock().
But compression path doesn't go that way.
Thankfully it's not hard to distinguish page locked by lock_page() and btrfs_page_start_writer_lock().
So do the check in btrfs_subpage_end_and_test_writer() so now it can handle both cases well.
Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
#
e55a0de1 |
| 27-Sep-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: rework page locking in __extent_writepage()
Pages passed to __extent_writepage() are always locked, but they may be locked by different functions.
There are two types of locked page for __ex
btrfs: rework page locking in __extent_writepage()
Pages passed to __extent_writepage() are always locked, but they may be locked by different functions.
There are two types of locked page for __extent_writepage():
- Page locked by plain lock_page() It should not have any subpage::writers count. Can be unlocked by unlock_page(). This is the most common locked page for __extent_writepage() called inside extent_write_cache_pages() or extent_write_full_page(). Rarer cases include the @locked_page from extent_write_locked_range().
- Page locked by lock_delalloc_pages() There is only one caller, all pages except @locked_page for extent_write_locked_range(). In this case, we have to call subpage helper to handle the case.
So here we introduce a helper, btrfs_page_unlock_writer(), to allow __extent_writepage() to unlock different locked pages.
And since for all other callers of __extent_writepage() their pages are ensured to be locked by lock_page(), also add an extra check for epd::extent_locked to unlock such pages directly.
Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
#
e4f94347 |
| 27-Sep-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: add bitmap for PageChecked flag
Although in btrfs we have very limited usage of PageChecked flag, it's still some page flag not yet subpage compatible.
Fix it by introducing btrfs_s
btrfs: subpage: add bitmap for PageChecked flag
Although in btrfs we have very limited usage of PageChecked flag, it's still some page flag not yet subpage compatible.
Fix it by introducing btrfs_subpage::checked_offset to do the convert.
For most call sites, especially for free-space cache, COW fixup and btrfs_invalidatepage(), they all work in full page mode anyway.
For other call sites, they work as subpage compatible mode.
Some call sites need extra modification:
- btrfs_drop_pages() Needs extra parameter to get the real range we need to clear checked flag.
Also since btrfs_drop_pages() will accept pages beyond the dirtied range, update btrfs_subpage_clamp_range() to handle such case by setting @len to 0 if the page is beyond target range.
- btrfs_invalidatepage() We need to call subpage helper before calling __btrfs_releasepage(), or it will trigger ASSERT() as page->private will be cleared.
- btrfs_verify_data_csum() In theory we don't need the io_bio->csum check anymore, but it's won't hurt. Just change the comment.
Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
Revision tags: v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60 |
|
#
72a69cd0 |
| 17-Aug-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: pack all subpage bitmaps into a larger bitmap
Currently we use u16 bitmap to make 4k sectorsize work for 64K page size.
But this u16 bitmap is not large enough to contain larger pag
btrfs: subpage: pack all subpage bitmaps into a larger bitmap
Currently we use u16 bitmap to make 4k sectorsize work for 64K page size.
But this u16 bitmap is not large enough to contain larger page size like 128K, nor is space efficient for 16K page size.
To handle both cases, here we pack all subpage bitmaps into a larger bitmap, now btrfs_subpage::bitmaps[] will be the ultimate bitmap for subpage usage.
Each sub-bitmap will has its start bit number recorded in btrfs_subpage_info::*_start, and its bitmap length will be recorded in btrfs_subpage_info::bitmap_nr_bits.
All subpage bitmap operations will be converted from using direct u16 operations to bitmap operations, with above *_start calculated.
For 64K page size with 4K sectorsize, this should not cause much difference.
While for 16K page size, we will only need 1 unsigned long (u32) to store all the bitmaps, which saves quite some space.
Furthermore, this allows us to support larger page size like 128K and 258K.
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
#
8481dd80 |
| 17-Aug-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: introduce btrfs_subpage_bitmap_info
Currently we use fixed size u16 bitmap for subpage bitmap. This is fine for 4K sectorsize with 64K page size.
But for 4K sectorsize and larger p
btrfs: subpage: introduce btrfs_subpage_bitmap_info
Currently we use fixed size u16 bitmap for subpage bitmap. This is fine for 4K sectorsize with 64K page size.
But for 4K sectorsize and larger page size, the bitmap is too small, while for smaller page size like 16K, u16 bitmaps waste too much space.
Here we introduce a new helper structure, btrfs_subpage_bitmap_info, to record the proper bitmap size, and where each bitmap should start at.
By this, we can later compact all subpage bitmaps into one u32 bitmap. This patch is the first step.
Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
#
651fb419 |
| 17-Aug-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: make btrfs_alloc_subpage() return btrfs_subpage directly
The existing calling convention of btrfs_alloc_subpage() is pretty awful. Change it to a more common pattern by returning st
btrfs: subpage: make btrfs_alloc_subpage() return btrfs_subpage directly
The existing calling convention of btrfs_alloc_subpage() is pretty awful. Change it to a more common pattern by returning struct btrfs_subpage directly and let the caller to determine if the call succeeded.
Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
#
fdf250db |
| 17-Aug-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: only call btrfs_alloc_subpage() when sectorsize is smaller than PAGE_SIZE
There are two call sites of btrfs_alloc_subpage():
- btrfs_attach_subpage() We have ensured sectorsize is
btrfs: subpage: only call btrfs_alloc_subpage() when sectorsize is smaller than PAGE_SIZE
There are two call sites of btrfs_alloc_subpage():
- btrfs_attach_subpage() We have ensured sectorsize is smaller than PAGE_SIZE
- alloc_extent_buffer() We call btrfs_alloc_subpage() unconditionally.
The alloc_extent_buffer() forces us to check the sectorsize size against page size inside btrfs_alloc_subpage().
Since the function name, btrfs_alloc_subpage(), already indicates it should only get called for subpage cases, do the check in alloc_extent_buffer() and add an ASSERT() in btrfs_alloc_subpage().
Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
#
7c11d0ae |
| 26-Jul-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: fix a potential use-after-free in writeback helper
[BUG] There is a possible use-after-free bug when running generic/095.
BUG: Unable to handle kernel data access on write at 0x6b6
btrfs: subpage: fix a potential use-after-free in writeback helper
[BUG] There is a possible use-after-free bug when running generic/095.
BUG: Unable to handle kernel data access on write at 0x6b6b6b6b6b6b725b Faulting instruction address: 0xc000000000283654 c000000000283078 do_raw_spin_unlock+0x88/0x230 c0000000012b1e14 _raw_spin_unlock_irqrestore+0x44/0x90 c000000000a918dc btrfs_subpage_clear_writeback+0xac/0xe0 c0000000009e0458 end_bio_extent_writepage+0x158/0x270 c000000000b6fd14 bio_endio+0x254/0x270 c0000000009fc0f0 btrfs_end_bio+0x1a0/0x200 c000000000b6fd14 bio_endio+0x254/0x270 c000000000b781fc blk_update_request+0x46c/0x670 c000000000b8b394 blk_mq_end_request+0x34/0x1d0 c000000000d82d1c lo_complete_rq+0x11c/0x140 c000000000b880a4 blk_complete_reqs+0x84/0xb0 c0000000012b2ca4 __do_softirq+0x334/0x680 c0000000001dd878 irq_exit+0x148/0x1d0 c000000000016f4c do_IRQ+0x20c/0x240 c000000000009240 hardware_interrupt_common_virt+0x1b0/0x1c0
[CAUSE] There is very small race window like the following in generic/095.
Thread 1 | Thread 2 --------------------------------+------------------------------------ end_bio_extent_writepage() | btrfs_releasepage() |- spin_lock_irqsave() | | |- end_page_writeback() | | | | |- if (PageWriteback() ||...) | | |- clear_page_extent_mapped() | | |- kfree(subpage); |- spin_unlock_irqrestore().
The race can also happen between writeback and btrfs_invalidatepage(), although that would be much harder as btrfs_invalidatepage() has much more work to do before the clear_page_extent_mapped() call.
[FIX] Here we "wait" for the subapge spinlock to be released before we detach subpage structure. So this patch will introduce a new function, wait_subpage_spinlock(), to do the "wait" by acquiring the spinlock and release it.
Since the caller has ensured the page is not dirty nor writeback, and page is already locked, the only way to hold the subpage spinlock is from endio function. Thus we only need to acquire the spinlock to wait for any existing holder.
Reported-by: Ritesh Harjani <riteshh@linux.ibm.com> Tested-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
#
cc1d0d93 |
| 26-Jul-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: fix writeback which does not have ordered extent
[BUG] When running fsstress with subpage RW support, there are random BUG_ON()s triggered with the following trace:
kernel BUG at f
btrfs: subpage: fix writeback which does not have ordered extent
[BUG] When running fsstress with subpage RW support, there are random BUG_ON()s triggered with the following trace:
kernel BUG at fs/btrfs/file-item.c:667! Internal error: Oops - BUG: 0 [#1] SMP CPU: 1 PID: 3486 Comm: kworker/u13:2 5.11.0-rc4-custom+ #43 Hardware name: Radxa ROCK Pi 4B (DT) Workqueue: btrfs-worker-high btrfs_work_helper [btrfs] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) pc : btrfs_csum_one_bio+0x420/0x4e0 [btrfs] lr : btrfs_csum_one_bio+0x400/0x4e0 [btrfs] Call trace: btrfs_csum_one_bio+0x420/0x4e0 [btrfs] btrfs_submit_bio_start+0x20/0x30 [btrfs] run_one_async_start+0x28/0x44 [btrfs] btrfs_work_helper+0x128/0x1b4 [btrfs] process_one_work+0x22c/0x430 worker_thread+0x70/0x3a0 kthread+0x13c/0x140 ret_from_fork+0x10/0x30
[CAUSE] Above BUG_ON() means there is some bio range which doesn't have ordered extent, which indeed is worth a BUG_ON().
Unlike regular sectorsize == PAGE_SIZE case, in subpage we have extra subpage dirty bitmap to record which range is dirty and should be written back.
This means, if we submit bio for a subpage range, we do not only need to clear page dirty, but also need to clear subpage dirty bits.
In __extent_writepage_io(), we will call btrfs_page_clear_dirty() for any range we submit a bio.
But there is loophole, if we hit a range which is beyond i_size, we just call btrfs_writepage_endio_finish_ordered() to finish the ordered io, then break out, without clearing the subpage dirty.
This means, if we hit above branch, the subpage dirty bits are still there, if other range of the page get dirtied and we need to writeback that page again, we will submit bio for the old range, leaving a wild bio range which doesn't have ordered extent.
[FIX] Fix it by always calling btrfs_page_clear_dirty() in __extent_writepage_io().
Also to avoid such problem from happening again, add a new assert, btrfs_page_assert_not_dirty(), to make sure both page dirty and subpage dirty bits are cleared before exiting __extent_writepage_io().
Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
Revision tags: v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43 |
|
#
3d078efa |
| 07-Jun-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: fix a rare race between metadata endio and eb freeing
[BUG] There is a very rare ASSERT() triggering during full fstests run for subpage rw support.
No other reproducer so far.
The
btrfs: subpage: fix a rare race between metadata endio and eb freeing
[BUG] There is a very rare ASSERT() triggering during full fstests run for subpage rw support.
No other reproducer so far.
The ASSERT() gets triggered for metadata read in btrfs_page_set_uptodate() inside end_page_read().
[CAUSE] There is still a small race window for metadata only, the race could happen like this:
T1 | T2 ------------------------------------+----------------------------- end_bio_extent_readpage() | |- btrfs_validate_metadata_buffer() | | |- free_extent_buffer() | | Still have 2 refs | |- end_page_read() | |- if (unlikely(PagePrivate()) | | The page still has Private | | | free_extent_buffer() | | | Only one ref 1, will be | | | released | | |- detach_extent_buffer_page() | | |- btrfs_detach_subpage() |- btrfs_set_page_uptodate() | The page no longer has Private| >>> ASSERT() triggered <<< |
This race window is super small, thus pretty hard to hit, even with so many runs of fstests.
But the race window is still there, we have to go another way to solve it other than relying on random PagePrivate() check.
Data path is not affected, as it will lock the page before reading, while unlocking the page after the last read has finished, thus no race window.
[FIX] This patch will fix the bug by repurposing btrfs_subpage::readers.
Now btrfs_subpage::readers will be a member shared by both metadata and data.
For metadata path, we don't do the page unlock as metadata only relies on extent locking.
At the same time, teach page_range_has_eb() to take btrfs_subpage::readers into consideration.
So that even if the last eb of a page gets freed, page::private won't be detached as long as there still are pending end_page_read() calls.
By this we eliminate the race window, this will slight increase the metadata memory usage, as the page may not be released as frequently as usual. But it should not be a big deal.
The code got introduced in ("btrfs: submit read time repair only for each corrupted sector"), but the fix is in a separate patch to keep the problem description and the crash is rare so it should not hurt bisectability.
Signed-off-by: Qu Wegruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
Revision tags: v5.10.42 |
|
#
6f17400b |
| 31-May-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: introduce helpers for subpage ordered status
This patch introduces the following functions to handle btrfs subpage ordered (Private2) status:
- btrfs_subpage_set_ordered() - btrfs_subpage_cl
btrfs: introduce helpers for subpage ordered status
This patch introduces the following functions to handle btrfs subpage ordered (Private2) status:
- btrfs_subpage_set_ordered() - btrfs_subpage_clear_ordered() - btrfs_subpage_test_ordered() These helpers can only be called when the range is ensured to be inside the page.
- btrfs_page_set_ordered() - btrfs_page_clear_ordered() - btrfs_page_test_ordered() These helpers can handle both regular sector size and subpage without problem.
These functions are here to coordinate btrfs_invalidatepage() with btrfs_writepage_endio_finish_ordered(), to make sure only one of those functions can finish the ordered extent.
Tested-by: Ritesh Harjani <riteshh@linux.ibm.com> # [ppc64] Tested-by: Anand Jain <anand.jain@oracle.com> # [aarch64] Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
#
1e1de387 |
| 31-May-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: make process_one_page() to handle subpage locking
Introduce a new data inodes specific subpage member, writers, to record how many sectors are under page lock for delalloc writing.
This memb
btrfs: make process_one_page() to handle subpage locking
Introduce a new data inodes specific subpage member, writers, to record how many sectors are under page lock for delalloc writing.
This member acts pretty much the same as readers, except it's only for delalloc writes.
This is important for delalloc code to trace which page can really be freed, as we have cases like run_delalloc_nocow() where we may exit processing nocow range inside a page, but need to exit to do cow half way. In that case, we need a way to determine if we can really unlock a full page.
With the new btrfs_subpage::writers, there is a new requirement: - Page locked by process_one_page() must be unlocked by process_one_page() There are still tons of call sites manually lock and unlock a page, without updating btrfs_subpage::writers. So if we lock a page through process_one_page() then it must be unlocked by process_one_page() to keep btrfs_subpage::writers consistent.
This will be handled in next patch.
Tested-by: Ritesh Harjani <riteshh@linux.ibm.com> # [ppc64] Tested-by: Anand Jain <anand.jain@oracle.com> # [aarch64] Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
#
60e2d255 |
| 31-May-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: provide btrfs_page_clamp_*() helpers
In the coming subpage RW supports, there are a lot of page status update calls which need to be converted to subpage compatible version, which needs @star
btrfs: provide btrfs_page_clamp_*() helpers
In the coming subpage RW supports, there are a lot of page status update calls which need to be converted to subpage compatible version, which needs @start and @len.
Some call sites already have such @start/@len and are already in page range, like various endio functions.
But there are also call sites which need to clamp the range for subpage case, like btrfs_dirty_pagse() and __process_contig_pages().
Here we introduce new helpers, btrfs_page_clamp_*(), to do and only do the clamp for subpage version.
Although in theory all existing btrfs_page_*() calls can be converted to use btrfs_page_clamp_*() directly, but that would make us to do unnecessary clamp operations.
Tested-by: Ritesh Harjani <riteshh@linux.ibm.com> # [ppc64] Tested-by: Anand Jain <anand.jain@oracle.com> # [aarch64] Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
Revision tags: v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26 |
|
#
894d1378 |
| 25-Mar-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: add overview comments
This patch adds an overview how btrfs subpage support works:
- limitations - behavior - basic implementation points
Signed-off-by: Qu Wenruo <wqu@suse.com> Re
btrfs: subpage: add overview comments
This patch adds an overview how btrfs subpage support works:
- limitations - behavior - basic implementation points
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
#
3470da3b |
| 25-Mar-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: subpage: introduce helpers for writeback status
Introduces the following functions to handle subpage writeback status:
- btrfs_subpage_set_writeback() - btrfs_subpage_clear_writeback() - btr
btrfs: subpage: introduce helpers for writeback status
Introduces the following functions to handle subpage writeback status:
- btrfs_subpage_set_writeback() - btrfs_subpage_clear_writeback() - btrfs_subpage_test_writeback() These helpers can only be called when the range is ensured to be inside the page.
- btrfs_page_set_writeback() - btrfs_page_clear_writeback() - btrfs_page_test_writeback() These helpers can handle both regular sector size and subpage without problem.
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|