Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37 |
|
#
c56cbe90 |
| 28-Jun-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: reduce the number of arguments to btrfs_run_delalloc_range
Instead of a separate page_started argument that tells the callers that btrfs_run_delalloc_range already started writeback by itself, overload the return value with a positive 1 in addition to 0 and a negative error code to indicate that it has already started writeback, and remove the nr_written argument as the caller can calculate it directly based on the range, and in fact already does so for the case where writeback wasn't started yet.
Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
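The return convention described in this commit can be sketched as follows. This is a minimal illustration, not the kernel code: `run_range_sketch`, `struct range_sketch`, `pages_in_range` and the 4096-byte page size are all hypothetical names and values chosen for the example.

```c
#include <assert.h>
#include <errno.h>

/* Assumed page size, for illustration only. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical stand-in for whatever state a delalloc run inspects. */
struct range_sketch {
	int fail;             /* simulate an error path */
	int starts_writeback; /* simulate a path that submits I/O itself */
};

/*
 * Tri-state return value:
 *   < 0  negative errno on failure
 *   0    success, the caller still has to write the pages out
 *   1    success, writeback was already started by the callee
 */
static int run_range_sketch(const struct range_sketch *r)
{
	assert(r);
	if (r->fail)
		return -ENOSPC;
	if (r->starts_writeback)
		return 1;
	return 0;
}

/*
 * The caller derives the page count from the range itself, so no
 * separate nr_written out-parameter is needed. 'start' is the first
 * byte and 'end' the last byte of the range (inclusive).
 */
static unsigned long long pages_in_range(unsigned long long start,
					 unsigned long long end)
{
	assert(end >= start);
	return (end + 1 - start + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

A caller would check for a negative value first, then treat 1 as "nothing left to submit" and 0 as "continue with the normal writeback path".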
|
#
6648cedd |
| 28-Jun-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: remove btrfs_writepage_endio_finish_ordered
btrfs_writepage_endio_finish_ordered is a small wrapper around btrfs_mark_ordered_io_finished that just changes the argument passing slightly, and adds a tracepoint.
Move the tracepoint to btrfs_mark_ordered_io_finished, which means it now also covers the error handling in btrfs_cleanup_ordered_extent, and switch all callers to just call btrfs_mark_ordered_io_finished directly.
Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
Revision tags: v6.1.36, v6.4, v6.1.35 |
|
#
64425500 |
| 18-Jun-2023 |
Naohiro Aota <naohiro.aota@wdc.com> |
btrfs: tracepoints: also show actual number of the outstanding extents
The btrfs_inode_mod_outstanding_extents trace event only shows the change applied to the number of outstanding extents. It would be helpful to see the resulting number of extents as well.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
Revision tags: v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30 |
|
#
71df088c |
| 24-May-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: defer splitting of ordered extents until I/O completion
The btrfs zoned completion code currently needs an ordered_extent and extent_map per bio so that it can account for the non-predictable write location from Zone Append. To achieve that it currently splits the ordered_extent and extent_map at I/O submission time, and then records the actual physical address in the ->physical field of the ordered_extent.
This patch instead switches to record the "original" physical address that the btrfs allocator assigned in spare space in the btrfs_bio, and then rewrites the logical address in the btrfs_ordered_sum structure at I/O completion time. This allows the ordered extent completion handler to simply walk the list of ordered csums and split the ordered extent as needed. This removes an extra ordered extent and extent_map lookup and manipulation during the I/O submission path, and instead batches it in the I/O completion path where we need to touch these anyway.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
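The completion-time walk described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct sum_sketch` and `split_points` are hypothetical, the list is modelled as an array, and the sketch assumes one piece per record, which may differ from what "as needed" means in the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: where one bio's data ended up, and its length. */
struct sum_sketch {
	uint64_t logical; /* address rewritten at I/O completion time */
	uint64_t len;     /* number of bytes covered by this record */
};

/*
 * Walk the records of one ordered extent in order and compute the
 * byte offsets at which the extent would be split, assuming every
 * record becomes its own piece. 'offsets' must have room for 'n'
 * entries. Returns the number of split points written.
 */
static size_t split_points(const struct sum_sketch *sums, size_t n,
			   uint64_t total_len, uint64_t *offsets)
{
	uint64_t done = 0;
	size_t splits = 0;

	for (size_t i = 0; i < n; i++) {
		assert(sums[i].len > 0);
		done += sums[i].len;
		/* No split is needed after the final record. */
		if (done < total_len)
			offsets[splits++] = done;
	}
	/* The records are expected to cover the extent exactly. */
	assert(done == total_len);
	return splits;
}
```

The point of the sketch is that all the bookkeeping happens in one pass at completion time, using data that is already attached to the ordered extent, instead of doing a lookup and split for every bio at submission time.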
|
Revision tags: v6.1.29, v6.1.28, v6.1.27 |
|
#
f541833c |
| 29-Apr-2023 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: move split_flags/combine_flags helpers to inode-item.h
These are more related to the inode item flags on disk than the in-memory btrfs_inode, move the helpers to inode-item.h.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
e917ff56 |
| 03-May-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: determine synchronous writers from bio or writeback control
The writeback_control structure already passes down the information about a writeback being synchronous from the core VM code, and this information is propagated into the bio REQ_SYNC flag through the wbc_to_write_flags helper.
Use that information to decide if checksum calculation is offloaded to a workqueue, instead of the btrfs_inode::sync_writers field, which not only bloats the inode but also has too wide a scope, being inode wide instead of limited to the actual writeback request.
The sync writes were set in:
- btrfs_do_write_iter - regular IO, sync status is set
- start_ordered_ops - ordered write start, writeback with WB_SYNC_ALL mode
- btrfs_write_marked_extents - write marked extents, writeback with WB_SYNC_ALL mode
Reviewed-by: Chris Mason <clm@fb.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
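The per-request decision described in this commit can be sketched as follows. Everything here is a hypothetical stand-in: the `sketch_` types, the flag value and the helper names are invented for the example, and the sketch assumes that synchronous requests are checksummed inline while other requests may be offloaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit; the kernel's REQ_SYNC has its own value. */
#define SKETCH_REQ_SYNC (1u << 0)

enum sketch_sync_mode { SKETCH_WB_SYNC_NONE, SKETCH_WB_SYNC_ALL };

/* Minimal stand-ins for writeback_control and a bio. */
struct sketch_wbc { enum sketch_sync_mode sync_mode; };
struct sketch_bio { unsigned int opf; };

/*
 * Same idea as wbc_to_write_flags: a synchronous writeback request
 * marks the resulting I/O as sync.
 */
static unsigned int sketch_wbc_to_write_flags(const struct sketch_wbc *wbc)
{
	assert(wbc);
	return wbc->sync_mode == SKETCH_WB_SYNC_ALL ? SKETCH_REQ_SYNC : 0;
}

/*
 * The decision is made per request from the bio flags, so no
 * inode-wide counter of sync writers is needed.
 * Assumption: only non-sync requests are offloaded to a workqueue.
 */
static bool sketch_should_offload_csum(const struct sketch_bio *bio)
{
	assert(bio);
	return !(bio->opf & SKETCH_REQ_SYNC);
}
```

Because the flag travels with each request, two writebacks against the same inode can be treated differently, which an inode-wide field cannot express.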
|
Revision tags: v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23 |
|
#
fa4b8cb1 |
| 05-Apr-2023 |
Filipe Manana <fdmanana@suse.com> |
btrfs: avoid iterating over all indexes when logging directory
When logging a directory, after copying all directory index items from the subvolume tree to the log tree, we iterate over the subvolume tree to find all dir index items that are located in leaves COWed (or created) in the current transaction. If we keep logging a directory several times during the same transaction, we end up iterating over the same dir index items every time we log the directory, wasting time and adding extra lock contention on the subvolume tree.
So just keep track of the last logged dir index offset in order to start the search for that index (+1) the next time the directory is logged, as dir index values (key offsets) come from a monotonically increasing counter.
The following test measures the difference before and after this change:
   $ cat test.sh
   #!/bin/bash

   DEV=/dev/nullb0
   MNT=/mnt/nullb0

   umount $DEV &> /dev/null
   mkfs.btrfs -f $DEV
   mount -o ssd $DEV $MNT

   # Time values in milliseconds.
   declare -a fsync_times
   # Total number of files added to the test directory.
   num_files=1000000
   # Fsync directory after every N files are added.
   fsync_period=100

   mkdir $MNT/testdir

   fsync_total_time=0
   for ((i = 1; i <= $num_files; i++)); do
       echo -n > $MNT/testdir/file_$i

       if [ $((i % fsync_period)) -eq 0 ]; then
           start=$(date +%s%N)
           xfs_io -c "fsync" $MNT/testdir
           end=$(date +%s%N)
           fsync_total_time=$((fsync_total_time + (end - start)))
           fsync_times[i]=$(( (end - start) / 1000000 ))
           echo -n -e "Progress $i / $num_files\r"
       fi
   done

   echo -e "\nHistogram of directory fsync duration in ms:\n"

   printf '%s\n' "${fsync_times[@]}" | \
       perl -MStatistics::Histogram -e '@d = <>; print get_histogram(\@d);'

   fsync_total_time=$((fsync_total_time / 1000000))
   echo -e "\nTotal time spent in fsync: $fsync_total_time ms\n"
   echo

   umount $MNT
The test was run on a non-debug kernel (Debian's default kernel config) against a 15G null block device.
Result before this change:
Histogram of directory fsync duration in ms:
   Count: 10000
   Range: 3.000 - 362.000; Mean: 34.556; Median: 31.000; Stddev: 25.751
   Percentiles: 90th: 71.000; 95th: 77.000; 99th: 81.000
     3.000 -   5.278:  1423 #################################
     5.278 -   8.854:  1173 ###########################
     8.854 -  14.467:   591 ##############
    14.467 -  23.277:  1025 #######################
    23.277 -  37.105:  1422 #################################
    37.105 -  58.809:  2036 ###############################################
    58.809 -  92.876:  2316 #####################################################
    92.876 - 146.346:     6 |
   146.346 - 230.271:     6 |
   230.271 - 362.000:     2 |
Total time spent in fsync: 350527 ms
Result after this change:
Histogram of directory fsync duration in ms:
   Count: 10000
   Range: 3.000 - 1088.000; Mean: 8.704; Median: 8.000; Stddev: 12.576
   Percentiles: 90th: 12.000; 95th: 14.000; 99th: 17.000
     3.000 -    6.007:  3222 #################################
     6.007 -   11.276:  5197 #####################################################
    11.276 -   20.506:  1551 ################
    20.506 -   36.674:    24 |
    36.674 -  201.552:     1 |
   201.552 -  353.841:     4 |
   353.841 - 1088.000:     1 |
Total time spent in fsync: 92114 ms
Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
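The idea of resuming from the last logged dir index can be sketched as follows. This is a simplified model, not the btrfs implementation: the subvolume tree is replaced by a sorted array, and `struct dir_log_state` and `log_new_indexes` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-directory logging state. */
struct dir_log_state {
	uint64_t last_logged_index; /* 0 means nothing has been logged yet */
};

/*
 * Visit the entries of a sorted, strictly increasing array of dir
 * index values that have not been logged yet, and return how many
 * were visited. Because index values come from a monotonically
 * increasing counter, everything at or below last_logged_index was
 * already handled by an earlier call and can be skipped.
 */
static size_t log_new_indexes(struct dir_log_state *st,
			      const uint64_t *indexes, size_t count)
{
	uint64_t min_index = st->last_logged_index + 1;
	size_t lo = 0, hi = count, visited = 0;

	/* Binary search for the first entry >= min_index. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (indexes[mid] < min_index)
			lo = mid + 1;
		else
			hi = mid;
	}

	for (size_t i = lo; i < count; i++) {
		/* The input is expected to be strictly increasing. */
		assert(i == 0 || indexes[i] > indexes[i - 1]);
		st->last_logged_index = indexes[i];
		visited++;
	}
	return visited;
}
```

Repeated calls during the same transaction only touch entries added since the previous call, which is the effect the histograms above measure.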
|
Revision tags: v6.1.22 |
|
#
7edd339c |
| 28-Mar-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: pass an ordered_extent to btrfs_extract_ordered_extent
To prepare for a new caller that already has the ordered_extent available, change btrfs_extract_ordered_extent to take an argument for it. Add a wrapper for the bio case that still has to do the lookup (for now).
Reviewed-by: Josef Bacik <josef@toxicpanda.com> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
Revision tags: v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8 |
|
#
35a8d7da |
| 21-Jan-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: remove now spurious bio submission helpers
Call btrfs_submit_bio and btrfs_submit_compressed_read directly from submit_one_bio now that all additional functionality has moved into btrfs_submit_bio.
Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
69ccf3f4 |
| 21-Jan-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: handle recording of zoned writes in the storage layer
Move the code that splits the ordered extents and records the physical location for them to the storage layer so that the higher level consumers don't have to care about physical block numbers at all. With a little more block layer work, this will also eventually allow removing the accounting for the zone append write sizes in the upper layer.
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
deb6216f |
| 21-Jan-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: open code the submit_bio_start helpers
The submit helpers are now trivial and can be called directly. Note that btree_csum_one_bio has to be moved up in the file a bit to avoid a forward declaration.
Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
0571b635 |
| 21-Jan-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: remove the io_failure_record infrastructure
struct io_failure_record and the io_failure_tree tree are unused now, so remove them. This in turn makes struct btrfs_inode smaller by 16 bytes.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
3d49d0d3 |
| 21-Jan-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: remove now unused checksumming helpers
Remove the unused btrfs_verify_data_csum helper, and fold btrfs_check_data_csum into its only caller.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
e5219044 |
| 21-Jan-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: add a btrfs_data_csum_ok helper
Add a new checksumming helper that wraps btrfs_check_data_csum and does all the checks to see if we're dealing with some form of nodatacsum I/O. This helper will be used by the new storage layer checksum validation and repair code.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
Revision tags: v6.1.7, v6.1.6 |
|
#
f2d40141 |
| 13-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port inode_init_owner() to mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevant on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two, eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
Revision tags: v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6 |
|
#
e55cf7ca |
| 27-Oct-2022 |
David Sterba <dsterba@suse.com> |
btrfs: pass btrfs_inode to btrfs_add_delayed_iput
The function is for internal interfaces so we should use the btrfs_inode.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
bd54766e |
| 26-Oct-2022 |
David Sterba <dsterba@suse.com> |
btrfs: pass btrfs_inode to btrfs_clear_delalloc_extent
The function is for internal interfaces so we should use the btrfs_inode.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
62798a49 |
| 26-Oct-2022 |
David Sterba <dsterba@suse.com> |
btrfs: pass btrfs_inode to btrfs_split_delalloc_extent
The function is for internal interfaces so we should use the btrfs_inode.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
4c5d166f |
| 26-Oct-2022 |
David Sterba <dsterba@suse.com> |
btrfs: pass btrfs_inode to btrfs_set_delalloc_extent
The function is for internal interfaces so we should use the btrfs_inode.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
2454151c |
| 26-Oct-2022 |
David Sterba <dsterba@suse.com> |
btrfs: pass btrfs_inode to btrfs_merge_delalloc_extent
The function is for internal interfaces so we should use the btrfs_inode.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
3c4f91e2 |
| 26-Oct-2022 |
David Sterba <dsterba@suse.com> |
btrfs: pass btrfs_inode to btrfs_delete_subvolume
The function is for internal interfaces so we should use the btrfs_inode.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
621af94a |
| 26-Oct-2022 |
David Sterba <dsterba@suse.com> |
btrfs: pass btrfs_inode to btrfs_check_data_csum
The function is for internal interfaces so we should use the btrfs_inode.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
e5d4d75b |
| 26-Oct-2022 |
David Sterba <dsterba@suse.com> |
btrfs: pass btrfs_inode to btrfs_inode_unlock
The function is for internal interfaces so we should use the btrfs_inode.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
29b6352b |
| 26-Oct-2022 |
David Sterba <dsterba@suse.com> |
btrfs: pass btrfs_inode to btrfs_inode_lock
The function is for internal interfaces so we should use the btrfs_inode.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
d781c1c3 |
| 26-Oct-2022 |
David Sterba <dsterba@suse.com> |
btrfs: pass btrfs_inode to btrfs_submit_dio_repair_bio
The function is for internal interfaces so we should use the btrfs_inode.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|