#
55e43d6a |
| 05-Jan-2025 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.68' into for/openbmc/dev-6.6
This is the 6.6.68 stable release
|
Revision tags: v6.6.69, v6.6.68, v6.6.67 |
|
#
1bee32f3 |
| 18-Dec-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: fix file_path handling in tracepoints
commit 19ebc8f84ea12e18dd6c8d3ecaf87bcf4666eee1 upstream.
[backport: only apply fix for 3934e8ebb7cc6]
Since file_path() takes the output buffer as one of its arguments, we might as well have it format directly into the tracepoint's char array instead of wasting stack space.
Fixes: 3934e8ebb7cc6 ("xfs: create a big array data structure") Fixes: 5076a6040ca16 ("xfs: support in-memory buffer cache targets") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202403290419.HPcyvqZu-lkp@intel.com/ Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
|
#
16f6ccde |
| 19-Dec-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.67' into for/openbmc/dev-6.6
This is the 6.6.67 stable release
|
Revision tags: v6.6.66, v6.6.65, v6.6.64 |
|
#
08b1325d |
| 02-Dec-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: fix scrub tracepoints when inode-rooted btrees are involved
commit ffc3ea4f3c1cc83a86b7497b0c4b0aee7de5480d upstream.
Fix minor mistakes in the scrub tracepoints that can manifest when inode-rooted btrees are enabled. The existing code worked fine for bmap btrees, but we should tighten the code up to be less sloppy.
Cc: <stable@vger.kernel.org> # v5.7 Fixes: 92219c292af8dd ("xfs: convert btree cursor inode-private member names") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
Revision tags: v6.6.63, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58, v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25 |
|
#
46eeaa11 |
| 03-Apr-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.24' into dev-6.6
This is the 6.6.24 stable release
|
Revision tags: v6.6.24 |
|
#
57a20b61 |
| 26-Mar-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: convert rt bitmap extent lengths to xfs_rtbxlen_t
commit f29c3e745dc253bf9d9d06ddc36af1a534ba1dd0 upstream.
XFS uses xfs_rtblock_t for many different uses, which makes it much more difficult to perform a unit analysis on the codebase. One of these (ab)uses is when we need to store the length of a free space extent as stored in the realtime bitmap. Because there can be up to 2^64 realtime extents in a filesystem, we need a new type that is larger than xfs_rtxlen_t for callers that are querying the bitmap directly. This means scrub and growfs.
Create this type as "xfs_rtbxlen_t" and use it to store 64-bit rtx lengths. 'b' stands for 'bitmap' or 'big'; reader's choice.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
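The need for a wider type can be illustrated with a small userspace model. This is only a sketch, not kernel code: it assumes the narrower xfs_rtxlen_t is 32 bits wide, and the helper name `store` is made up for the example.

```python
# Userspace model only; the widths below are assumptions for illustration.
RTXLEN_BITS = 32    # assumed width of the narrower xfs_rtxlen_t
RTBXLEN_BITS = 64   # xfs_rtbxlen_t stores 64-bit rtx lengths per the message


def store(length, bits):
    """Model storing an unsigned length in a fixed-width integer (wraps on overflow)."""
    return length & ((1 << bits) - 1)


# A free-space run in the realtime bitmap longer than 2^32 realtime extents.
free_extent_len = (1 << 32) + 5

narrow = store(free_extent_len, RTXLEN_BITS)   # silently truncated
wide = store(free_extent_len, RTBXLEN_BITS)    # preserved
```

In this model the narrow variable wraps around, which is the kind of silent truncation a dedicated 64-bit length type avoids for callers that query the bitmap directly.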
|
Revision tags: v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3 |
|
#
c900529f |
| 12-Sep-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Forwarding to v6.6-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.5.2, v6.1.51, v6.5.1 |
|
#
1ac731c5 |
| 30-Aug-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.6 merge window.
|
#
53ea7f62 |
| 30-Aug-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'xfs-6.6-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Chandan Babu:
- Chandan Babu will be taking over as the XFS release manager. He has reviewed all the patches that are in this branch, though I'm signing the branch one last time since I'm still technically maintainer. :P
- Create a maintainer entry profile for XFS in which we lay out the various roles that I have played for many years. Aside from release manager, the remaining roles are as yet unfilled.
- Start merging online repair -- we now have in-memory pageable memory for staging btrees, a bunch of pending fixes, and we've started the process of refactoring the scrub support code to support more of repair. In particular, reaping of old blocks from damaged structures.
- Scrub the realtime summary file.
- Fix a bug where scrub's quota iteration only ever returned the root dquot. Oooops.
- Fix some typos.
[ Pull request from Chandan Babu, but signed tag and description from Darrick Wong, thus the first person singular above is Darrick, not Chandan ]
* tag 'xfs-6.6-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (37 commits) fs/xfs: Fix typos in comments xfs: fix dqiterate thinko xfs: don't check reflink iflag state when checking cow fork xfs: simplify returns in xchk_bmap xfs: rewrite xchk_inode_is_allocated to work properly xfs: hide xfs_inode_is_allocated in scrub common code xfs: fix agf_fllast when repairing an empty AGFL xfs: allow userspace to rebuild metadata structures xfs: clear pagf_agflreset when repairing the AGFL xfs: allow the user to cancel repairs before we start writing xfs: don't complain about unfixed metadata when repairs were injected xfs: implement online scrubbing of rtsummary info xfs: always rescan allegedly healthy per-ag metadata after repair xfs: move the realtime summary file scrubber to a separate source file xfs: wrap ilock/iunlock operations on sc->ip xfs: get our own reference to inodes that we want to scrub xfs: track usage statistics of online fsck xfs: improve xfarray quicksort pivot xfs: create scaffolding for creating debugfs entries xfs: cache pages used for xfarray quicksort convergence ...
|
Revision tags: v6.1.50 |
|
#
511fb5ba |
| 28-Aug-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'v6.6-vfs.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull superblock updates from Christian Brauner: "This contains the super rework that was ready for this cycle. The first part changes the order of how we open block devices and allocate superblocks, contains various cleanups, simplifications, and a new mechanism to wait on superblock state changes.
This unblocks work to ultimately limit the number of writers to a block device. Jan has already scheduled follow-up work that will be ready for v6.7 and allows us to restrict the number of writers to a given block device. That series builds on this work right here.
The second part contains filesystem freezing updates.
Overview:
The generic superblock changes are roughly organized as follows (ignoring additional minor cleanups):
(1) Removal of the bd_super member from struct block_device.
This was a very odd back pointer to struct super_block with unclear rules. For all relevant places we have other means to get the same information so just get rid of this.
(2) Simplify rules for superblock cleanup.
Roughly, everything that is allocated during fs_context initialization and that's stored in fs_context->s_fs_info needs to be cleaned up by the fs_context->free() implementation before the superblock allocation function has been called successfully.
After sget_fc() has returned, fs_context->s_fs_info has been transferred to sb->s_fs_info, at which point sb->kill_sb() is fully responsible for cleanup. Adhering to these rules means that cleanup of sb->s_fs_info in fill_super() is to be avoided as it's brittle and inconsistent.
Cleanup shouldn't be duplicated between sb->put_super() and sb->kill_sb(), as sb->put_super() is only called if sb->s_root has been set, aka when the filesystem has been successfully born (SB_BORN). That complexity should be avoided.
This also means that block devices are to be closed in sb->kill_sb() instead of sb->put_super(). More details in the lower section.
(3) Make it possible to lookup or create a superblock before opening block devices
There's a subtle dependency on (2) as some filesystems did rely on fill_super() to be called in order to correctly clean up sb->s_fs_info. All these filesystems have been fixed.
(4) Switch most filesystems to follow the same logic as the generic mount code now does as outlined in (3).
(5) Use the superblock as the holder of the block device. We can now easily go back from block device to owning superblock.
(6) Export and extend the generic fs_holder_ops and use them as holder ops everywhere and remove the filesystem specific holder ops.
(7) Call from the block layer up into the filesystem layer when the block device is removed, allowing to shut down the filesystem without risk of deadlocks.
(8) Get rid of get_super().
We can now easily go back from the block device to the owning superblock and can call up from the block layer into the filesystem layer when the device is removed. So there is no need to wade through all registered superblocks to find the owning superblock anymore"
Link: https://lore.kernel.org/lkml/20230824-prall-intakt-95dbffdee4a0@brauner/
* tag 'v6.6-vfs.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (47 commits) super: use higher-level helper for {freeze,thaw} super: wait until we passed kill super super: wait for nascent superblocks super: make locking naming consistent super: use locking helpers fs: simplify invalidate_inodes fs: remove get_super block: call into the file system for ioctl BLKFLSBUF block: call into the file system for bdev_mark_dead block: consolidate __invalidate_device and fsync_bdev block: drop the "busy inodes on changed media" log message dasd: also call __invalidate_device when setting the device offline amiflop: don't call fsync_bdev in FDFMTBEG floppy: call disk_force_media_change when changing the format block: simplify the disk_force_media_change interface nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl xfs use fs_holder_ops for the log and RT devices xfs: drop s_umount over opening the log and RT devices ext4: use fs_holder_ops for the log device ext4: drop s_umount over opening the log device ...
|
Revision tags: v6.5, v6.1.49, v6.1.48 |
|
#
cd4284cf |
| 23-Aug-2023 |
Christian Brauner <brauner@kernel.org> |
Merge tag 'vfs-6.6-merge-3' of ssh://gitolite.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs online fsck update from Darrick Wong:
New code for 6.6:
* Allow the kernel to initiate a freeze of a filesystem. The kernel and userspace can both hold a freeze on a filesystem at the same time; the freeze is not lifted until /both/ holders lift it. This will enable us to fix a longstanding bug in XFS online fsck.
* Use kernel-initiated fsfreeze to fix some longstanding false negatives in online fsck of the free space and inode counters.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Message-Id: <20230822182604.GB11286@frogsfrogsfrogs> Signed-off-by: Christian Brauner <brauner@kernel.org>
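The "both holders must lift it" rule described above can be modelled in a few lines. This is a toy userspace sketch, not the VFS implementation; the class name `FreezeState` and the holder labels are hypothetical.

```python
class FreezeState:
    """Toy model of a filesystem freeze that can have several independent holders."""

    def __init__(self):
        self.holders = set()

    def freeze(self, holder):
        # Each holder may hold at most one freeze in this simplified model.
        if holder in self.holders:
            raise ValueError(f"{holder} already holds a freeze")
        self.holders.add(holder)

    def thaw(self, holder):
        # Raises KeyError if this holder never froze the filesystem.
        self.holders.remove(holder)

    @property
    def frozen(self):
        # The filesystem stays frozen while any holder remains.
        return bool(self.holders)
```

With both "kernel" and "userspace" holding a freeze, thawing only one of them leaves `frozen` true in this model; the freeze ends only when the last holder thaws.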
|
#
220c8d57 |
| 18-Aug-2023 |
Chandan Babu R <chandan.babu@oracle.com> |
Merge tag 'scrub-bmap-fixes-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA
xfs: fixes for the block mapping checker
This series amends the file extent map checking code so that nonexistent cow/attr forks get the ENOENT return they're supposed to; and fixes some incorrect logic about the presence of a cow fork vs. reflink iflag.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
* tag 'scrub-bmap-fixes-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: don't check reflink iflag state when checking cow fork xfs: simplify returns in xchk_bmap xfs: rewrite xchk_inode_is_allocated to work properly xfs: hide xfs_inode_is_allocated in scrub common code
|
#
5221002c |
| 18-Aug-2023 |
Chandan Babu R <chandan.babu@oracle.com> |
Merge tag 'repair-force-rebuild-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA
xfs: force rebuilding of metadata
This patchset adds a new IFLAG to the scrub ioctl so that userspace can force a rebuild of an otherwise consistent piece of metadata. This will eventually enable the use of online repair to relocate metadata during a filesystem reorganization (e.g. shrink). For now, it facilitates stress testing of online repair without needing the debugging knobs to be enabled.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
* tag 'repair-force-rebuild-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: allow userspace to rebuild metadata structures xfs: don't complain about unfixed metadata when repairs were injected
|
#
df783323 |
| 18-Aug-2023 |
Chandan Babu R <chandan.babu@oracle.com> |
Merge tag 'scrub-rtsummary-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA
xfs: online scrubbing of realtime summary files
This patchset implements an online checker for the realtime summary file. The first few changes are some general cleanups -- scrub should get its own references to all inodes, and we also wrap the inode lock functions so that we can standardize unlocking and releasing inodes that are the focus of a scrub.
With that out of the way, we move on to constructing a shadow copy of the rtsummary information from the rtbitmap, and compare the new copy against the ondisk copy.
This has been running on the djcloud for years with no problems. Enjoy!
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
* tag 'scrub-rtsummary-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: implement online scrubbing of rtsummary info xfs: move the realtime summary file scrubber to a separate source file xfs: wrap ilock/iunlock operations on sc->ip xfs: get our own reference to inodes that we want to scrub
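The "shadow copy" approach described above can be sketched in userspace. This is a simplified model resting on stated assumptions: a set bit means a free realtime extent, and the summary counts free runs keyed by (floor(log2(run length)), bitmap block containing the start of the run). The function names and the `bits_per_block` parameter are hypothetical, and the real scrubber works on ondisk blocks, not Python lists.

```python
from collections import Counter


def compute_rtsummary(bitmap, bits_per_block):
    """Rebuild summary counts from a free-space bitmap (1 = free extent)."""
    summary = Counter()
    start = None
    # A trailing 0 acts as a sentinel that closes a run ending at the last bit.
    for i, bit in enumerate(list(bitmap) + [0]):
        if bit and start is None:
            start = i
        elif not bit and start is not None:
            length = i - start
            key = (length.bit_length() - 1, start // bits_per_block)
            summary[key] += 1
            start = None
    return summary


def check_rtsummary(bitmap, bits_per_block, ondisk):
    """Return the summary keys where the shadow copy and the ondisk copy differ."""
    shadow = compute_rtsummary(bitmap, bits_per_block)
    keys = set(shadow) | set(ondisk)
    return sorted(k for k in keys if shadow.get(k, 0) != ondisk.get(k, 0))
```

An empty result from `check_rtsummary` means the recomputed copy agrees with the stored one; any returned key points at a counter that would be flagged as corrupt in this model.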
|
#
d668fc1f |
| 18-Aug-2023 |
Chandan Babu R <chandan.babu@oracle.com> |
Merge tag 'big-array-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA
xfs: stage repair information in pageable memory
In general, online repair of an indexed record set walks the filesystem looking for records. These records are sorted and bulk-loaded into a new btree. To make this happen without pinning gigabytes of metadata in memory, first create an abstraction ('xfile') of memfd files so that kernel code can access paged memory, and then an array abstraction ('xfarray') based on xfiles so that online repair can create an array of new records without pinning memory.
These two data storage abstractions are critical for repair of space metadata -- the memory used is pageable, which helps us avoid pinning kernel memory and driving OOM problems; and they are byte-accessible enough that we can use them like (very slow and programmatic) memory buffers.
Later patchsets will build on this functionality to provide blob storage and btrees.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
* tag 'big-array-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: improve xfarray quicksort pivot xfs: cache pages used for xfarray quicksort convergence xfs: speed up xfarray sort by sorting xfile page contents directly xfs: teach xfile to pass back direct-map pages to caller xfs: convert xfarray insertion sort to heapsort using scratchpad memory xfs: enable sorting of xfile-backed arrays xfs: create a big array data structure
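The layering described above (a file-like store underneath, a fixed-size record array on top) can be approximated in userspace. This is a minimal sketch, not the kernel's xfile/xfarray API: the class `FileBackedArray` and its method names are hypothetical, and a temporary file stands in for the memfd-backed xfile.

```python
import struct
import tempfile


class FileBackedArray:
    """Fixed-size records kept in a temporary file instead of an in-memory list."""

    def __init__(self, fmt):
        self.rec = struct.Struct(fmt)          # record layout, e.g. "<QI"
        self.file = tempfile.TemporaryFile()   # stand-in for the pageable backing store
        self.count = 0

    def append(self, *fields):
        self.file.seek(self.count * self.rec.size)
        self.file.write(self.rec.pack(*fields))
        self.count += 1

    def load(self, idx):
        if not 0 <= idx < self.count:
            raise IndexError(idx)
        self.file.seek(idx * self.rec.size)
        return self.rec.unpack(self.file.read(self.rec.size))

    def store(self, idx, *fields):
        if not 0 <= idx < self.count:
            raise IndexError(idx)
        self.file.seek(idx * self.rec.size)
        self.file.write(self.rec.pack(*fields))

    def __iter__(self):
        for i in range(self.count):
            yield self.load(i)
```

Every access goes through a seek plus a read or write, which mirrors the "very slow and programmatic memory buffer" trade-off mentioned in the description: the records need not stay resident, at the cost of per-record access overhead.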
|
#
81fbc5f9 |
| 18-Aug-2023 |
Chandan Babu R <chandan.babu@oracle.com> |
Merge tag 'repair-reap-fixes-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA
xfs: fix online repair block reaping
These patches fix a few problems that I noticed in the code that deals with old btree blocks after a successful repair.
First, I observed that it is possible for repair to incorrectly invalidate and delete old btree blocks if they were crosslinked. The solution here is to consult the reverse mappings for each block in the extent -- singly owned blocks are invalidated and freed, whereas for crosslinked blocks, we merely drop the incorrect reverse mapping.
A largeish change in this patchset is moving the reaping code to a separate file, because the code is mostly interrelated static functions. For now this also drops the ability to reap file blocks, which will return when we add the bmbt repair functions.
Second, we convert the reap function to use EFIs so that we can commit to freeing as many blocks in as few transactions as we dare. We would like to free as many old blocks as we can in the same transaction that commits the new structure to the ondisk filesystem to minimize the number of blocks that leak if the system crashes before the repair fully completes.
The third change made in this series is to avoid tripping buffer cache assertions if we're merely scanning the buffer cache for buffers to invalidate, and find a non-stale buffer of the wrong length. This is primarily cosmetic, but makes my life easier.
The fourth change restructures the reaping code to try to process as many blocks in one go as possible, to reduce logging traffic.
The last change switches the reaping mechanism to use per-AG bitmaps defined in a previous patchset. This should reduce type confusion when reading the source code.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
* tag 'repair-reap-fixes-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: use per-AG bitmaps to reap unused AG metadata blocks during repair xfs: reap large AG metadata extents when possible xfs: allow scanning ranges of the buffer cache for live buffers xfs: rearrange xrep_reap_block to make future code flow easier xfs: use deferred frees to reap old btree blocks xfs: only allow reaping of per-AG blocks in xrep_reap_extents xfs: only invalidate blocks if we're going to free them xfs: move the post-repair block reaping code to a separate file xfs: cull repair code that will never get used
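The crosslink rule from the first fix above can be expressed as a small decision function. This is a userspace sketch under simplifying assumptions: reverse mappings are modelled as a dict from block number to a set of owner labels, and the function name `reap_blocks` is hypothetical.

```python
def reap_blocks(old_blocks, owners_of, our_owner):
    """Decide the fate of each old btree block after a successful repair.

    owners_of maps a block number to the set of owners recorded in the
    reverse mappings. Singly owned blocks are freed; crosslinked blocks
    only lose the stale reverse mapping and stay allocated.
    """
    freed, unmapped = [], []
    for blk in old_blocks:
        owners = owners_of[blk]
        if owners == {our_owner}:
            freed.append(blk)           # safe to invalidate and free
        else:
            owners.discard(our_owner)   # drop only the incorrect mapping
            unmapped.append(blk)
    return freed, unmapped
```

The point of the distinction is that a crosslinked block still belongs to someone else, so freeing it would corrupt that other owner's data.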
|
Revision tags: v6.1.46, v6.1.45 |
|
#
369c001b |
| 10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: rewrite xchk_inode_is_allocated to work properly
Back in the mists of time[1], I proposed this function to assist the inode btree scrubbers in checking the inode btree contents against the allocation state of the inode records. The original version performed a direct lookup in the inode cache and returned the allocation status if the cached inode hadn't been reused and wasn't in an intermediate state. Brian thought it would be better to use the usual iget/irele mechanisms, so that was changed for the final version.
Unfortunately, this hasn't aged well -- the IGET_INCORE flag only has one user and clutters up the regular iget path, which makes it hard to reason about how it actually works. Worse yet, the inode inactivation series silently broke it because iget won't return inodes that are anywhere in the inactivation machinery, even though the caller is already required to prevent inode allocation and freeing. Inodes in the inactivation machinery are still allocated, but the current code's interactions with the iget code prevent us from being able to say that.
Now that I understand the inode lifecycle better than I did in early 2017, I now realize that as long as the cached inode hasn't been reused and isn't actively being reclaimed, it's safe to access the i_mode field (with the AGI, rcu, and i_flags locks held), and we don't need to worry about the inode being freed out from under us.
Therefore, port the original version to modern code structure, which fixes the brokenness w.r.t. inactivation. In the next patch we'll remove IGET_INCORE since it's no longer necessary.
[1] https://lore.kernel.org/linux-xfs/149643868294.23065.8094890990886436794.stgit@birch.djwong.org/
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
5c83df2e |
| 10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: allow userspace to rebuild metadata structures
Add a new (superuser-only) flag to the online metadata repair ioctl to force it to rebuild structures, even if they're not broken. We will use this to move metadata structures out of the way during a free space defragmentation operation.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
526aab5f |
| 10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: implement online scrubbing of rtsummary info
Finish the realtime summary scrubber by adding the functions we need to compute a fresh copy of the rtsummary info and comparing it to the copy on disk.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
e5b46c75 |
| 10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: speed up xfarray sort by sorting xfile page contents directly
If all the records in an xfarray subset live within the same memory page, we can short-circuit even more quicksort recursion by mapping that page into the local CPU and using the kernel's heapsort function to sort the subset. On the author's computer, this reduces the runtime by another 15% on a 500,000 element array.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev> Reviewed-by: Dave Chinner <dchinner@redhat.com>
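The precondition for this shortcut, that every record in the subset lives within one page, reduces to simple arithmetic on byte offsets. A minimal sketch follows, assuming a 4096-byte page and records packed back to back from offset zero; the function name `same_page` is hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size for illustration


def same_page(lo, hi, rec_size, page_size=PAGE_SIZE):
    """Return True if records lo..hi (inclusive) all fall within one page."""
    first_byte = lo * rec_size
    last_byte = (hi + 1) * rec_size - 1
    return first_byte // page_size == last_byte // page_size
```

When the check passes, the whole subset can be sorted in place on the mapped page; otherwise the sort has to keep partitioning until the pieces are small enough.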
|
#
137db333 |
| 10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: teach xfile to pass back direct-map pages to caller
Certain xfile array operations (such as sorting) can be sped up quite a bit by allowing xfile users to grab a page to bulk-read the records contained within it. Create helper methods to facilitate this.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
c390c645 |
| 10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: convert xfarray insertion sort to heapsort using scratchpad memory
In the previous patch, we created a very basic quicksort implementation for xfile arrays. While the use of an alternate sorting algorithm to avoid quicksort recursion on very small subsets reduces the runtime modestly, we could do better than a load and store-heavy insertion sort, particularly since each load and store requires a page mapping lookup in the xfile.
For a small increase in kernel memory requirements, we could instead bulk load the xfarray records into memory, use the kernel's existing heapsort implementation to sort the records, and bulk store the memory buffer back into the xfile. On the author's computer, this reduces the runtime by about 5% on a 500,000 element array.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev> Reviewed-by: Dave Chinner <dchinner@redhat.com>
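The bulk load, heapsort, bulk store sequence can be sketched in userspace as follows. This is an illustration only: `load` and `store` are hypothetical per-record accessors standing in for the xfile operations, and Python's `heapq` stands in for the kernel's heapsort.

```python
import heapq


def sort_small_subset(load, store, lo, hi):
    """Sort records lo..hi (inclusive) with one bulk load and one bulk store."""
    # Bulk load the subset into a scratchpad buffer.
    scratch = [load(i) for i in range(lo, hi + 1)]
    # Heapsort the scratchpad: build a heap, then pop in ascending order.
    heapq.heapify(scratch)
    ordered = [heapq.heappop(scratch) for _ in range(len(scratch))]
    # Bulk store the sorted records back to the backing array.
    for offset, rec in enumerate(ordered):
        store(lo + offset, rec)
```

Each record is loaded once and stored once, whereas an insertion sort running directly on the backing store repeats loads and stores for every shift, which is the cost the message describes.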
|
#
232ea052 |
| 10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: enable sorting of xfile-backed arrays
The btree bulk loading code requires that records be provided in the correct record sort order for the given btree type. In general, repair code cannot be required to collect records in order, and it is not feasible to insert new records in the middle of an array to maintain sort order.
Implement a sorting algorithm so that we can sort the records just prior to bulk loading. In principle, an xfarray could consume many gigabytes of memory and its backing pages can be sent out to disk at any time. This means that we cannot map the entire array into memory at once, so we must find a way to divide the work into smaller portions (e.g. a page) that /can/ be mapped into memory.
Quicksort seems like a reasonable fit for this purpose, since it uses a divide and conquer strategy to keep its average runtime at O(n log n). The solution presented here is a port of the glibc implementation, which itself is derived from the median-of-three and tail call recursion strategies outlined by Sedgewick.
Subsequent patches will optimize the implementation further by utilizing the kernel's heapsort on directly-mapped memory whenever possible, and improving the quicksort pivot selection algorithm to try to avoid O(n^2) collapses.
Note: The sorting functionality gets its own patch because the basic big array mechanisms were plenty for a single code patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev> Reviewed-by: Dave Chinner <dchinner@redhat.com>
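The strategies named above (median-of-three pivot selection, an explicit stack in place of recursion, and a simpler sort for small subsets) can be sketched on an ordinary Python list. This is a generic textbook-style illustration, not a port of the glibc or XFS code; the threshold `small` is an arbitrary assumption.

```python
def quicksort(a, small=4):
    """In-place quicksort: median-of-three pivot, explicit stack, no recursion."""
    stack = [(0, len(a) - 1)]
    while stack:
        lo, hi = stack.pop()
        while hi - lo + 1 > small:
            mid = (lo + hi) // 2
            # Median of three: order a[lo], a[mid], a[hi] so the median is at mid.
            if a[mid] < a[lo]:
                a[lo], a[mid] = a[mid], a[lo]
            if a[hi] < a[mid]:
                a[mid], a[hi] = a[hi], a[mid]
                if a[mid] < a[lo]:
                    a[lo], a[mid] = a[mid], a[lo]
            pivot = a[mid]
            # Partition so that a[lo..j] <= pivot <= a[i..hi].
            i, j = lo, hi
            while i <= j:
                while a[i] < pivot:
                    i += 1
                while a[j] > pivot:
                    j -= 1
                if i <= j:
                    a[i], a[j] = a[j], a[i]
                    i += 1
                    j -= 1
            # Defer the larger part and continue with the smaller one, which
            # keeps the explicit stack shallow.
            if j - lo < hi - i:
                stack.append((i, hi))
                hi = j
            else:
                stack.append((lo, j))
                lo = i
        # Small subset: finish with an insertion sort.
        for k in range(lo + 1, hi + 1):
            v = a[k]
            m = k - 1
            while m >= lo and a[m] > v:
                a[m + 1] = a[m]
                m -= 1
            a[m + 1] = v
```

Working on the smaller partition first bounds the number of deferred ranges to roughly log2(n), which matters when stack space is as limited as it is in kernel code.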
|
#
3934e8eb |
| 10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: create a big array data structure
Create a simple 'big array' data structure for storage of fixed-size metadata records that will be used to reconstruct a btree index. For repair operations, the most important operations are append, iterate, and sort.
Earlier implementations of the big array used linked lists and suffered from severe problems -- pinning all records in kernel memory was not a good idea and frequently led to OOM situations; random access was very inefficient; and record overhead for the lists was unacceptably high at 40-60%.
Therefore, the big memory array relies on the 'xfile' abstraction, which creates a memfd file and stores the records in page cache pages. Since the memfd is created in tmpfs, the memory pages can be pushed out to disk if necessary and we have a built-in usage limit of 50% of physical memory.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
1c7ce115 |
| 10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: reap large AG metadata extents when possible
When we're freeing extents that have been set in a bitmap, break the bitmap extent into multiple sub-extents organized by fate, and reap the extents. This enables us to dispose of old resources more efficiently than doing them block by block.
While we're at it, rename the reaping functions to make it clear that they're reaping per-AG extents.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
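Splitting one bitmap extent into sub-extents "organized by fate" amounts to grouping consecutive blocks that share the same disposition. A minimal userspace sketch, assuming a hypothetical `fate_of` callback that classifies a single block:

```python
from itertools import groupby


def split_by_fate(start, length, fate_of):
    """Split [start, start + length) into maximal runs of blocks sharing a fate.

    Returns a list of (first_block, run_length, fate) tuples in block order.
    """
    runs = []
    for fate, group in groupby(range(start, start + length), key=fate_of):
        blocks = list(group)
        runs.append((blocks[0], len(blocks), fate))
    return runs
```

Each resulting run can then be handled in one step instead of block by block, which is the efficiency gain the message refers to.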
|