History log of /openbmc/linux/fs/xfs/scrub/trace.h (Results 1 – 25 of 248)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 55e43d6a 05-Jan-2025 Andrew Jeffery <andrew@codeconstruct.com.au>

Merge tag 'v6.6.68' into for/openbmc/dev-6.6

This is the 6.6.68 stable release


Revision tags: v6.6.69, v6.6.68, v6.6.67
# 1bee32f3 18-Dec-2024 Darrick J. Wong <djwong@kernel.org>

xfs: fix file_path handling in tracepoints

commit 19ebc8f84ea12e18dd6c8d3ecaf87bcf4666eee1 upstream.

[backport: only apply fix for 3934e8ebb7cc6]

Since file_path() takes the output buffer as one o

xfs: fix file_path handling in tracepoints

commit 19ebc8f84ea12e18dd6c8d3ecaf87bcf4666eee1 upstream.

[backport: only apply fix for 3934e8ebb7cc6]

Since file_path() takes the output buffer as one of its arguments, we
might as well have it format directly into the tracepoint's char array
instead of wasting stack space.

Fixes: 3934e8ebb7cc6 ("xfs: create a big array data structure")
Fixes: 5076a6040ca16 ("xfs: support in-memory buffer cache targets")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202403290419.HPcyvqZu-lkp@intel.com/
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 16f6ccde 19-Dec-2024 Andrew Jeffery <andrew@codeconstruct.com.au>

Merge tag 'v6.6.67' into for/openbmc/dev-6.6

This is the 6.6.67 stable release


Revision tags: v6.6.66, v6.6.65, v6.6.64
# 08b1325d 02-Dec-2024 Darrick J. Wong <djwong@kernel.org>

xfs: fix scrub tracepoints when inode-rooted btrees are involved

commit ffc3ea4f3c1cc83a86b7497b0c4b0aee7de5480d upstream.

Fix a minor mistakes in the scrub tracepoints that can manifest when
inode

xfs: fix scrub tracepoints when inode-rooted btrees are involved

commit ffc3ea4f3c1cc83a86b7497b0c4b0aee7de5480d upstream.

Fix a minor mistakes in the scrub tracepoints that can manifest when
inode-rooted btrees are enabled. The existing code worked fine for bmap
btrees, but we should tighten the code up to be less sloppy.

Cc: <stable@vger.kernel.org> # v5.7
Fixes: 92219c292af8dd ("xfs: convert btree cursor inode-private member names")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v6.6.63, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58, v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25
# 46eeaa11 03-Apr-2024 Andrew Jeffery <andrew@codeconstruct.com.au>

Merge tag 'v6.6.24' into dev-6.6

This is the 6.6.24 stable release


Revision tags: v6.6.24
# 57a20b61 26-Mar-2024 Darrick J. Wong <djwong@kernel.org>

xfs: convert rt bitmap extent lengths to xfs_rtbxlen_t

commit f29c3e745dc253bf9d9d06ddc36af1a534ba1dd0 upstream.

XFS uses xfs_rtblock_t for many different uses, which makes it much more
difficult t

xfs: convert rt bitmap extent lengths to xfs_rtbxlen_t

commit f29c3e745dc253bf9d9d06ddc36af1a534ba1dd0 upstream.

XFS uses xfs_rtblock_t for many different uses, which makes it much more
difficult to perform a unit analysis on the codebase. One of these
(ab)uses is when we need to store the length of a free space extent as
stored in the realtime bitmap. Because there can be up to 2^64 realtime
extents in a filesystem, we need a new type that is larger than
xfs_rtxlen_t for callers that are querying the bitmap directly. This
means scrub and growfs.

Create this type as "xfs_rtbxlen_t" and use it to store 64-bit rtx
lengths. 'b' stands for 'bitmap' or 'big'; reader's choice.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3
# c900529f 12-Sep-2023 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-fixes into drm-misc-fixes

Forwarding to v6.6-rc1.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


Revision tags: v6.5.2, v6.1.51, v6.5.1
# 1ac731c5 30-Aug-2023 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 6.6 merge window.


# 53ea7f62 30-Aug-2023 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'xfs-6.6-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Chandan Babu:

- Chandan Babu will be taking over as the XFS release manager. He has
reviewed a

Merge tag 'xfs-6.6-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Chandan Babu:

- Chandan Babu will be taking over as the XFS release manager. He has
reviewed all the patches that are in this branch, though I'm signing
the branch one last time since I'm still technically maintainer. :P

- Create a maintainer entry profile for XFS in which we lay out the
various roles that I have played for many years. Aside from release
manager, the remaining roles are as yet unfilled.

- Start merging online repair -- we now have in-memory pageable memory
for staging btrees, a bunch of pending fixes, and we've started the
process of refactoring the scrub support code to support more of
repair. In particular, reaping of old blocks from damaged structures.

- Scrub the realtime summary file.

- Fix a bug where scrub's quota iteration only ever returned the root
dquot. Oooops.

- Fix some typos.

[ Pull request from Chandan Babu, but signed tag and description from
Darrick Wong, thus the first person singular above is Darrick, not
Chandan ]

* tag 'xfs-6.6-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (37 commits)
fs/xfs: Fix typos in comments
xfs: fix dqiterate thinko
xfs: don't check reflink iflag state when checking cow fork
xfs: simplify returns in xchk_bmap
xfs: rewrite xchk_inode_is_allocated to work properly
xfs: hide xfs_inode_is_allocated in scrub common code
xfs: fix agf_fllast when repairing an empty AGFL
xfs: allow userspace to rebuild metadata structures
xfs: clear pagf_agflreset when repairing the AGFL
xfs: allow the user to cancel repairs before we start writing
xfs: don't complain about unfixed metadata when repairs were injected
xfs: implement online scrubbing of rtsummary info
xfs: always rescan allegedly healthy per-ag metadata after repair
xfs: move the realtime summary file scrubber to a separate source file
xfs: wrap ilock/iunlock operations on sc->ip
xfs: get our own reference to inodes that we want to scrub
xfs: track usage statistics of online fsck
xfs: improve xfarray quicksort pivot
xfs: create scaffolding for creating debugfs entries
xfs: cache pages used for xfarray quicksort convergence
...

show more ...


Revision tags: v6.1.50
# 511fb5ba 28-Aug-2023 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'v6.6-vfs.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull superblock updates from Christian Brauner:
"This contains the super rework that was ready for this cycle. Th

Merge tag 'v6.6-vfs.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull superblock updates from Christian Brauner:
"This contains the super rework that was ready for this cycle. The
first part changes the order of how we open block devices and allocate
superblocks, contains various cleanups, simplifications, and a new
mechanism to wait on superblock state changes.

This unblocks work to ultimately limit the number of writers to a
block device. Jan has already scheduled follow-up work that will be
ready for v6.7 and allows us to restrict the number of writers to a
given block device. That series builds on this work right here.

The second part contains filesystem freezing updates.

Overview:

The generic superblock changes are rougly organized as follows
(ignoring additional minor cleanups):

(1) Removal of the bd_super member from struct block_device.

This was a very odd back pointer to struct super_block with
unclear rules. For all relevant places we have other means to get
the same information so just get rid of this.

(2) Simplify rules for superblock cleanup.

Roughly, everything that is allocated during fs_context
initialization and that's stored in fs_context->s_fs_info needs
to be cleaned up by the fs_context->free() implementation before
the superblock allocation function has been called successfully.

After sget_fc() returned fs_context->s_fs_info has been
transferred to sb->s_fs_info at which point sb->kill_sb() if
fully responsible for cleanup. Adhering to these rules means that
cleanup of sb->s_fs_info in fill_super() is to be avoided as it's
brittle and inconsistent.

Cleanup shouldn't be duplicated between sb->put_super() as
sb->put_super() is only called if sb->s_root has been set aka
when the filesystem has been successfully born (SB_BORN). That
complexity should be avoided.

This also means that block devices are to be closed in
sb->kill_sb() instead of sb->put_super(). More details in the
lower section.

(3) Make it possible to lookup or create a superblock before opening
block devices

There's a subtle dependency on (2) as some filesystems did rely
on fill_super() to be called in order to correctly clean up
sb->s_fs_info. All these filesystems have been fixed.

(4) Switch most filesystem to follow the same logic as the generic
mount code now does as outlined in (3).

(5) Use the superblock as the holder of the block device. We can now
easily go back from block device to owning superblock.

(6) Export and extend the generic fs_holder_ops and use them as
holder ops everywhere and remove the filesystem specific holder
ops.

(7) Call from the block layer up into the filesystem layer when the
block device is removed, allowing to shut down the filesystem
without risk of deadlocks.

(8) Get rid of get_super().

We can now easily go back from the block device to owning
superblock and can call up from the block layer into the
filesystem layer when the device is removed. So no need to wade
through all registered superblock to find the owning superblock
anymore"

Link: https://lore.kernel.org/lkml/20230824-prall-intakt-95dbffdee4a0@brauner/

* tag 'v6.6-vfs.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (47 commits)
super: use higher-level helper for {freeze,thaw}
super: wait until we passed kill super
super: wait for nascent superblocks
super: make locking naming consistent
super: use locking helpers
fs: simplify invalidate_inodes
fs: remove get_super
block: call into the file system for ioctl BLKFLSBUF
block: call into the file system for bdev_mark_dead
block: consolidate __invalidate_device and fsync_bdev
block: drop the "busy inodes on changed media" log message
dasd: also call __invalidate_device when setting the device offline
amiflop: don't call fsync_bdev in FDFMTBEG
floppy: call disk_force_media_change when changing the format
block: simplify the disk_force_media_change interface
nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl
xfs use fs_holder_ops for the log and RT devices
xfs: drop s_umount over opening the log and RT devices
ext4: use fs_holder_ops for the log device
ext4: drop s_umount over opening the log device
...

show more ...


Revision tags: v6.5, v6.1.49, v6.1.48
# cd4284cf 23-Aug-2023 Christian Brauner <brauner@kernel.org>

Merge tag 'vfs-6.6-merge-3' of ssh://gitolite.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs online fsck update from Darrick Wong:

New code for 6.6:

* Allow the kernel to initiate a freeze of a fil

Merge tag 'vfs-6.6-merge-3' of ssh://gitolite.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs online fsck update from Darrick Wong:

New code for 6.6:

* Allow the kernel to initiate a freeze of a filesystem. The kernel
and userspace can both hold a freeze on a filesystem at the same
time; the freeze is not lifted until /both/ holders lift it. This
will enable us to fix a longstanding bug in XFS online fsck.
* Use kernel-initated fsfreeze to fix some longstanding false negatives
in online fsck of the free space and inode counters.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Message-Id: <20230822182604.GB11286@frogsfrogsfrogs>
Signed-off-by: Christian Brauner <brauner@kernel.org>

show more ...


# 220c8d57 18-Aug-2023 Chandan Babu R <chandan.babu@oracle.com>

Merge tag 'scrub-bmap-fixes-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA

xfs: fixes for the block mapping checker

This series amends the f

Merge tag 'scrub-bmap-fixes-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA

xfs: fixes for the block mapping checker

This series amends the file extent map checking code so that nonexistent
cow/attr forks get the ENOENT return they're supposed to; and fixes some
incorrect logic about the presence of a cow fork vs. reflink iflag.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>

* tag 'scrub-bmap-fixes-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
xfs: don't check reflink iflag state when checking cow fork
xfs: simplify returns in xchk_bmap
xfs: rewrite xchk_inode_is_allocated to work properly
xfs: hide xfs_inode_is_allocated in scrub common code

show more ...


# 5221002c 18-Aug-2023 Chandan Babu R <chandan.babu@oracle.com>

Merge tag 'repair-force-rebuild-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA

xfs: force rebuilding of metadata

This patchset adds a new IF

Merge tag 'repair-force-rebuild-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA

xfs: force rebuilding of metadata

This patchset adds a new IFLAG to the scrub ioctl so that userspace can
force a rebuild of an otherwise consistent piece of metadata. This will
eventually enable the use of online repair to relocate metadata during a
filesystem reorganization (e.g. shrink). For now, it facilitates stress
testing of online repair without needing the debugging knobs to be
enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>

* tag 'repair-force-rebuild-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
xfs: allow userspace to rebuild metadata structures
xfs: don't complain about unfixed metadata when repairs were injected

show more ...


# df783323 18-Aug-2023 Chandan Babu R <chandan.babu@oracle.com>

Merge tag 'scrub-rtsummary-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA

xfs: online scrubbing of realtime summary files

This patchset impl

Merge tag 'scrub-rtsummary-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA

xfs: online scrubbing of realtime summary files

This patchset implements an online checker for the realtime summary
file. The first few changes are some general cleanups -- scrub should
get its own references to all inodes, and we also wrap the inode lock
functions so that we can standardize unlocking and releasing inodes that
are the focus of a scrub.

With that out of the way, we move on to constructing a shadow copy of
the rtsummary information from the rtbitmap, and compare the new copy
against the ondisk copy.

This has been running on the djcloud for years with no problems. Enjoy!

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>

* tag 'scrub-rtsummary-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
xfs: implement online scrubbing of rtsummary info
xfs: move the realtime summary file scrubber to a separate source file
xfs: wrap ilock/iunlock operations on sc->ip
xfs: get our own reference to inodes that we want to scrub

show more ...


# d668fc1f 18-Aug-2023 Chandan Babu R <chandan.babu@oracle.com>

Merge tag 'big-array-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA

xfs: stage repair information in pageable memory

In general, online repa

Merge tag 'big-array-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA

xfs: stage repair information in pageable memory

In general, online repair of an indexed record set walks the filesystem
looking for records. These records are sorted and bulk-loaded into a
new btree. To make this happen without pinning gigabytes of metadata in
memory, first create an abstraction ('xfile') of memfd files so that
kernel code can access paged memory, and then an array abstraction
('xfarray') based on xfiles so that online repair can create an array of
new records without pinning memory.

These two data storage abstractions are critical for repair of space
metadata -- the memory used is pageable, which helps us avoid pinning
kernel memory and driving OOM problems; and they are byte-accessible
enough that we can use them like (very slow and programmatic) memory
buffers.

Later patchsets will build on this functionality to provide blob storage
and btrees.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>

* tag 'big-array-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
xfs: improve xfarray quicksort pivot
xfs: cache pages used for xfarray quicksort convergence
xfs: speed up xfarray sort by sorting xfile page contents directly
xfs: teach xfile to pass back direct-map pages to caller
xfs: convert xfarray insertion sort to heapsort using scratchpad memory
xfs: enable sorting of xfile-backed arrays
xfs: create a big array data structure

show more ...


# 81fbc5f9 18-Aug-2023 Chandan Babu R <chandan.babu@oracle.com>

Merge tag 'repair-reap-fixes-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA

xfs: fix online repair block reaping

These patches fix a few pro

Merge tag 'repair-reap-fixes-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-mergeA

xfs: fix online repair block reaping

These patches fix a few problems that I noticed in the code that deals
with old btree blocks after a successful repair.

First, I observed that it is possible for repair to incorrectly
invalidate and delete old btree blocks if they were crosslinked. The
solution here is to consult the reverse mappings for each block in the
extent -- singly owned blocks are invalidated and freed, whereas for
crosslinked blocks, we merely drop the incorrect reverse mapping.

A largeish change in this patchset is moving the reaping code to a
separate file, because the code are mostly interrelated static
functions. For now this also drops the ability to reap file blocks,
which will return when we add the bmbt repair functions.

Second, we convert the reap function to use EFIs so that we can commit
to freeing as many blocks in as few transactions as we dare. We would
like to free as many old blocks as we can in the same transaction that
commits the new structure to the ondisk filesystem to minimize the
number of blocks that leak if the system crashes before the repair fully
completes.

The third change made in this series is to avoid tripping buffer cache
assertions if we're merely scanning the buffer cache for buffers to
invalidate, and find a non-stale buffer of the wrong length. This is
primarily cosmetic, but makes my life easier.

The fourth change restructures the reaping code to try to process as many
blocks in one go as possible, to reduce logging traffic.

The last change switches the reaping mechanism to use per-AG bitmaps
defined in a previous patchset. This should reduce type confusion when
reading the source code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>

* tag 'repair-reap-fixes-6.6_2023-08-10' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
xfs: use per-AG bitmaps to reap unused AG metadata blocks during repair
xfs: reap large AG metadata extents when possible
xfs: allow scanning ranges of the buffer cache for live buffers
xfs: rearrange xrep_reap_block to make future code flow easier
xfs: use deferred frees to reap old btree blocks
xfs: only allow reaping of per-AG blocks in xrep_reap_extents
xfs: only invalidate blocks if we're going to free them
xfs: move the post-repair block reaping code to a separate file
xfs: cull repair code that will never get used

show more ...


Revision tags: v6.1.46, v6.1.45
# 369c001b 10-Aug-2023 Darrick J. Wong <djwong@kernel.org>

xfs: rewrite xchk_inode_is_allocated to work properly

Back in the mists of time[1], I proposed this function to assist the
inode btree scrubbers in checking the inode btree contents against the
allo

xfs: rewrite xchk_inode_is_allocated to work properly

Back in the mists of time[1], I proposed this function to assist the
inode btree scrubbers in checking the inode btree contents against the
allocation state of the inode records. The original version performed a
direct lookup in the inode cache and returned the allocation status if
the cached inode hadn't been reused and wasn't in an intermediate state.
Brian thought it would be better to use the usual iget/irele mechanisms,
so that was changed for the final version.

Unfortunately, this hasn't aged well -- the IGET_INCORE flag only has
one user and clutters up the regular iget path, which makes it hard to
reason about how it actually works. Worse yet, the inode inactivation
series silently broke it because iget won't return inodes that are
anywhere in the inactivation machinery, even though the caller is
already required to prevent inode allocation and freeing. Inodes in the
inactivation machinery are still allocated, but the current code's
interactions with the iget code prevent us from being able to say that.

Now that I understand the inode lifecycle better than I did in early
2017, I now realize that as long as the cached inode hasn't been reused
and isn't actively being reclaimed, it's safe to access the i_mode field
(with the AGI, rcu, and i_flags locks held), and we don't need to worry
about the inode being freed out from under us.

Therefore, port the original version to modern code structure, which
fixes the brokennes w.r.t. inactivation. In the next patch we'll remove
IGET_INCORE since it's no longer necessary.

[1] https://lore.kernel.org/linux-xfs/149643868294.23065.8094890990886436794.stgit@birch.djwong.org/

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

show more ...


# 5c83df2e 10-Aug-2023 Darrick J. Wong <djwong@kernel.org>

xfs: allow userspace to rebuild metadata structures

Add a new (superuser-only) flag to the online metadata repair ioctl to
force it to rebuild structures, even if they're not broken. We will use
th

xfs: allow userspace to rebuild metadata structures

Add a new (superuser-only) flag to the online metadata repair ioctl to
force it to rebuild structures, even if they're not broken. We will use
this to move metadata structures out of the way during a free space
defragmentation operation.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

show more ...


# 526aab5f 10-Aug-2023 Darrick J. Wong <djwong@kernel.org>

xfs: implement online scrubbing of rtsummary info

Finish the realtime summary scrubber by adding the functions we need to
compute a fresh copy of the rtsummary info and comparing it to the copy
on d

xfs: implement online scrubbing of rtsummary info

Finish the realtime summary scrubber by adding the functions we need to
compute a fresh copy of the rtsummary info and comparing it to the copy
on disk.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

show more ...


# e5b46c75 10-Aug-2023 Darrick J. Wong <djwong@kernel.org>

xfs: speed up xfarray sort by sorting xfile page contents directly

If all the records in an xfarray subset live within the same memory
page, we can short-circuit even more quicksort recursion by map

xfs: speed up xfarray sort by sorting xfile page contents directly

If all the records in an xfarray subset live within the same memory
page, we can short-circuit even more quicksort recursion by mapping that
page into the local CPU and using the kernel's heapsort function to sort
the subset. On the author's computer, this reduces the runtime by
another 15% on a 500,000 element array.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

show more ...


# 137db333 10-Aug-2023 Darrick J. Wong <djwong@kernel.org>

xfs: teach xfile to pass back direct-map pages to caller

Certain xfile array operations (such as sorting) can be sped up quite a
bit by allowing xfile users to grab a page to bulk-read the records
c

xfs: teach xfile to pass back direct-map pages to caller

Certain xfile array operations (such as sorting) can be sped up quite a
bit by allowing xfile users to grab a page to bulk-read the records
contained within it. Create helper methods to facilitate this.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

show more ...


# c390c645 10-Aug-2023 Darrick J. Wong <djwong@kernel.org>

xfs: convert xfarray insertion sort to heapsort using scratchpad memory

In the previous patch, we created a very basic quicksort implementation
for xfile arrays. While the use of an alternate sorti

xfs: convert xfarray insertion sort to heapsort using scratchpad memory

In the previous patch, we created a very basic quicksort implementation
for xfile arrays. While the use of an alternate sorting algorithm to
avoid quicksort recursion on very small subsets reduces the runtime
modestly, we could do better than a load and store-heavy insertion sort,
particularly since each load and store requires a page mapping lookup in
the xfile.

For a small increase in kernel memory requirements, we could instead
bulk load the xfarray records into memory, use the kernel's existing
heapsort implementation to sort the records, and bulk store the memory
buffer back into the xfile. On the author's computer, this reduces the
runtime by about 5% on a 500,000 element array.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

show more ...


# 232ea052 10-Aug-2023 Darrick J. Wong <djwong@kernel.org>

xfs: enable sorting of xfile-backed arrays

The btree bulk loading code requires that records be provided in the
correct record sort order for the given btree type. In general, repair
code cannot be

xfs: enable sorting of xfile-backed arrays

The btree bulk loading code requires that records be provided in the
correct record sort order for the given btree type. In general, repair
code cannot be required to collect records in order, and it is not
feasible to insert new records in the middle of an array to maintain
sort order.

Implement a sorting algorithm so that we can sort the records just prior
to bulk loading. In principle, an xfarray could consume many gigabytes
of memory and its backing pages can be sent out to disk at any time.
This means that we cannot map the entire array into memory at once, so
we must find a way to divide the work into smaller portions (e.g. a
page) that /can/ be mapped into memory.

Quicksort seems like a reasonable fit for this purpose, since it uses a
divide and conquer strategy to keep its average runtime logarithmic.
The solution presented here is a port of the glibc implementation, which
itself is derived from the median-of-three and tail call recursion
strategies outlined by Sedgwick.

Subsequent patches will optimize the implementation further by utilizing
the kernel's heapsort on directly-mapped memory whenever possible, and
improving the quicksort pivot selection algorithm to try to avoid O(n^2)
collapses.

Note: The sorting functionality gets its own patch because the basic big
array mechanisms were plenty for a single code patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

show more ...


# 3934e8eb 10-Aug-2023 Darrick J. Wong <djwong@kernel.org>

xfs: create a big array data structure

Create a simple 'big array' data structure for storage of fixed-size
metadata records that will be used to reconstruct a btree index. For
repair operations, t

xfs: create a big array data structure

Create a simple 'big array' data structure for storage of fixed-size
metadata records that will be used to reconstruct a btree index. For
repair operations, the most important operations are append, iterate,
and sort.

Earlier implementations of the big array used linked lists and suffered
from severe problems -- pinning all records in kernel memory was not a
good idea and frequently lead to OOM situations; random access was very
inefficient; and record overhead for the lists was unacceptably high at
40-60%.

Therefore, the big memory array relies on the 'xfile' abstraction, which
creates a memfd file and stores the records in page cache pages. Since
the memfd is created in tmpfs, the memory pages can be pushed out to
disk if necessary and we have a built-in usage limit of 50% of physical
memory.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

show more ...


# 1c7ce115 10-Aug-2023 Darrick J. Wong <djwong@kernel.org>

xfs: reap large AG metadata extents when possible

When we're freeing extents that have been set in a bitmap, break the
bitmap extent into multiple sub-extents organized by fate, and reap the
extents

xfs: reap large AG metadata extents when possible

When we're freeing extents that have been set in a bitmap, break the
bitmap extent into multiple sub-extents organized by fate, and reap the
extents. This enables us to dispose of old resources more efficiently
than doing them block by block.

While we're at it, rename the reaping functions to make it clear that
they're reaping per-AG extents.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

show more ...


12345678910