#
27c1936a |
| 12-Apr-2021 |
Miklos Szeredi <mszeredi@redhat.com> |
ovl: allow upperdir inside lowerdir
commit 708fa01597fa002599756bf56a96d0de1677375c upstream.
Commit 146d62e5a586 ("ovl: detect overlapping layers") made sure we don't have overlapping layers, but
ovl: allow upperdir inside lowerdir
commit 708fa01597fa002599756bf56a96d0de1677375c upstream.
Commit 146d62e5a586 ("ovl: detect overlapping layers") made sure we don't have overlapping layers, but it also broke the arguably valid use case of
mount -olowerdir=/,upperdir=/subdir,..
where upperdir overlaps lowerdir on the same filesystem. This has been causing regressions.
Revert the check, but only for the specific case where upperdir and/or workdir are subdirectories of lowerdir. Any other overlap (e.g. lowerdir is subdirectory of upperdir, etc) case is crazy, so leave the check in place for those.
Overlaps are detected at lookup time too, so reverting the mount time check should be safe.
Fixes: 146d62e5a586 ("ovl: detect overlapping layers") Cc: <stable@vger.kernel.org> # v5.2 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
e5c376c4 |
| 12-Nov-2020 |
Miklos Szeredi <mszeredi@redhat.com> |
ovl: expand warning in ovl_d_real()
commit cef4cbff06fbc3be54d6d79ee139edecc2ee8598 upstream.
There was a syzbot report with this warning but insufficient information...
Signed-off-by: Miklos Szer
ovl: expand warning in ovl_d_real()
commit cef4cbff06fbc3be54d6d79ee139edecc2ee8598 upstream.
There was a syzbot report with this warning but insufficient information...
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
8ccf963c |
| 07-Jan-2021 |
Sargun Dhillon <sargun@sargun.me> |
ovl: implement volatile-specific fsync error behaviour
commit 335d3fc57941e5c6164c69d439aec1cb7a800876 upstream.
Overlayfs's volatile option allows the user to bypass all forced sync calls to the u
ovl: implement volatile-specific fsync error behaviour
commit 335d3fc57941e5c6164c69d439aec1cb7a800876 upstream.
Overlayfs's volatile option allows the user to bypass all forced sync calls to the upperdir filesystem. This comes at the cost of safety. We can never ensure that the user's data is intact, but we can make a best effort to expose whether or not the data is likely to be in a bad state.
The best way to handle this in the time being is that if an overlayfs's upperdir experiences an error after a volatile mount occurs, that error will be returned on fsync, fdatasync, sync, and syncfs. This is contradictory to the traditional behaviour of VFS which fails the call once, and only raises an error if a subsequent fsync error has occurred, and been raised by the filesystem.
One awkward aspect of the patch is that we have to manually set the superblock's errseq_t after the sync_fs callback as opposed to just returning an error from syncfs. This is because the call chain looks something like this:
sys_syncfs -> sync_filesystem -> __sync_filesystem -> /* The return value is ignored here sb->s_op->sync_fs(sb) _sync_blockdev /* Where the VFS fetches the error to raise to userspace */ errseq_check_and_advance
Because of this we call errseq_set every time the sync_fs callback occurs. Due to the nature of this seen / unseen dichotomy, if the upperdir is an inconsistent state at the initial mount time, overlayfs will refuse to mount, as overlayfs cannot get a snapshot of the upperdir's errseq that will increment on error until the user calls syncfs.
Signed-off-by: Sargun Dhillon <sargun@sargun.me> Suggested-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Fixes: c86243b090bc ("ovl: provide a mount option "volatile"") Cc: stable@vger.kernel.org Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62 |
|
#
610afc0b |
| 02-Sep-2020 |
Miklos Szeredi <mszeredi@redhat.com> |
ovl: pass ovl_fs down to functions accessing private xattrs
This paves the way for optionally using the "user.overlay." xattr namespace.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
#
26150ab5 |
| 02-Sep-2020 |
Miklos Szeredi <mszeredi@redhat.com> |
ovl: drop flags argument from ovl_do_setxattr()
All callers pass zero flags to ovl_do_setxattr(). So drop this argument.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
#
71097047 |
| 02-Sep-2020 |
Miklos Szeredi <mszeredi@redhat.com> |
ovl: adhere to the vfs_ vs. ovl_do_ conventions for xattrs
Call ovl_do_*xattr() when accessing an overlay private xattr, vfs_*xattr() otherwise.
This has an effect on debug output, which is made mo
ovl: adhere to the vfs_ vs. ovl_do_ conventions for xattrs
Call ovl_do_*xattr() when accessing an overlay private xattr, vfs_*xattr() otherwise.
This has an effect on debug output, which is made more consistent by this patch.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
#
c86243b0 |
| 31-Aug-2020 |
Vivek Goyal <vgoyal@redhat.com> |
ovl: provide a mount option "volatile"
Container folks are complaining that dnf/yum issues too many sync while installing packages and this slows down the image build. Build requirement is such that
ovl: provide a mount option "volatile"
Container folks are complaining that dnf/yum issues too many sync while installing packages and this slows down the image build. Build requirement is such that they don't care if a node goes down while build was still going on. In that case, they will simply throw away unfinished layer and start new build. So they don't care about syncing intermediate state to the disk and hence don't want to pay the price associated with sync.
So they are asking for mount options where they can disable sync on overlay mount point.
They primarily seem to have two use cases.
- For building images, they will mount overlay with nosync and then sync upper layer after unmounting overlay and reuse upper as lower for next layer.
- For running containers, they don't seem to care about syncing upper layer because if node goes down, they will simply throw away upper layer and create a fresh one.
So this patch provides a mount option "volatile" which disables all forms of sync. Now it is caller's responsibility to throw away upper if system crashes or shuts down and start fresh.
With "volatile", I am seeing roughly 20% speed up in my VM where I am just installing emacs in an image. Installation time drops from 31 seconds to 25 seconds when nosync option is used. This is for the case of building on top of an image where all packages are already cached. That way I take out the network operations latency out of the measurement.
Giuseppe is also looking to cut down on number of iops done on the disk. He is complaining that often in cloud their VMs are throttled if they cross the limit. This option can help them where they reduce number of iops (by cutting down on frequent sync and writebacks).
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
#
235ce9ed |
| 30-Aug-2020 |
Amir Goldstein <amir73il@gmail.com> |
ovl: check for incompatible features in work dir
An incompatible feature is marked by a non-empty directory nested 2 levels deep under "work" dir, e.g.: workdir/work/incompat/volatile.
This commit
ovl: check for incompatible features in work dir
An incompatible feature is marked by a non-empty directory nested 2 levels deep under "work" dir, e.g.: workdir/work/incompat/volatile.
This commit checks for marked incompat features, warns about them and fails to mount the overlay, for example: overlayfs: overlay with incompat feature 'volatile' cannot be mounted
Very old kernels (i.e. v3.18) will fail to remove a non-empty "work" dir and fail the mount. Newer kernels will fail to remove a "work" dir with entries nested 3 levels and fall back to read-only mount.
User mounting with old kernel will see a warning like these in dmesg: overlayfs: cleanup of 'incompat/...' failed (-39) overlayfs: cleanup of 'work/incompat' failed (-39) overlayfs: cleanup of 'ovl-work/work' failed (-39) overlayfs: failed to create directory /vdf/ovl-work/work (errno: 17); mounting read-only
These warnings should give the hint to the user that: 1. mount failure is caused by backward incompatible features 2. mount failure can be resolved by manually removing the "work" directory
There is nothing preventing users on old kernels from manually removing workdir entirely or mounting overlay with a new workdir, so this is in no way a full proof backward compatibility enforcement, but only a best effort.
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
Revision tags: v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9 |
|
#
f0e1266e |
| 13-Jul-2020 |
Amir Goldstein <amir73il@gmail.com> |
ovl: fix mount option checks for nfs_export with no upperdir
Without upperdir mount option, there is no index dir and the dependency checks nfs_export => index for mount options parsing are incorrec
ovl: fix mount option checks for nfs_export with no upperdir
Without upperdir mount option, there is no index dir and the dependency checks nfs_export => index for mount options parsing are incorrect.
Allow the combination nfs_export=on,index=off with no upperdir and move the check for dependency redirect_dir=nofollow for non-upper mount case to mount options parsing.
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
#
470c1563 |
| 13-Jul-2020 |
Amir Goldstein <amir73il@gmail.com> |
ovl: force read-only sb on failure to create index dir
With index feature enabled, on failure to create index dir, overlay is being mounted read-only. However, we do not forbid user to remount over
ovl: force read-only sb on failure to create index dir
With index feature enabled, on failure to create index dir, overlay is being mounted read-only. However, we do not forbid user to remount overlay read-write. Fix that by setting ofs->workdir to NULL, which prevents remount read-write.
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
Revision tags: v5.7.8, v5.4.51 |
|
#
a888db31 |
| 08-Jul-2020 |
Amir Goldstein <amir73il@gmail.com> |
ovl: fix regression with re-formatted lower squashfs
Commit 9df085f3c9a2 ("ovl: relax requirement for non null uuid of lower fs") relaxed the requirement for non null uuid with single lower layer to
ovl: fix regression with re-formatted lower squashfs
Commit 9df085f3c9a2 ("ovl: relax requirement for non null uuid of lower fs") relaxed the requirement for non null uuid with single lower layer to allow enabling index and nfs_export features with single lower squashfs.
Fabian reported a regression in a setup when overlay re-uses an existing upper layer and re-formats the lower squashfs image. Because squashfs has no uuid, the origin xattr in upper layer are decoded from the new lower layer where they may resolve to a wrong origin file and user may get an ESTALE or EIO error on lookup.
To avoid the reported regression while still allowing the new features with single lower squashfs, do not allow decoding origin with lower null uuid unless user opted-in to one of the new features that require following the lower inode of non-dir upper (index, xino, metacopy).
Reported-by: Fabian <godi.beat@gmx.net> Link: https://lore.kernel.org/linux-unionfs/32532923.JtPX5UtSzP@fgdesktop/ Fixes: 9df085f3c9a2 ("ovl: relax requirement for non null uuid of lower fs") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
Revision tags: v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48 |
|
#
20396365 |
| 21-Jun-2020 |
Amir Goldstein <amir73il@gmail.com> |
ovl: fix oops in ovl_indexdir_cleanup() with nfs_export=on
Mounting with nfs_export=on, xfstests overlay/031 triggers a kernel panic since v5.8-rc1 overlayfs updates.
overlayfs: orphan index entry
ovl: fix oops in ovl_indexdir_cleanup() with nfs_export=on
Mounting with nfs_export=on, xfstests overlay/031 triggers a kernel panic since v5.8-rc1 overlayfs updates.
overlayfs: orphan index entry (index/00fb1..., ftype=4000, nlink=2) BUG: kernel NULL pointer dereference, address: 0000000000000030 RIP: 0010:ovl_cleanup_and_whiteout+0x28/0x220 [overlay]
Bisect point at commit c21c839b8448 ("ovl: whiteout inode sharing")
Minimal reproducer: -------------------------------------------------- rm -rf l u w m mkdir -p l u w m mkdir -p l/testdir touch l/testdir/testfile mount -t overlay -o lowerdir=l,upperdir=u,workdir=w,nfs_export=on overlay m echo 1 > m/testdir/testfile umount m rm -rf u/testdir mount -t overlay -o lowerdir=l,upperdir=u,workdir=w,nfs_export=on overlay m umount m --------------------------------------------------
When mount with nfs_export=on, and fail to verify an orphan index, we're cleaning this index from indexdir by calling ovl_cleanup_and_whiteout(). This dereferences ofs->workdir, that was earlier set to NULL.
The design was that ovl->workdir will point at ovl->indexdir, but we are assigning ofs->indexdir to ofs->workdir only after ovl_indexdir_cleanup(). There is no reason not to do it sooner, because once we get success from ofs->indexdir = ovl_workdir_create(... there is no turning back.
Reported-and-tested-by: Murphy Zhou <jencce.kernel@gmail.com> Fixes: c21c839b8448 ("ovl: whiteout inode sharing") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
Revision tags: v5.7.4, v5.7.3, v5.4.47 |
|
#
24f14009 |
| 16-Jun-2020 |
youngjun <her0gyugyu@gmail.com> |
ovl: inode reference leak in ovl_is_inuse true case.
When "ovl_is_inuse" true case, trap inode reference not put. plus adding the comment explaining sequence of ovl_is_inuse after ovl_setup_trap.
ovl: inode reference leak in ovl_is_inuse true case.
When "ovl_is_inuse" true case, trap inode reference not put. plus adding the comment explaining sequence of ovl_is_inuse after ovl_setup_trap.
Fixes: 0be0bfd2de9d ("ovl: fix regression caused by overlapping layers detection") Cc: <stable@vger.kernel.org> # v4.19+ Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: youngjun <her0gyugyu@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
Revision tags: v5.4.46, v5.7.2, v5.4.45, v5.7.1 |
|
#
2068cf7d |
| 07-Jun-2020 |
youngjun <her0gyugyu@gmail.com> |
ovl: remove unnecessary lock check
Directory is always locked until "out_unlock" label. So lock check is not needed.
Signed-off-by: youngjun <her0gyugyu@gmail.com> Signed-off-by: Miklos Szeredi <m
ovl: remove unnecessary lock check
Directory is always locked until "out_unlock" label. So lock check is not needed.
Signed-off-by: youngjun <her0gyugyu@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
#
df820f8d |
| 04-Jun-2020 |
Miklos Szeredi <mszeredi@redhat.com> |
ovl: make private mounts longterm
Overlayfs is using clone_private_mount() to create internal mounts for underlying layers. These are used for operations requiring a path, such as dentry_open().
S
ovl: make private mounts longterm
Overlayfs is using clone_private_mount() to create internal mounts for underlying layers. These are used for operations requiring a path, such as dentry_open().
Since these private mounts are not in any namespace they are treated as short term, "detached" mounts and mntput() involves taking the global mount_lock, which can result in serious cacheline pingpong.
Make these private mounts longterm instead, which trade the penalty on mntput() for a slightly longer shutdown time due to an added RCU grace period when putting these mounts.
Introduce a new helper kern_unmount_many() that can take care of multiple longterm mounts with a single RCU grace period.
Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
#
b8e42a65 |
| 04-Jun-2020 |
Miklos Szeredi <mszeredi@redhat.com> |
ovl: get rid of redundant members in struct ovl_fs
ofs->upper_mnt is copied to ->layers[0].mnt and ->layers[0].trap could be used instead of a separate ->upperdir_trap.
Split the lowerdir option ea
ovl: get rid of redundant members in struct ovl_fs
ofs->upper_mnt is copied to ->layers[0].mnt and ->layers[0].trap could be used instead of a separate ->upperdir_trap.
Split the lowerdir option early to get the number of layers, then allocate the ->layers array, and finally fill the upper and lower layers, as before.
Get rid of path_put_init() in ovl_lower_dir(), since the only caller will take care of that.
[Colin Ian King] Fix null pointer dereference on null stack pointer on error return found by Coverity.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
#
08f4c7c8 |
| 04-Jun-2020 |
Miklos Szeredi <mszeredi@redhat.com> |
ovl: add accessor for ofs->upper_mnt
Next patch will remove ofs->upper_mnt, so add an accessor function for this field.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
Revision tags: v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35 |
|
#
399c109d |
| 21-Apr-2020 |
Chengguang Xu <cgxu519@mykernel.net> |
ovl: sync dirty data when remounting to ro mode
sync_filesystem() does not sync dirty data for readonly filesystem during umount, so before changing to readonly filesystem we should sync dirty data
ovl: sync dirty data when remounting to ro mode
sync_filesystem() does not sync dirty data for readonly filesystem during umount, so before changing to readonly filesystem we should sync dirty data for data integrity.
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
#
c21c839b |
| 23-Apr-2020 |
Chengguang Xu <cgxu519@mykernel.net> |
ovl: whiteout inode sharing
Share inode with different whiteout files for saving inode and speeding up delete operation.
If EMLINK is encountered when linking a shared whiteout, create a new one. I
ovl: whiteout inode sharing
Share inode with different whiteout files for saving inode and speeding up delete operation.
If EMLINK is encountered when linking a shared whiteout, create a new one. In case of any other error, disable sharing for this super block.
Note: ofs->whiteout is protected by inode lock on workdir.
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
#
654255fa |
| 23-Apr-2020 |
Jeffle Xu <jefflexu@linux.alibaba.com> |
ovl: inherit SB_NOSEC flag from upperdir
Since the stacking of regular file operations [1], the overlayfs edition of write_iter() is called when writing regular files.
Since then, xattr lookup is n
ovl: inherit SB_NOSEC flag from upperdir
Since the stacking of regular file operations [1], the overlayfs edition of write_iter() is called when writing regular files.
Since then, xattr lookup is needed on every write since file_remove_privs() is called from ovl_write_iter(), which would become the performance bottleneck when writing small chunks of data. In my test case, file_remove_privs() would consume ~15% CPU when running fstime of unixbench (the workload is repeadly writing 1 KB to the same file) [2].
Inherit the SB_NOSEC flag from upperdir. Since then xattr lookup would be done only once on the first write. Unixbench fstime gets a ~20% performance gain with this patch.
[1] https://lore.kernel.org/lkml/20180606150905.GC9426@magnolia/T/ [2] https://www.spinics.net/lists/linux-unionfs/msg07153.html
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
Revision tags: v5.4.34, v5.4.33, v5.4.32 |
|
#
32b1924b |
| 09-Apr-2020 |
Konstantin Khlebnikov <khlebnikov@yandex-team.ru> |
ovl: skip overlayfs superblocks at global sync
Stacked filesystems like overlayfs has no own writeback, but they have to forward syncfs() requests to backend for keeping data integrity.
During glob
ovl: skip overlayfs superblocks at global sync
Stacked filesystems like overlayfs has no own writeback, but they have to forward syncfs() requests to backend for keeping data integrity.
During global sync() each overlayfs instance calls method ->sync_fs() for backend although it itself is in global list of superblocks too. As a result one syscall sync() could write one superblock several times and send multiple disk barriers.
This patch adds flag SB_I_SKIP_SYNC into sb->sb_iflags to avoid that.
Reported-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
Revision tags: v5.4.31 |
|
#
62a8a85b |
| 03-Apr-2020 |
Amir Goldstein <amir73il@gmail.com> |
ovl: index dir act as work dir
With index=on, let index dir act as the work dir for copy up and cleanups. This will help implementing whiteout inode sharing.
We still create the "work" dir on mount
ovl: index dir act as work dir
With index=on, let index dir act as the work dir for copy up and cleanups. This will help implementing whiteout inode sharing.
We still create the "work" dir on mount regardless of index=on and it is used to test the features supported by upper fs. One reason is that before the feature tests, we do not know if index could be enabled or not.
The reason we do not use "index" directory also as workdir with index=off is because the existence of the "index" directory acts as a simple persistent signal that index was enabled on this filesystem and tools may want to use that signal.
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
#
b0def88d |
| 09-Apr-2020 |
Amir Goldstein <amir73il@gmail.com> |
ovl: resolve more conflicting mount options
Similar to the way that a conflict between metacopy=on,redirect_dir=off is resolved, also resolve conflicts between nfs_export=on,index=off and nfs_export
ovl: resolve more conflicting mount options
Similar to the way that a conflict between metacopy=on,redirect_dir=off is resolved, also resolve conflicts between nfs_export=on,index=off and nfs_export=on,metacopy=on.
An explicit mount option wins over a default config value. Both explicit mount options result in an error.
Without this change the xfstests group overlay/exportfs are skipped if metacopy is enabled by default.
Reported-by: Chengguang Xu <cgxu519@mykernel.net> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
Revision tags: v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22 |
|
#
926e94d7 |
| 21-Feb-2020 |
Amir Goldstein <amir73il@gmail.com> |
ovl: enable xino automatically in more cases
So far, with xino=auto, we only enable xino if we know that all underlying filesystem use 32bit inode numbers.
When users configure overlay with xino=au
ovl: enable xino automatically in more cases
So far, with xino=auto, we only enable xino if we know that all underlying filesystem use 32bit inode numbers.
When users configure overlay with xino=auto, they already declare that they are ready to handle 64bit inode number from overlay.
It is a very common case, that underlying filesystem uses 64bit ino, but rarely or never uses the high inode number bits (e.g. tmpfs, xfs). Leaving it for the users to declare high ino bits are unused with xino=on is not a recipe for many users to enjoy the benefits of xino.
There appears to be very little reason not to enable xino when users declare xino=auto even if we do not know how many bits underlying filesystem uses for inode numbers.
In the worst case of xino bits overflow by real inode number, we already fall back to the non-xino behavior - real inode number with unique pseudo dev or to non persistent inode number and overlay st_dev (for directories).
The only annoyance from auto enabling xino is that xino bits overflow emits a warning to kmsg. Suppress those warnings unless users explicitly asked for xino=on, suggesting that they expected high ino bits to be unused by underlying filesystem.
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|
#
dfe51d47 |
| 21-Feb-2020 |
Amir Goldstein <amir73il@gmail.com> |
ovl: avoid possible inode number collisions with xino=on
When xino feature is enabled and a real directory inode number overflows the lower xino bits, we cannot map this directory inode number to a
ovl: avoid possible inode number collisions with xino=on
When xino feature is enabled and a real directory inode number overflows the lower xino bits, we cannot map this directory inode number to a unique and persistent inode number and we fall back to the real inode st_ino and overlay st_dev.
The real inode st_ino with high bits may collide with a lower inode number on overlay st_dev that was mapped using xino.
To avoid possible collision with legitimate xino values, map a non persistent inode number to a dedicated range in the xino address space. The dedicated range is created by adding one more bit to the number of reserved high xino bits. We could have added just one more fsid, but that would have had the undesired effect of changing persistent overlay inode numbers on kernel or require more complex xino mapping code.
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
show more ...
|