#
e90334e8 |
| 08-Oct-2021 |
Xiubo Li <xiubli@redhat.com> |
ceph: ignore the truncate when size won't change with Fx caps issued
If the new size is the same as the current size, the MDS will do nothing but change the mtime/atime. POSIX doesn't mandate that t
ceph: ignore the truncate when size won't change with Fx caps issued
If the new size is the same as the current size, the MDS will do nothing but change the mtime/atime. POSIX doesn't mandate that the filesystems must update them in this case, so just ignore it instead.
Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
5d6451b1 |
| 31-Aug-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: shut down access to inode when async create fails
Add proper error handling for when an async create fails. The inode never existed, so any dirty caps or data are now toast. We already d_drop
ceph: shut down access to inode when async create fails
Add proper error handling for when an async create fails. The inode never existed, so any dirty caps or data are now toast. We already d_drop the dentry in that case, but the now-stale inode may still be around. We want to shut down access to these inodes, and ensure that they can't harbor any more dirty data, which can cause problems at umount time.
When this occurs, flag such inodes as being SHUTDOWN, and trash any caps and cap flushes that may be in flight for them, and invalidate the pagecache for the inode. Add a new helper that can check whether an inode or an entire mount is now shut down, and call it instead of accessing the mount_state directly in places where we test that now.
URL: https://tracker.ceph.com/issues/51279 Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
6407fbb9 |
| 02-Sep-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: print inode numbers instead of pointer values
We have a lot of log messages that print inode pointer values. This is of dubious utility. Switch a random assortment of the ones I've found most
ceph: print inode numbers instead of pointer values
We have a lot of log messages that print inode pointer values. This is of dubious utility. Switch a random assortment of the ones I've found most useful to use ceph_vinop to print the snap:inum tuple instead.
[ idryomov: use . as a separator, break unnecessarily long lines ]
Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
3ae71635 |
| 02-Mar-2022 |
Xiubo Li <xiubli@redhat.com> |
ceph: fix inode reference leakage in ceph_get_snapdir()
[ Upstream commit 322794d3355c33adcc4feace0045d85a8e4ed813 ]
The ceph_get_inode() will search for or insert a new inode into the hash for the
ceph: fix inode reference leakage in ceph_get_snapdir()
[ Upstream commit 322794d3355c33adcc4feace0045d85a8e4ed813 ]
The ceph_get_inode() will search for or insert a new inode into the hash for the given vino, and return a reference to it. If new is non-NULL, its reference is consumed.
We should release the reference when in error handing cases.
Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
1bd85aa6 |
| 07-Oct-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: fix handling of "meta" errors
Currently, we check the wb_err too early for directories, before all of the unsafe child requests have been waited on. In order to fix that we need to check the m
ceph: fix handling of "meta" errors
Currently, we check the wb_err too early for directories, before all of the unsafe child requests have been waited on. In order to fix that we need to check the mapping->wb_err later nearer to the end of ceph_fsync.
We also have an overly-complex method for tracking errors after blocklisting. The errors recorded in cleanup_session_requests go to a completely separate field in the inode, but we end up reporting them the same way we would for any other error (in fsync).
There's no real benefit to tracking these errors in two different places, since the only reporting mechanism for them is in fsync, and we'd need to advance them both every time.
Given that, we can just remove i_meta_err, and convert the places that used it to instead just use mapping->wb_err instead. That also fixes the original problem by ensuring that we do a check_and_advance of the wb_err at the end of the fsync op.
Cc: stable@vger.kernel.org URL: https://tracker.ceph.com/issues/52864 Reported-by: Patrick Donnelly <pdonnell@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
0ba92e1c |
| 02-Aug-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: add ceph_change_snap_realm() helper
Consolidate some fiddly code for changing an inode's snap_realm into a new helper function, and change the callers to use it.
While we're in here, nothing
ceph: add ceph_change_snap_realm() helper
Consolidate some fiddly code for changing an inode's snap_realm into a new helper function, and change the callers to use it.
While we're in here, nothing uses the i_snap_realm_counter field, so remove that from the inode.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
23c2c76e |
| 04-Jun-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: eliminate ceph_async_iput()
Now that we don't need to hold session->s_mutex or the snap_rwsem when calling ceph_check_caps, we can eliminate ceph_async_iput and just use normal iput calls.
Si
ceph: eliminate ceph_async_iput()
Now that we don't need to hold session->s_mutex or the snap_rwsem when calling ceph_check_caps, we can eliminate ceph_async_iput and just use normal iput calls.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
52d60f8e |
| 04-Jun-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: eliminate session->s_gen_ttl_lock
Turn s_cap_gen field into an atomic_t, and just rely on the fact that we hold the s_mutex when changing the s_cap_ttl field.
Signed-off-by: Jeff Layton <jlay
ceph: eliminate session->s_gen_ttl_lock
Turn s_cap_gen field into an atomic_t, and just rely on the fact that we hold the s_mutex when changing the s_cap_ttl field.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
7e65624d |
| 09-Jun-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: allow ceph_put_mds_session to take NULL or ERR_PTR
...to simplify some error paths.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Luis Henriques <lhenriques@suse.de> Signed-off
ceph: allow ceph_put_mds_session to take NULL or ERR_PTR
...to simplify some error paths.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
27171ae6 |
| 01-Jun-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: must hold snap_rwsem when filling inode for async create
...and add a lockdep assertion for it to ceph_fill_inode().
Cc: stable@vger.kernel.org # v5.7+ Fixes: 9a8d03ca2e2c3 ("ceph: attempt to
ceph: must hold snap_rwsem when filling inode for async create
...and add a lockdep assertion for it to ceph_fill_inode().
Cc: stable@vger.kernel.org # v5.7+ Fixes: 9a8d03ca2e2c3 ("ceph: attempt to do async create when possible") Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
d4f6b31d |
| 01-Apr-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: don't allow access to MDS-private inodes
The MDS reserves a set of inodes for its own usage, and these should never be accessible to clients. Add a new helper to vet a proposed inode number ag
ceph: don't allow access to MDS-private inodes
The MDS reserves a set of inodes for its own usage, and these should never be accessible to clients. Add a new helper to vet a proposed inode number against that range, and complain loudly and refuse to create or look it up if it's in it.
Also, ensure that the MDS doesn't try to delegate inodes that are in that range or lower. Print a warning if it does, and don't save the range in the xarray.
URL: https://tracker.ceph.com/issues/49922 Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
2d6795fb |
| 09-Apr-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: fix up some bare fetches of i_size
We need to use i_size_read(), which properly handles the torn read case on 32-bit arches.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ily
ceph: fix up some bare fetches of i_size
We need to use i_size_read(), which properly handles the torn read case on 32-bit arches.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
e7f72952 |
| 27-Aug-2020 |
Yanhu Cao <gmayyyha@gmail.com> |
ceph: support getting ceph.dir.rsnaps vxattr
Add support for grabbing the rsnaps value out of the inode info in traces, and exposing that via ceph.dir.rsnaps xattr.
Signed-off-by: Yanhu Cao <gmayyy
ceph: support getting ceph.dir.rsnaps vxattr
Add support for grabbing the rsnaps value out of the inode info in traces, and exposing that via ceph.dir.rsnaps xattr.
Signed-off-by: Yanhu Cao <gmayyyha@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
d3c51ae1 |
| 01-Mar-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: don't clobber i_snap_caps on non-I_NEW inode
We want the snapdir to mirror the non-snapped directory's attributes for most things, but i_snap_caps represents the caps granted on the snapshot d
ceph: don't clobber i_snap_caps on non-I_NEW inode
We want the snapdir to mirror the non-snapped directory's attributes for most things, but i_snap_caps represents the caps granted on the snapshot directory by the MDS itself. A misbehaving MDS could issue different caps for the snapdir and we lose them here.
Only reset i_snap_caps when the inode is I_NEW. Also, move the setting of i_op and i_fop inside the if block since they should never change anyway.
Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
10a7052c |
| 21-Jan-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: fix fscache invalidation
Ensure that we invalidate the fscache whenever we invalidate the pagecache.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmai
ceph: fix fscache invalidation
Ensure that we invalidate the fscache whenever we invalidate the pagecache.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
ed94f87c |
| 25-Feb-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: don't allow type or device number to change on non-I_NEW inodes
Al pointed out that a malicious or broken MDS could change the type or device number of a given inode number. It may also be pos
ceph: don't allow type or device number to change on non-I_NEW inodes
Al pointed out that a malicious or broken MDS could change the type or device number of a given inode number. It may also be possible for the MDS to reuse an old inode number.
Ensure that we never allow fill_inode to change the type part of the i_mode or the i_rdev unless I_NEW is set. Throw warnings if the MDS ever changes these on us mid-stream, and return an error.
Don't set i_rdev directly, and rely on init_special_inode to do it. Also, fix up error handling in the callers of ceph_get_inode.
In handle_cap_grant, check for and warn if the inode type changes, and only overwrite the mode if it didn't.
Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
3e10a15f |
| 25-Feb-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: fix up error handling with snapdirs
There are several warts in the snapdir error handling. The -EOPNOTSUPP return in __snapfh_to_dentry is currently lost, and the call to ceph_handle_snapdir i
ceph: fix up error handling with snapdirs
There are several warts in the snapdir error handling. The -EOPNOTSUPP return in __snapfh_to_dentry is currently lost, and the call to ceph_handle_snapdir is not currently checked at all.
Fix all of this up and eliminate a BUG_ON in ceph_get_snapdir. We can handle that case with a warning and return an error.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
a8810cdc |
| 10-Dec-2020 |
Jeff Layton <jlayton@kernel.org> |
ceph: allow queueing cap/snap handling after putting cap references
Testing with the fscache overhaul has triggered some lockdep warnings about circular lock dependencies involving page_mkwrite and
ceph: allow queueing cap/snap handling after putting cap references
Testing with the fscache overhaul has triggered some lockdep warnings about circular lock dependencies involving page_mkwrite and the mmap_lock. It'd be better to do the "real work" without the mmap lock being held.
Change the skip_checking_caps parameter in __ceph_put_cap_refs to an enum, and use that to determine whether to queue check_caps, do it synchronously or not at all. Change ceph_page_mkwrite to do a ceph_put_cap_refs_async().
Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
64f28c62 |
| 09-Oct-2020 |
Jeff Layton <jlayton@kernel.org> |
ceph: clean up inode work queueing
Add a generic function for taking an inode reference, setting the I_WORK bit and queueing i_work. Turn the ceph_queue_* functions into static inline wrappers that
ceph: clean up inode work queueing
Add a generic function for taking an inode reference, setting the I_WORK bit and queueing i_work. Turn the ceph_queue_* functions into static inline wrappers that pass in the right bit.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|
#
549c7297 |
| 21-Jan-2021 |
Christian Brauner <christian.brauner@ubuntu.com> |
fs: make helpers idmap mount aware
Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has b
fs: make helpers idmap mount aware
Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has been marked with. This can be used for additional permission checking and also to enable filesystems to translate between uids and gids if they need to. We have implemented all relevant helpers in earlier patches.
As requested we simply extend the exisiting inode method instead of introducing new ones. This is a little more code churn but it's mostly mechanical and doesnt't leave us with additional inode methods.
Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
show more ...
|
#
0d56a451 |
| 21-Jan-2021 |
Christian Brauner <christian.brauner@ubuntu.com> |
stat: handle idmapped mounts
The generic_fillattr() helper fills in the basic attributes associated with an inode. Enable it to handle idmapped mounts. If the inode is accessed through an idmapped m
stat: handle idmapped mounts
The generic_fillattr() helper fills in the basic attributes associated with an inode. Enable it to handle idmapped mounts. If the inode is accessed through an idmapped mount map it into the mount's user namespace before we store the uid and gid. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before.
Link: https://lore.kernel.org/r/20210121131959.646623-12-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
show more ...
|
#
e65ce2a5 |
| 21-Jan-2021 |
Christian Brauner <christian.brauner@ubuntu.com> |
acl: handle idmapped mounts
The posix acl permission checking helpers determine whether a caller is privileged over an inode according to the acls associated with the inode. Add helpers that make it
acl: handle idmapped mounts
The posix acl permission checking helpers determine whether a caller is privileged over an inode according to the acls associated with the inode. Add helpers that make it possible to handle acls on idmapped mounts.
The vfs and the filesystems targeted by this first iteration make use of posix_acl_fix_xattr_from_user() and posix_acl_fix_xattr_to_user() to translate basic posix access and default permissions such as the ACL_USER and ACL_GROUP type according to the initial user namespace (or the superblock's user namespace) to and from the caller's current user namespace. Adapt these two helpers to handle idmapped mounts whereby we either map from or into the mount's user namespace depending on in which direction we're translating. Similarly, cap_convert_nscap() is used by the vfs to translate user namespace and non-user namespace aware filesystem capabilities from the superblock's user namespace to the caller's user namespace. Enable it to handle idmapped mounts by accounting for the mount's user namespace.
In addition the fileystems targeted in the first iteration of this patch series make use of the posix_acl_chmod() and, posix_acl_update_mode() helpers. Both helpers perform permission checks on the target inode. Let them handle idmapped mounts. These two helpers are called when posix acls are set by the respective filesystems to handle this case we extend the ->set() method to take an additional user namespace argument to pass the mount's user namespace down.
Link: https://lore.kernel.org/r/20210121131959.646623-9-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
show more ...
|
#
2f221d6f |
| 21-Jan-2021 |
Christian Brauner <christian.brauner@ubuntu.com> |
attr: handle idmapped mounts
When file attributes are changed most filesystems rely on the setattr_prepare(), setattr_copy(), and notify_change() helpers for initialization and permission checking.
attr: handle idmapped mounts
When file attributes are changed most filesystems rely on the setattr_prepare(), setattr_copy(), and notify_change() helpers for initialization and permission checking. Let them handle idmapped mounts. If the inode is accessed through an idmapped mount map it into the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before.
Helpers that perform checks on the ia_uid and ia_gid fields in struct iattr assume that ia_uid and ia_gid are intended values and have already been mapped correctly at the userspace-kernelspace boundary as we already do today. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before.
Link: https://lore.kernel.org/r/20210121131959.646623-8-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
show more ...
|
#
47291baa |
| 21-Jan-2021 |
Christian Brauner <christian.brauner@ubuntu.com> |
namei: make permission helpers idmapped mount aware
The two helpers inode_permission() and generic_permission() are used by the vfs to perform basic permission checking by verifying that the caller
namei: make permission helpers idmapped mount aware
The two helpers inode_permission() and generic_permission() are used by the vfs to perform basic permission checking by verifying that the caller is privileged over an inode. In order to handle idmapped mounts we extend the two helpers with an additional user namespace argument. On idmapped mounts the two helpers will make sure to map the inode according to the mount's user namespace and then peform identical permission checks to inode_permission() and generic_permission(). If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before.
Link: https://lore.kernel.org/r/20210121131959.646623-6-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
show more ...
|
#
0f51a983 |
| 09-Dec-2020 |
Jeff Layton <jlayton@kernel.org> |
ceph: don't reach into request header for readdir info
We already have a pointer to the argument struct in req->r_args. Use that instead of groveling around in the ceph_mds_request_head.
Signed-off
ceph: don't reach into request header for readdir info
We already have a pointer to the argument struct in req->r_args. Use that instead of groveling around in the ceph_mds_request_head.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
show more ...
|