Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10 |
|
#
ceca54f1 |
| 01-Nov-2023 |
David Howells <dhowells@redhat.com> |
afs: Fix file locking on R/O volumes to operate in local mode
[ Upstream commit b590eb41be766c5a63acc7e8896a042f7a4e8293 ]
AFS doesn't really do locking on R/O volumes as fileservers don't maintain
afs: Fix file locking on R/O volumes to operate in local mode
[ Upstream commit b590eb41be766c5a63acc7e8896a042f7a4e8293 ]
AFS doesn't really do locking on R/O volumes as fileservers don't maintain state with each other and thus a lock on a R/O volume file on one fileserver will not be be visible to someone looking at the same file on another fileserver.
Further, the server may return an error if you try it.
Fix this by doing what other AFS clients do and handle filelocking on R/O volume files entirely within the client and don't touch the server.
Fixes: 6c6c1d63c243 ("afs: Provide mount-time configurable byte-range file locking emulation") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47 |
|
#
874c8ca1 |
| 09-Jun-2022 |
David Howells <dhowells@redhat.com> |
netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_context
While randstruct was satisfied with using an open-coded "void *" offset cast for the netfs_i_context <-> inode casting, __builtin_
netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_context
While randstruct was satisfied with using an open-coded "void *" offset cast for the netfs_i_context <-> inode casting, __builtin_object_size() as used by FORTIFY_SOURCE was not as easily fooled. This was causing the following complaint[1] from gcc v12:
In file included from include/linux/string.h:253, from include/linux/ceph/ceph_debug.h:7, from fs/ceph/inode.c:2: In function 'fortify_memset_chk', inlined from 'netfs_i_context_init' at include/linux/netfs.h:326:2, inlined from 'ceph_alloc_inode' at fs/ceph/inode.c:463:2: include/linux/fortify-string.h:242:25: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning] 242 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix this by embedding a struct inode into struct netfs_i_context (which should perhaps be renamed to struct netfs_inode). The struct inode vfs_inode fields are then removed from the 9p, afs, ceph and cifs inode structs and vfs_inode is then simply changed to "netfs.inode" in those filesystems.
Further, rename netfs_i_context to netfs_inode, get rid of the netfs_inode() function that converted a netfs_i_context pointer to an inode pointer (that can now be done with &ctx->inode) and rename the netfs_i_context() function to netfs_inode() (which is now a wrapper around container_of()).
Most of the changes were done with:
perl -p -i -e 's/vfs_inode/netfs.inode/'g \ `git grep -l 'vfs_inode' -- fs/{9p,afs,ceph,cifs}/*.[ch]`
Kees suggested doing it with a pair structure[2] and a special declarator to insert that into the network filesystem's inode wrapper[3], but I think it's cleaner to embed it - and then it doesn't matter if struct randomisation reorders things.
Dave Chinner suggested using a filesystem-specific VFS_I() function in each filesystem to convert that filesystem's own inode wrapper struct into the VFS inode struct[4].
Version #2: - Fix a couple of missed name changes due to a disabled cifs option. - Rename nfs_i_context to nfs_inode - Use "netfs" instead of "nic" as the member name in per-fs inode wrapper structs.
[ This also undoes commit 507160f46c55 ("netfs: gcc-12: temporarily disable '-Wattribute-warning' for now") that is no longer needed ]
Fixes: bc899ee1c898 ("netfs: Add a netfs inode context") Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> cc: Jonathan Corbet <corbet@lwn.net> cc: Eric Van Hensbergen <ericvh@gmail.com> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Ilya Dryomov <idryomov@gmail.com> cc: Steve French <smfrench@gmail.com> cc: William Kucharski <william.kucharski@oracle.com> cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> cc: Dave Chinner <david@fromorbit.com> cc: linux-doc@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-afs@lists.infradead.org cc: ceph-devel@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: samba-technical@lists.samba.org cc: linux-fsdevel@vger.kernel.org cc: linux-hardening@vger.kernel.org Link: https://lore.kernel.org/r/d2ad3a3d7bdd794c6efb562d2f2b655fb67756b9.camel@kernel.org/ [1] Link: https://lore.kernel.org/r/20220517210230.864239-1-keescook@chromium.org/ [2] Link: https://lore.kernel.org/r/20220518202212.2322058-1-keescook@chromium.org/ [3] Link: https://lore.kernel.org/r/20220524101205.GI2306852@dread.disaster.area/ [4] Link: https://lore.kernel.org/r/165296786831.3591209.12111293034669289733.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/165305805651.4094995.7763502506786714216.stgit@warthog.procyon.org.uk # v2 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31 |
|
#
fd60b288 |
| 22-Mar-2022 |
Muchun Song <songmuchun@bytedance.com> |
fs: allocate inode by using alloc_inode_sb()
The inode allocation is supposed to use alloc_inode_sb(), so convert kmem_cache_alloc() of all filesystems to alloc_inode_sb().
Link: https://lkml.kerne
fs: allocate inode by using alloc_inode_sb()
The inode allocation is supposed to use alloc_inode_sb(), so convert kmem_cache_alloc() of all filesystems to alloc_inode_sb().
Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4] Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Alex Shi <alexs@kernel.org> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kari Argillander <kari.argillander@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Xiongchun Duan <duanxiongchun@bytedance.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49 |
|
#
bc899ee1 |
| 29-Jun-2021 |
David Howells <dhowells@redhat.com> |
netfs: Add a netfs inode context
Add a netfs_i_context struct that should be included in the network filesystem's own inode struct wrapper, directly after the VFS's inode struct, e.g.:
struct my_i
netfs: Add a netfs inode context
Add a netfs_i_context struct that should be included in the network filesystem's own inode struct wrapper, directly after the VFS's inode struct, e.g.:
struct my_inode { struct { /* These must be contiguous */ struct inode vfs_inode; struct netfs_i_context netfs_ctx; }; };
The netfs_i_context struct so far contains a single field for the network filesystem to use - the cache cookie:
struct netfs_i_context { ... struct fscache_cookie *cache; };
Three functions are provided to help with this:
(1) void netfs_i_context_init(struct inode *inode, const struct netfs_request_ops *ops);
Initialise the netfs context and set the operations.
(2) struct netfs_i_context *netfs_i_context(struct inode *inode);
Find the netfs context from the VFS inode.
(3) struct inode *netfs_inode(struct netfs_i_context *ctx);
Find the VFS inode from the netfs context.
Changes ======= ver #4) - Fix netfs_is_cache_enabled() to check cookie->cache_priv to see if a cache is present[3]. - Fix netfs_skip_folio_read() to zero out all of the page, not just some of it[3].
ver #3) - Split out the bit to move ceph cap-getting on readahead into ceph_init_request()[1]. - Stick in a comment to the netfs inode structs indicating the contiguity requirements[2].
ver #2) - Adjust documentation to match. - Use "#if IS_ENABLED()" in netfs_i_cookie(), not "#ifdef". - Move the cap check from ceph_readahead() to ceph_init_request() to be called from netfslib. - Remove ceph_readahead() and use netfs_readahead() directly instead.
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/8af0d47f17d89c06bbf602496dd845f2b0bf25b3.camel@kernel.org/ [1] Link: https://lore.kernel.org/r/beaf4f6a6c2575ed489adb14b257253c868f9a5c.camel@kernel.org/ [2] Link: https://lore.kernel.org/r/3536452.1647421585@warthog.procyon.org.uk/ [3] Link: https://lore.kernel.org/r/164622984545.3564931.15691742939278418580.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/164678213320.1200972.16807551936267647470.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/164692909854.2099075.9535537286264248057.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/306388.1647595110@warthog.procyon.org.uk/ # v4
show more ...
|
Revision tags: v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19 |
|
#
c7f75ef3 |
| 06-Feb-2020 |
David Howells <dhowells@redhat.com> |
afs: Copy local writes to the cache when writing to the server
When writing to the server from afs_writepage() or afs_writepages(), copy the data to the cache object too.
To make this possible, the
afs: Copy local writes to the cache when writing to the server
When writing to the server from afs_writepage() or afs_writepages(), copy the data to the cache object too.
To make this possible, the cookie must have its active users count incremented when the page is dirtied and kept incremented until we manage to clean up all the pages. This allows the writeback to take place after the last file struct is released.
Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing@auristor.com Acked-by: Jeff Layton <jlayton@kernel.org> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819662333.215744.7531373404219224438.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906970998.143852.674420788614608063.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967176564.1823006.16666056085593949570.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021570208.640689.9193494979708031862.stgit@warthog.procyon.org.uk/ # v4
show more ...
|
#
1744a22a |
| 14-Dec-2021 |
David Howells <dhowells@redhat.com> |
afs: Fix mmap
Fix afs_add_open_map() to check that the vnode isn't already on the list when it adds it. It's possible that afs_drop_open_mmap() decremented the cb_nr_mmap counter, but hadn't yet go
afs: Fix mmap
Fix afs_add_open_map() to check that the vnode isn't already on the list when it adds it. It's possible that afs_drop_open_mmap() decremented the cb_nr_mmap counter, but hadn't yet got into the locked section to remove it.
Also vnode->cb_mmap_link should be initialised, so fix that too.
Fixes: 6e0e99d58a65 ("afs: Fix mmap coherency vs 3rd-party changes") Reported-by: kafs-testing+fedora34_64checkkafs-build-300@auristor.com Suggested-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing+fedora34_64checkkafs-build-300@auristor.com cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/686465.1639435380@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
5473fe0c |
| 14-Dec-2021 |
David Howells <dhowells@redhat.com> |
afs: Fix mmap
[ Upstream commit 1744a22ae948799da7927b53ec97ccc877ff9d61 ]
Fix afs_add_open_map() to check that the vnode isn't already on the list when it adds it. It's possible that afs_drop_ope
afs: Fix mmap
[ Upstream commit 1744a22ae948799da7927b53ec97ccc877ff9d61 ]
Fix afs_add_open_map() to check that the vnode isn't already on the list when it adds it. It's possible that afs_drop_open_mmap() decremented the cb_nr_mmap counter, but hadn't yet got into the locked section to remove it.
Also vnode->cb_mmap_link should be initialised, so fix that too.
Fixes: 6e0e99d58a65 ("afs: Fix mmap coherency vs 3rd-party changes") Reported-by: kafs-testing+fedora34_64checkkafs-build-300@auristor.com Suggested-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing+fedora34_64checkkafs-build-300@auristor.com cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/686465.1639435380@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
6e0e99d5 |
| 02-Sep-2021 |
David Howells <dhowells@redhat.com> |
afs: Fix mmap coherency vs 3rd-party changes
Fix the coherency management of mmap'd data such that 3rd-party changes become visible as soon as possible after the callback notification is delivered b
afs: Fix mmap coherency vs 3rd-party changes
Fix the coherency management of mmap'd data such that 3rd-party changes become visible as soon as possible after the callback notification is delivered by the fileserver. This is done by the following means:
(1) When we break a callback on a vnode specified by the CB.CallBack call from the server, we queue a work item (vnode->cb_work) to go and clobber all the PTEs mapping to that inode.
This causes the CPU to trip through the ->map_pages() and ->page_mkwrite() handlers if userspace attempts to access the page(s) again.
(Ideally, this would be done in the service handler for CB.CallBack, but the server is waiting for our reply before considering, and we have a list of vnodes, all of which need breaking - and the process of getting the mmap_lock and stripping the PTEs on all CPUs could be quite slow.)
(2) Call afs_validate() from the ->map_pages() handler to check to see if the file has changed and to get a new callback promise from the server.
Also handle the fileserver telling us that it's dropping all callbacks, possibly after it's been restarted by sending us a CB.InitCallBackState* call by the following means:
(3) Maintain a per-cell list of afs files that are currently mmap'd (cell->fs_open_mmaps).
(4) Add a work item to each server that is invoked if there are any open mmaps when CB.InitCallBackState happens. This work item goes through the aforementioned list and invokes the vnode->cb_work work item for each one that is currently using this server.
This causes the PTEs to be cleared, causing ->map_pages() or ->page_mkwrite() to be called again, thereby calling afs_validate() again.
I've chosen to simply strip the PTEs at the point of notification reception rather than invalidate all the pages as well because (a) it's faster, (b) we may get a notification for other reasons than the data being altered (in which case we don't want to clobber the pagecache) and (c) we need to ask the server to find out - and I don't want to wait for the reply before holding up userspace.
This was tested using the attached test program:
#include <stdbool.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <fcntl.h> #include <sys/mman.h> int main(int argc, char *argv[]) { size_t size = getpagesize(); unsigned char *p; bool mod = (argc == 3); int fd; if (argc != 2 && argc != 3) { fprintf(stderr, "Format: %s <file> [mod]\n", argv[0]); exit(2); } fd = open(argv[1], mod ? O_RDWR : O_RDONLY); if (fd < 0) { perror(argv[1]); exit(1); }
p = mmap(NULL, size, mod ? PROT_READ|PROT_WRITE : PROT_READ, MAP_SHARED, fd, 0); if (p == MAP_FAILED) { perror("mmap"); exit(1); } for (;;) { if (mod) { p[0]++; msync(p, size, MS_ASYNC); fsync(fd); } printf("%02x", p[0]); fflush(stdout); sleep(1); } }
It runs in two modes: in one mode, it mmaps a file, then sits in a loop reading the first byte, printing it and sleeping for a second; in the second mode it mmaps a file, then sits in a loop incrementing the first byte and flushing, then printing and sleeping.
Two instances of this program can be run on different machines, one doing the reading and one doing the writing. The reader should see the changes made by the writer, but without this patch, they aren't because validity checking is being done lazily - only on entry to the filesystem.
Testing the InitCallBackState change is more complicated. The server has to be taken offline, the saved callback state file removed and then the server restarted whilst the reading-mode program continues to run. The client machine then has to poke the server to trigger the InitCallBackState call.
Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/163111668833.283156.382633263709075739.stgit@warthog.procyon.org.uk/
show more ...
|
#
4cb68296 |
| 08-Dec-2020 |
David Howells <dhowells@redhat.com> |
afs: Fix memory leak when mounting with multiple source parameters
There's a memory leak in afs_parse_source() whereby multiple source= parameters overwrite fc->source in the fs_context struct witho
afs: Fix memory leak when mounting with multiple source parameters
There's a memory leak in afs_parse_source() whereby multiple source= parameters overwrite fc->source in the fs_context struct without freeing the previously recorded source.
Fix this by only permitting a single source parameter and rejecting with an error all subsequent ones.
This was caught by syzbot with the kernel memory leak detector, showing something like the following trace:
unreferenced object 0xffff888114375440 (size 32): comm "repro", pid 5168, jiffies 4294923723 (age 569.948s) backtrace: slab_post_alloc_hook+0x42/0x79 __kmalloc_track_caller+0x125/0x16a kmemdup_nul+0x24/0x3c vfs_parse_fs_string+0x5a/0xa1 generic_parse_monolithic+0x9d/0xc5 do_new_mount+0x10d/0x15a do_mount+0x5f/0x8e __do_sys_mount+0xff/0x127 do_syscall_64+0x2d/0x3a entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 13fcc6837049 ("afs: Add fs_context support") Reported-by: syzbot+86dc6632faaca40133ab@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
dca54a7b |
| 13-Oct-2020 |
David Howells <dhowells@redhat.com> |
afs: Add tracing for cell refcount and active user count
Add a tracepoint to log the cell refcount and active user count and pass in a reason code through various functions that manipulate these cou
afs: Add tracing for cell refcount and active user count
Add a tracepoint to log the cell refcount and active user count and pass in a reason code through various functions that manipulate these counters.
Additionally, a helper function, afs_see_cell(), is provided to log interesting places that deal with a cell without actually doing any accounting directly.
Signed-off-by: David Howells <dhowells@redhat.com>
show more ...
|
Revision tags: v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12, v5.3.11, v5.3.10, v5.3.9, v5.3.8, v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3, v5.3.2, v5.3.1, v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12, v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3 |
|
#
88c853c3 |
| 23-Jul-2019 |
David Howells <dhowells@redhat.com> |
afs: Fix cell refcounting by splitting the usage counter
Management of the lifetime of afs_cell struct has some problems due to the usage counter being used to determine whether objects of that type
afs: Fix cell refcounting by splitting the usage counter
Management of the lifetime of afs_cell struct has some problems due to the usage counter being used to determine whether objects of that type are in use in addition to whether anyone might be interested in the structure.
This is made trickier by cell objects being cached for a period of time in case they're quickly reused as they hold the result of a setup process that may be slow (DNS lookups, AFS RPC ops).
Problems include the cached root volume from alias resolution pinning its parent cell record, rmmod occasionally hanging and occasionally producing assertion failures.
Fix this by splitting the count of active users from the struct reference count. Things then work as follows:
(1) The cell cache keeps +1 on the cell's activity count and this has to be dropped before the cell can be removed. afs_manage_cell() tries to exchange the 1 to a 0 with the cells_lock write-locked, and if successful, the record is removed from the net->cells.
(2) One struct ref is 'owned' by the activity count. That is put when the active count is reduced to 0 (final_destruction label).
(3) A ref can be held on a cell whilst it is queued for management on a work queue without confusing the active count. afs_queue_cell() is added to wrap this.
(4) The queue's ref is dropped at the end of the management. This is split out into a separate function, afs_manage_cell_work().
(5) The root volume record is put after a cell is removed (at the final_destruction label) rather then in the RCU destruction routine.
(6) Volumes hold struct refs, but aren't active users.
(7) Both counts are displayed in /proc/net/afs/cells.
There are some management function changes:
(*) afs_put_cell() now just decrements the refcount and triggers the RCU destruction if it becomes 0. It no longer sets a timer to have the manager do this.
(*) afs_use_cell() and afs_unuse_cell() are added to increase and decrease the active count. afs_unuse_cell() sets the management timer.
(*) afs_queue_cell() is added to queue a cell with approprate refs.
There are also some other fixes:
(*) Don't let /proc/net/afs/cells access a cell's vllist if it's NULL.
(*) Make sure that candidate cells in lookups are properly destroyed rather than being simply kfree'd. This ensures the bits it points to are destroyed also.
(*) afs_dec_cells_outstanding() is now called in cell destruction rather than at "final_destruction". This ensures that cell->net is still valid to the end of the destructor.
(*) As a consequence of the previous two changes, move the increment of net->cells_outstanding that was at the point of insertion into the tree to the allocation routine to correctly balance things.
Fixes: 989782dcdc91 ("afs: Overhaul cell database management") Signed-off-by: David Howells <dhowells@redhat.com>
show more ...
|
#
92e3cc91 |
| 09-Oct-2020 |
David Howells <dhowells@redhat.com> |
afs: Fix rapid cell addition/removal by not using RCU on cells tree
There are a number of problems that are being seen by the rapidly mounting and unmounting an afs dynamic root with an explicit cel
afs: Fix rapid cell addition/removal by not using RCU on cells tree
There are a number of problems that are being seen by the rapidly mounting and unmounting an afs dynamic root with an explicit cell and volume specified (which should probably be rejected, but that's a separate issue):
What the tests are doing is to look up/create a cell record for the name given and then tear it down again without actually using it to try to talk to a server. This is repeated endlessly, very fast, and the new cell collides with the old one if it's not quick enough to reuse it.
It appears (as suggested by Hillf Danton) that the search through the RB tree under a read_seqbegin_or_lock() under RCU conditions isn't safe and that it's not blocking the write_seqlock(), despite taking two passes at it. He suggested that the code should take a ref on the cell it's attempting to look at - but this shouldn't be necessary until we've compared the cell names. It's possible that I'm missing a barrier somewhere.
However, using an RCU search for this is overkill, really - we only need to access the cell name in a few places, and they're places where we're may end up sleeping anyway.
Fix this by switching to an R/W semaphore instead.
Additionally, draw the down_read() call inside the function (renamed to afs_find_cell()) since all the callers were taking the RCU read lock (or should've been[*]).
[*] afs_probe_cell_name() should have been, but that doesn't appear to be involved in the bug reports.
The symptoms of this look like:
general protection fault, probably for non-canonical address 0xf27d208691691fdb: 0000 [#1] PREEMPT SMP KASAN KASAN: maybe wild-memory-access in range [0x93e924348b48fed8-0x93e924348b48fedf] ... RIP: 0010:strncasecmp lib/string.c:52 [inline] RIP: 0010:strncasecmp+0x5f/0x240 lib/string.c:43 afs_lookup_cell_rcu+0x313/0x720 fs/afs/cell.c:88 afs_lookup_cell+0x2ee/0x1440 fs/afs/cell.c:249 afs_parse_source fs/afs/super.c:290 [inline] ...
Fixes: 989782dcdc91 ("afs: Overhaul cell database management") Reported-by: syzbot+459a5dce0b4cb70fd076@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Hillf Danton <hdanton@sina.com> cc: syzkaller-bugs@googlegroups.com
show more ...
|
#
55b2598e |
| 24-Sep-2020 |
Christoph Hellwig <hch@lst.de> |
bdi: initialize ->ra_pages and ->io_pages in bdi_init
Set up a readahead size by default, as very few users have a good reason to change it. This means code, ecryptfs, and orangefs now set up the v
bdi: initialize ->ra_pages and ->io_pages in bdi_init
Set up a readahead size by default, as very few users have a good reason to change it. This means code, ecryptfs, and orangefs now set up the values while they were previously missing it, while ubifs, mtd and vboxsf manually set it to 0 to avoid readahead.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: David Sterba <dsterba@suse.com> [btrfs] Acked-by: Richard Weinberger <richard@nod.at> [ubifs, mtd] Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
f11a016a |
| 01-May-2020 |
David Howells <dhowells@redhat.com> |
afs: Fix afs_statfs() to not let the values go below zero
Fix afs_statfs() so that the value for f_bavail and f_bfree don't go "negative" if the number of blocks in use by a volume exceeds the max q
afs: Fix afs_statfs() to not let the values go below zero
Fix afs_statfs() so that the value for f_bavail and f_bfree don't go "negative" if the number of blocks in use by a volume exceeds the max quota for that volume.
Signed-off-by: David Howells <dhowells@redhat.com>
show more ...
|
#
20325960 |
| 29-Apr-2020 |
David Howells <dhowells@redhat.com> |
afs: Reorganise volume and server trees to be rooted on the cell
Reorganise afs_volume objects such that they're in a tree keyed on volume ID, rooted at on an afs_cell object rather than being in mu
afs: Reorganise volume and server trees to be rooted on the cell
Reorganise afs_volume objects such that they're in a tree keyed on volume ID, rooted at on an afs_cell object rather than being in multiple trees, each of which is rooted on an afs_server object.
afs_server structs become per-cell and acquire a pointer to the cell.
The process of breaking a callback then starts with finding the server by its network address, following that to the cell and then looking up each volume ID in the volume tree.
This is simpler than the afs_vol_interest/afs_cb_interest N:M mapping web and allows those structs and the code for maintaining them to be simplified or removed.
It does make a couple of things a bit more tricky, though:
(1) Operations now start with a volume, not a server, so there can be more than one answer as to whether or not the server we'll end up using supports the FS.InlineBulkStatus RPC.
(2) CB RPC operations that specify the server UUID. There's still a tree of servers by UUID on the afs_net struct, but the UUIDs in it aren't guaranteed unique.
Signed-off-by: David Howells <dhowells@redhat.com>
show more ...
|
#
cca37d45 |
| 29-Apr-2020 |
David Howells <dhowells@redhat.com> |
afs: Add a tracepoint to track the lifetime of the afs_volume struct
Add a tracepoint to track the lifetime of the afs_volume struct.
Signed-off-by: David Howells <dhowells@redhat.com>
|
#
8a070a96 |
| 25-Apr-2020 |
David Howells <dhowells@redhat.com> |
afs: Detect cell aliases 1 - Cells with root volumes
Put in the first phase of cell alias detection. This part handles alias detection for cells that have root.cell volumes (which is expected to be
afs: Detect cell aliases 1 - Cells with root volumes
Put in the first phase of cell alias detection. This part handles alias detection for cells that have root.cell volumes (which is expected to be likely).
When a cell becomes newly active, it is probed for its root.cell volume, and if it has one, this volume is compared against other root.cell volumes to find out if the list of fileserver UUIDs have any in common - and if that's the case, do the address lists of those fileservers have any addresses in common. If they do, the new cell is adjudged to be an alias of the old cell and the old cell is used instead.
Comparing is aided by the server list in struct afs_server_list being sorted in UUID order and the addresses in the fileserver address lists being sorted in address order.
The cell then retains the afs_volume object for the root.cell volume, even if it's not mounted for future alias checking.
This necessary because:
(1) Whilst fileservers have UUIDs that are meant to be globally unique, in practice they are not because cells get cloned without changing the UUIDs - so afs_server records need to be per cell.
(2) Sometimes the DNS is used to make cell aliases - but if we don't know they're the same, we may end up with multiple superblocks and multiple afs_server records for the same thing, impairing our ability to deliver callback notifications of third party changes
(3) The fileserver RPC API doesn't contain the cell name, so it can't tell us which cell it's notifying and can't see that a change made to to one cell should notify the same client that's also accessed as the other cell.
Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
show more ...
|
#
e49c7b2f |
| 10-Apr-2020 |
David Howells <dhowells@redhat.com> |
afs: Build an abstraction around an "operation" concept
Turn the afs_operation struct into the main way that most fileserver operations are managed. Various things are added to the struct, includin
afs: Build an abstraction around an "operation" concept
Turn the afs_operation struct into the main way that most fileserver operations are managed. Various things are added to the struct, including the following:
(1) All the parameters and results of the relevant operations are moved into it, removing corresponding fields from the afs_call struct. afs_call gets a pointer to the op.
(2) The target volume is made the main focus of the operation, rather than the target vnode(s), and a bunch of op->vnode->volume are made op->volume instead.
(3) Two vnode records are defined (op->file[]) for the vnode(s) involved in most operations. The vnode record (struct afs_vnode_param) contains:
- The vnode pointer.
- The fid of the vnode to be included in the parameters or that was returned in the reply (eg. FS.MakeDir).
- The status and callback information that may be returned in the reply about the vnode.
- Callback break and data version tracking for detecting simultaneous third-parth changes.
(4) Pointers to dentries to be updated with new inodes.
(5) An operations table pointer. The table includes pointers to functions for issuing AFS and YFS-variant RPCs, handling the success and abort of an operation and handling post-I/O-lock local editing of a directory.
To make this work, the following function restructuring is made:
(A) The rotation loop that issues calls to fileservers that can be found in each function that wants to issue an RPC (such as afs_mkdir()) is extracted out into common code, in a new file called fs_operation.c.
(B) The rotation loops, such as the one in afs_mkdir(), are replaced with a much smaller piece of code that allocates an operation, sets the parameters and then calls out to the common code to do the actual work.
(C) The code for handling the success and failure of an operation are moved into operation functions (as (5) above) and these are called from the core code at appropriate times.
(D) The pseudo inode getting stuff used by the dynamic root code is moved over into dynroot.c.
(E) struct afs_iget_data is absorbed into the operation struct and afs_iget() expects to be given an op pointer and a vnode record.
(F) Point (E) doesn't work for the root dir of a volume, but we know the FID in advance (it's always vnode 1, unique 1), so a separate inode getter, afs_root_iget(), is provided to special-case that.
(G) The inode status init/update functions now also take an op and a vnode record.
(H) The RPC marshalling functions now, for the most part, just take an afs_operation struct as their only argument. All the data they need is held there. The result delivery functions write their answers there as well.
(I) The call is attached to the operation and then the operation core does the waiting.
And then the new operation code is, for the moment, made to just initialise the operation, get the appropriate vnode I/O locks and do the same rotation loop as before.
This lays the foundation for the following changes in the future:
(*) Overhauling the rotation (again).
(*) Support for asynchronous I/O, where the fileserver rotation must be done asynchronously also.
Signed-off-by: David Howells <dhowells@redhat.com>
show more ...
|
#
a310082f |
| 20-Mar-2020 |
David Howells <dhowells@redhat.com> |
afs: Rename struct afs_fs_cursor to afs_operation
As a prelude to implementing asynchronous fileserver operations in the afs filesystem, rename struct afs_fs_cursor to afs_operation.
This struct is
afs: Rename struct afs_fs_cursor to afs_operation
As a prelude to implementing asynchronous fileserver operations in the afs filesystem, rename struct afs_fs_cursor to afs_operation.
This struct is going to form the core of the operation management and is going to acquire more members in later.
Signed-off-by: David Howells <dhowells@redhat.com>
show more ...
|
#
d7167b14 |
| 07-Sep-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
fs_parse: fold fs_parameter_desc/fs_parameter_spec
The former contains nothing but a pointer to an array of the latter...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
96cafb9c |
| 06-Dec-2019 |
Eric Sandeen <sandeen@sandeen.net> |
fs_parser: remove fs_parameter_description name field
Unused now.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.l
fs_parser: remove fs_parameter_description name field
Unused now.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
5eede625 |
| 16-Dec-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
fold struct fs_parameter_enum into struct constant_table
no real difference now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
2710c957 |
| 06-Sep-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
fs_parse: get rid of ->enums
Don't do a single array; attach them to fsparam_enum() entry instead. And don't bother trying to embed the names into those - it actually loses memory, with no real spe
fs_parse: get rid of ->enums
Don't do a single array; attach them to fsparam_enum() entry instead. And don't bother trying to embed the names into those - it actually loses memory, with no real speedup worth mentioning.
Simplifies validation as well.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
106bc798 |
| 11-Dec-2019 |
David Howells <dhowells@redhat.com> |
afs: Fix missing cell comparison in afs_test_super()
Fix missing cell comparison in afs_test_super(). Without this, any pair volumes that have the same volume ID will share a superblock, no matter
afs: Fix missing cell comparison in afs_test_super()
Fix missing cell comparison in afs_test_super(). Without this, any pair volumes that have the same volume ID will share a superblock, no matter the cell, unless they're in different network namespaces.
Normally, most users will only deal with a single cell and so they won't see this. Even if they do look into a second cell, they won't see a problem unless they happen to hit a volume with the same ID as one they've already got mounted.
Before the patch:
# ls /afs/grand.central.org/archive linuxdev/ mailman/ moin/ mysql/ pipermail/ stage/ twiki/ # ls /afs/kth.se/ linuxdev/ mailman/ moin/ mysql/ pipermail/ stage/ twiki/ # cat /proc/mounts | grep afs none /afs afs rw,relatime,dyn,autocell 0 0 #grand.central.org:root.cell /afs/grand.central.org afs ro,relatime 0 0 #grand.central.org:root.archive /afs/grand.central.org/archive afs ro,relatime 0 0 #grand.central.org:root.archive /afs/kth.se afs ro,relatime 0 0
After the patch:
# ls /afs/grand.central.org/archive linuxdev/ mailman/ moin/ mysql/ pipermail/ stage/ twiki/ # ls /afs/kth.se/ admin/ common/ install/ OldFiles/ service/ system/ bakrestores/ home/ misc/ pkg/ src/ wsadmin/ # cat /proc/mounts | grep afs none /afs afs rw,relatime,dyn,autocell 0 0 #grand.central.org:root.cell /afs/grand.central.org afs ro,relatime 0 0 #grand.central.org:root.archive /afs/grand.central.org/archive afs ro,relatime 0 0 #kth.se:root.cell /afs/kth.se afs ro,relatime 0 0
Fixes: ^1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Carsten Jacobi <jacobi@de.ibm.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Jonathan Billings <jsbillings@jsbillings.org> cc: Todd DeSantis <atd@us.ibm.com>
show more ...
|
#
bcbccaf2 |
| 09-Dec-2019 |
David Howells <dhowells@redhat.com> |
afs: Fix SELinux setting security label on /afs
Make the AFS dynamic root superblock R/W so that SELinux can set the security label on it. Without this, upgrades to, say, the Fedora filesystem-afs
afs: Fix SELinux setting security label on /afs
Make the AFS dynamic root superblock R/W so that SELinux can set the security label on it. Without this, upgrades to, say, the Fedora filesystem-afs RPM fail if afs is mounted on it because the SELinux label can't be (re-)applied.
It might be better to make it possible to bypass the R/O check for LSM label application through setxattr.
Fixes: 4d673da14533 ("afs: Support the AFS dynamic root") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: selinux@vger.kernel.org cc: linux-security-module@vger.kernel.org
show more ...
|