Revision tags: v5.15.60, v5.15.59 |
|
#
3c59366c |
| 31-Jul-2022 |
NeilBrown <neilb@suse.de> |
NFS: don't unhash dentry during unlink/rename
NFS unlink() (and rename over existing target) must determine if the file is open, and must perform a "silly rename" instead of an unlink (or before rename) if it is. Otherwise the client might hold a file open which has been removed on the server.
Consequently if it determines that the file isn't open, it must block any subsequent opens until the unlink/rename has been completed on the server.
This is currently achieved by unhashing the dentry. This forces any open attempt to the slow-path for lookup which will block on i_rwsem on the directory until the unlink/rename completes. A future patch will change the VFS to only get a shared lock on i_rwsem for unlink, so this will no longer work.
Instead we introduce an explicit interlock. A special value is stored in dentry->d_fsdata while the unlink/rename is running and ->d_revalidate blocks while that value is present. When ->d_revalidate unblocks, the dentry will be invalid. This closes the race without requiring exclusion on i_rwsem.
d_fsdata is already used in two different ways.
 1/ an IS_ROOT directory dentry might have a "devname" stored in d_fsdata. Such a dentry doesn't have a name and so cannot be the target of unlink or rename. For safety we check if an old devname is still stored, and remove it if it is.
 2/ a dentry with DCACHE_NFSFS_RENAMED set will have a 'struct nfs_unlinkdata' stored in d_fsdata. While this is set maydelete() will fail, so an unlink or rename will never proceed on such a dentry.
Neither of these can be in effect when a dentry is the target of unlink or rename. So we can expect d_fsdata to be NULL, and store a special value ((void*)1) which is given the name NFS_FSDATA_BLOCKED to indicate that any lookup will be blocked.
The d_count() is incremented under d_lock() when a lookup finds the dentry, so we check d_count() is low, and set NFS_FSDATA_BLOCKED under the same lock to avoid any races.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
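The interlock described in this commit can be illustrated with a small user-space model. This is only a sketch: struct model_dentry and the model_* helpers are hypothetical names invented here, the "count > 1" threshold is an assumption standing in for "d_count() is low", and the real code performs the check and the store while holding the dentry's d_lock, which this single-threaded model leaves out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Marker value named in the commit message. */
#define NFS_FSDATA_BLOCKED ((void *)1)

/* Hypothetical, simplified stand-in for struct dentry. */
struct model_dentry {
	int   count;   /* models d_count(); the caller's reference counts as 1 */
	void *fsdata;  /* models dentry->d_fsdata */
};

/*
 * Models the check made before an unlink/rename is sent to the server.
 * In the kernel the test and the store happen under d_lock, so a
 * concurrent lookup cannot take a reference in between.
 * Returns true if lookups are now blocked, false if another reference
 * exists (the file may be open) or d_fsdata is already in use.
 */
static bool model_block_lookups(struct model_dentry *d)
{
	if (d->count > 1 || d->fsdata != NULL)  /* threshold is an assumption */
		return false;
	d->fsdata = NFS_FSDATA_BLOCKED;
	return true;
}

/* Models the test ->d_revalidate would make before it proceeds. */
static bool model_lookup_must_wait(const struct model_dentry *d)
{
	return d->fsdata == NFS_FSDATA_BLOCKED;
}

/* Models clearing the marker once the unlink/rename has completed. */
static void model_unblock(struct model_dentry *d)
{
	if (d->fsdata == NFS_FSDATA_BLOCKED)
		d->fsdata = NULL;
}
```

A caller would invoke model_block_lookups() first, fall back to a silly rename if it returns false, and call model_unblock() after the server has replied.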
|
Revision tags: v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51 |
|
#
c77c738c |
| 28-Jun-2022 |
Fabio M. De Francesco <fmdefrancesco@gmail.com> |
nfs: Replace kmap() with kmap_local_page()
The use of kmap() is being deprecated in favor of kmap_local_page().
With kmap_local_page(), the mapping is per thread, CPU local and not globally visible. Furthermore, the mapping can be acquired from any context (including interrupts).
Therefore, use kmap_local_page() in nfs_do_filldir() because this mapping is per thread, CPU local, and not globally visible.
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
6ca0a6f8 |
| 27-Jun-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Fix case insensitive renames
For filesystems that are case insensitive and case preserving, we need to be able to rename from one case folded variant of the filename to another. Currently, if we have looked up the target filename before the call to rename, then we may have a hashed dentry with that target name in the dcache, causing the vfs to optimise away the rename. To avoid that, let's drop the target dentry, and leave it to the server to optimise away the rename if that is the correct thing to do.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
Revision tags: v5.15.50, v5.15.49, v5.15.48, v5.15.47 |
|
#
5ee3d10f |
| 09-Jun-2022 |
Dave Wysochanski <dwysocha@redhat.com> |
NFSv4: Add FMODE_CAN_ODIRECT after successful open of a NFS4.x file
Commit a2ad63daa88b ("VFS: add FMODE_CAN_ODIRECT file flag") added the FMODE_CAN_ODIRECT flag for NFSv3 but neglected to add it for NFSv4.x. This causes direct io on NFSv4.x to fail open with EINVAL:
  mount -o vers=4.2 127.0.0.1:/export /mnt/nfs4
  dd if=/dev/zero of=/mnt/nfs4/file.bin bs=128k count=1 oflag=direct
  dd: failed to open '/mnt/nfs4/file.bin': Invalid argument
  dd of=/dev/null if=/mnt/nfs4/file.bin bs=128k count=1 iflag=direct
  dd: failed to open '/mnt/dir1/file1.bin': Invalid argument
Fixes: a2ad63daa88b ("VFS: add FMODE_CAN_ODIRECT file flag")
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
Revision tags: v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37 |
|
#
aa5dc8c4 |
| 01-May-2022 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
nfs: Convert to free_folio
Add a wrapper that converts back from the folio to the page. This entire file needs to be converted to use folios, but that's a task for a different set of patches.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
Revision tags: v5.15.36, v5.15.35, v5.15.34, v5.15.33 |
|
#
830f1111 |
| 30-Mar-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Replace readdir's use of xxhash() with hash_64()
Both xxhash() and hash_64() appear to give similarly low collision rates with a standard linearly increasing readdir offset. They both give similarly higher collision rates when applied to ext4's offsets.
So switch to using the standard hash_64().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
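For readers unfamiliar with hash_64(), the following user-space sketch reproduces its multiplicative form (multiply by a 64-bit golden-ratio constant and keep the top bits) and shows how a readdir cookie could be turned into a page index. The model_* names are hypothetical, and both the number of index bits and the special-casing of the zero cookie are assumptions made for illustration rather than details taken from the patch.

```c
#include <stdint.h>

/* 64-bit golden-ratio multiplier used by the kernel's generic hash_64(). */
#define GOLDEN_RATIO_64 0x61C8864680B583EBull

/*
 * User-space model of hash_64(): multiply, then keep the top 'bits' bits.
 * 'bits' must be in the range 1..32 for this sketch.
 */
static inline uint32_t model_hash_64(uint64_t val, unsigned int bits)
{
	return (uint32_t)((val * GOLDEN_RATIO_64) >> (64 - bits));
}

/*
 * Hypothetical mapping from a readdir cookie to a page cache index.
 * Mapping cookie 0 (start of directory) to index 0 is an assumption.
 */
static uint32_t model_cookie_to_index(uint64_t cookie, unsigned int bits)
{
	if (cookie == 0)
		return 0;
	return model_hash_64(cookie, bits);
}
```

Because only the top bits of the product are kept, consecutive cookies spread across the index space instead of clustering in adjacent slots.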
|
#
b243874f |
| 29-Mar-2022 |
ChenXiaoSong <chenxiaosong2@huawei.com> |
NFSv4: fix open failure with O_ACCMODE flag
A second open() with the O_ACCMODE|O_DIRECT flags will fail.
Reproducer:
  1. mount -t nfs -o vers=4.2 $server_ip:/ /mnt/
  2. fd = open("/mnt/file", O_ACCMODE|O_DIRECT|O_CREAT)
  3. close(fd)
  4. fd = open("/mnt/file", O_ACCMODE|O_DIRECT)
The server's nfsd4_decode_share_access() fails with nfserr_bad_xdr when the client uses an incorrect share access mode of 0.
Fix this by using the NFS4_SHARE_ACCESS_BOTH share access mode in the client, just as on the first open.
Fixes: ce4ef7c0a8a05 ("NFS: Split out NFS v4 file operations")
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
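The intent of the fix can be sketched in user space as a mapping from the open(2) access-mode bits to an NFSv4 share access value that is never 0. This is not the kernel code: model_share_access() is a hypothetical helper, and only the numeric share access values (1, 2, 3) come from the NFSv4 protocol.

```c
#include <fcntl.h>

/* Share access values defined by the NFSv4 protocol. */
#define NFS4_SHARE_ACCESS_READ  1u
#define NFS4_SHARE_ACCESS_WRITE 2u
#define NFS4_SHARE_ACCESS_BOTH  3u

/*
 * Hypothetical helper: choose a share access mode from open(2) flags.
 * Any access mode other than read-only or write-only, including the
 * unusual value O_ACCMODE itself, is treated as read+write, so the
 * result can never be 0.
 */
static unsigned int model_share_access(int open_flags)
{
	switch (open_flags & O_ACCMODE) {
	case O_RDONLY:
		return NFS4_SHARE_ACCESS_READ;
	case O_WRONLY:
		return NFS4_SHARE_ACCESS_WRITE;
	default:	/* O_RDWR, or all access-mode bits set */
		return NFS4_SHARE_ACCESS_BOTH;
	}
}
```

With this mapping, both opens in the reproducer above would request NFS4_SHARE_ACCESS_BOTH.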
|
Revision tags: v5.15.32, v5.15.31 |
|
#
e47a62df |
| 22-Mar-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Fix revalidation of empty readdir pages
If the page is empty, we need to check the array->last_cookie instead of the first entry. Add a helper for the cases where we care.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
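A minimal sketch of the kind of helper this commit describes is shown below. The types are simplified, hypothetical stand-ins for the readdir cache array (only last_cookie is a name taken from the commit message), and the fixed-size entries[] array exists purely to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical model of one cached directory entry. */
struct model_entry {
	uint64_t cookie;
};

/* Simplified, hypothetical model of one readdir cache page. */
struct model_array {
	uint64_t           last_cookie; /* cookie recorded for the page as a whole */
	size_t             size;        /* number of valid entries */
	struct model_entry entries[4];
};

/*
 * Returns the cookie that identifies the page: the first entry's
 * cookie when the page has entries, otherwise last_cookie.
 */
static uint64_t model_array_index_cookie(const struct model_array *a)
{
	return a->size == 0 ? a->last_cookie : a->entries[0].cookie;
}
```

Revalidation code can then compare this value against the cookie it is looking for without reading an entry that does not exist.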
|
#
648a4548 |
| 21-Mar-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Don't deadlock when cookie hashes collide
In the very rare case where the readdir reply contains multiple cookies that map to the same hash value, we can end up deadlocking waiting for a page lock that we already hold. In this case we should fail the page lock by using grab_cache_page_nowait().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
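The difference between a blocking lock and a "nowait" lock attempt can be modelled in user space with a C11 atomic flag. This is an illustration of the idea only: struct model_page and the model_* functions are hypothetical and unrelated to the real page cache API, where grab_cache_page_nowait() plays the role of the try-lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lockable page. */
struct model_page {
	atomic_flag locked;
};

/*
 * Try to take the page lock without waiting.
 * Returns true on success, false if the lock is already held,
 * including when it is held by the caller itself.
 */
static bool model_trylock_page(struct model_page *p)
{
	return !atomic_flag_test_and_set(&p->locked);
}

/* Release the page lock. */
static void model_unlock_page(struct model_page *p)
{
	atomic_flag_clear(&p->locked);
}
```

If two cookies hash to the same page, the second attempt fails immediately instead of waiting forever on a lock the caller already owns, and the caller can carry on without caching that page.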
|
Revision tags: v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26 |
|
#
612896ec |
| 24-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Cache all entries in the readdirplus reply
Even if we're not able to cache all the entries in the readdir buffer, let's ensure that we do prime the dcache.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
0adf85b4 |
| 27-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Optimise away the previous cookie field
Replace the 'previous cookie' field in struct nfs_entry with the array->last_cookie.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
b0365ccb |
| 23-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Fix up forced readdirplus
Avoid clearing the entire readdir page cache if we're just doing forced readdirplus for the 'ls -l' heuristic.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
f648022f |
| 23-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Convert readdir page cache to use a cookie based index
Instead of using a linear index to address the pages, use the cookie of the first entry, since that is what we use to match the page anyway.
This allows us to avoid re-reading the entire cache on a seekdir() type of operation. The latter is very common when re-exporting NFS, and is a major performance drain.
The change does affect our duplicate cookie detection, since we can no longer rely on the page index as a linear offset for detecting whether we looped backwards. However since we no longer do a linear search through all the pages on each call to nfs_readdir(), this is less of a concern than it was previously. The other downside is that invalidate_mapping_pages() no longer can use the page index to avoid clearing pages that have been read. A subsequent patch will restore the functionality this provides to the 'ls -l' heuristic.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
9332cf14 |
| 26-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Clean up page array initialisation/free
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
Revision tags: v5.15.25 |
|
#
11d03d0a |
| 19-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Trace effects of the readdirplus heuristic
Enable tracking of when the readdirplus heuristic causes a page cache invalidation.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
eace45a1 |
| 19-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Trace effects of readdirplus on the dcache
Trace the effects of readdirplus on attribute and dentry revalidation.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
310e3187 |
| 19-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Add basic readdir tracing
Add tracing to track how often the client goes to the server for updated readdir information.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
0b3cc71b |
| 19-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Don't request readdirplus when revalidation was forced
If the revalidation was forced, due to the presence of a LOOKUP_EXCL or a LOOKUP_REVAL flag, then readdirplus won't help. It also can't help when we're doing a path component lookup.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
2c2c3365 |
| 19-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Readdirplus can't help lookup for case insensitive filesystems
If the filesystem is case insensitive, then readdirplus can't help with cache misses, since it won't return case folded variants of the filename.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
230bc98f |
| 17-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Improve heuristic for readdirplus
The heuristic for readdirplus is designed to try to detect 'ls -l' and similar patterns. It does so by looking for cache hit/miss patterns in both the attribute cache and in the dcache of the files in a given directory, and then sets a flag for the readdirplus code to interpret.
The problem with this approach is that a single attribute or dcache miss can cause the NFS code to force a refresh of the attributes for the entire set of files contained in the directory.
To be able to make a more nuanced decision, let's sample the number of hits and misses in the set of open directory descriptors. That allows us to set thresholds at which we start preferring READDIRPLUS over regular READDIR, or at which we start to force a re-read of the remaining readdir cache using READDIRPLUS.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
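The two-threshold idea can be sketched as a small decision function over sampled hit and miss counters. Everything below is hypothetical: the names, the enum, and in particular the threshold values are placeholders chosen for illustration and are not taken from the patch.

```c
/* Hypothetical thresholds; the real values are not stated in the commit message. */
#define MODEL_USAGE_THRESHOLD 8u   /* enough samples to prefer READDIRPLUS */
#define MODEL_MISS_THRESHOLD  16u  /* enough misses to force a re-read */

enum model_decision {
	MODEL_USE_READDIR,        /* too little evidence: plain READDIR */
	MODEL_PREFER_READDIRPLUS, /* attributes are being used: prefer READDIRPLUS */
	MODEL_FORCE_REREAD,       /* many misses: re-read the rest with READDIRPLUS */
};

/*
 * Decide how to read the directory from sampled cache statistics.
 * 'hits' and 'misses' model attribute/dcache lookups observed for
 * the directory's open descriptors.
 */
static enum model_decision model_decide(unsigned int hits, unsigned int misses)
{
	if (misses > MODEL_MISS_THRESHOLD)
		return MODEL_FORCE_REREAD;
	if (hits + misses > MODEL_USAGE_THRESHOLD)
		return MODEL_PREFER_READDIRPLUS;
	return MODEL_USE_READDIR;
}
```

The point of the model is that a single miss no longer triggers a refresh of the whole directory; only a sustained pattern crosses either threshold.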
|
#
9c3f4d98 |
| 17-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Reduce use of uncached readdir
When reading a very large directory, we want to try to keep the page cache up to date if doing so is inexpensive. With the change to allow readdir to continue reading even when the cache is incomplete, we no longer need to fall back to uncached readdir in order to scale to large directories.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
Revision tags: v5.15.24, v5.15.23, v5.15.22 |
|
#
9ff89c25 |
| 07-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Simplify nfs_readdir_xdr_to_array()
Recent changes to readdir mean that we can cope with partially filled page cache entries, so we no longer need to rely on looping in nfs_readdir_xdr_to_array().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
6c34f05b |
| 22-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: If the cookie verifier changes, we must invalidate the page cache
Ensure that if the cookie verifier changes when we use the zero-valued cookie, then we invalidate any cached pages.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
580f2367 |
| 07-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Adjust the amount of readahead performed by NFS readdir
The current NFS readdir code will always try to maximise the amount of readahead it performs on the assumption that we can cache anything that isn't immediately read by the process. There are several cases where this assumption breaks down, including when the 'ls -l' heuristic kicks in to try to force use of readdirplus as a batch replacement for lookup/getattr.
This patch therefore tries to tone down the amount of readahead we perform, and adjust it to try to match the amount of data being requested by user space.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
c8f0523b |
| 26-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Don't advance the page pointer unless the page is full
When we hit the end of the data in the readdir page, we don't want to start filling a new page, unless this one is full.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|