#
1750d929 |
| 26-Jul-2017 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFS: Don't compare apples to elephants to determine access bits
The NFS_ACCESS_* flags aren't a 1:1 mapping to the MAY_* flags, so checking for MAY_WHATEVER might have surprising results in nfs*_proc_access(). Let's simplify this check when determining which bits to ask for, and do it in a generic place instead of copying code for each NFS version.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
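A minimal userspace sketch of what "determining which bits to ask for" in one generic place could look like. The bit values follow the ACCESS definitions in the NFSv3 and NFSv4 protocols; the function name, the `is_dir` parameter, and the exact choice of bits per file type are assumptions made for illustration, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* ACCESS bits as defined by the NFSv3/NFSv4 protocols. */
#define NFS_ACCESS_READ    0x0001
#define NFS_ACCESS_LOOKUP  0x0002
#define NFS_ACCESS_MODIFY  0x0004
#define NFS_ACCESS_EXTEND  0x0008
#define NFS_ACCESS_DELETE  0x0010
#define NFS_ACCESS_EXECUTE 0x0020

/*
 * Hypothetical helper: choose the ACCESS bits to request from the server
 * based only on the file type, instead of translating MAY_* flags
 * separately in each protocol version's access routine.
 */
uint32_t nfs_access_request_mask(bool is_dir)
{
    uint32_t mask = NFS_ACCESS_READ | NFS_ACCESS_MODIFY | NFS_ACCESS_EXTEND;

    if (is_dir)
        mask |= NFS_ACCESS_DELETE | NFS_ACCESS_LOOKUP; /* directory-only bits */
    else
        mask |= NFS_ACCESS_EXECUTE;                    /* non-directory bit */
    return mask;
}
```

Asking for every bit that is meaningful for the file type means a single ACCESS reply can answer later permission checks of any kind from the cache.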
|
#
3c181827 |
| 26-Jul-2017 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFS: Create NFS_ACCESS_* flags
Passing the NFS v4 flags into the v3 code seems weird to me, even if they are defined to the same values. This patch adds in generic flags to help me feel better
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
03c6f7d6 |
| 15-Aug-2017 |
NeilBrown <neilb@suse.com> |
NFS: remove jiffies field from access cache
This field hasn't been used since commit 57b691819ee2 ("NFS: Cache access checks more aggressively").
Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
ecbb903c |
| 11-Jul-2017 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Be more careful about mapping file permissions
When mapping a directory, we want the MAY_WRITE permissions to reflect whether or not we have permission to modify, add and delete the directory entries. MAY_EXEC must map to lookup permissions.
On the other hand, for files, we want MAY_WRITE to reflect a permission to modify and extend the file.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
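The mapping described above can be sketched in plain C as follows. This is an illustration under stated assumptions: the function name and `is_dir` parameter are hypothetical, the MAY_* values mirror the usual kernel definitions, and mapping MAY_EXEC to the EXECUTE bit for regular files is an assumption that the message itself does not spell out.

```c
#include <stdbool.h>
#include <stdint.h>

/* ACCESS bits as defined by the NFSv3/NFSv4 protocols. */
#define NFS_ACCESS_READ    0x0001
#define NFS_ACCESS_LOOKUP  0x0002
#define NFS_ACCESS_MODIFY  0x0004
#define NFS_ACCESS_EXTEND  0x0008
#define NFS_ACCESS_DELETE  0x0010
#define NFS_ACCESS_EXECUTE 0x0020

/* Local stand-ins for the VFS permission flags. */
#define MAY_EXEC  0x1
#define MAY_WRITE 0x2
#define MAY_READ  0x4

/* Hypothetical helper: turn a server ACCESS result into MAY_* bits. */
int nfs_access_to_may_mask(uint32_t access, bool is_dir)
{
    const uint32_t dir_write =
        NFS_ACCESS_MODIFY | NFS_ACCESS_EXTEND | NFS_ACCESS_DELETE;
    const uint32_t file_write = NFS_ACCESS_MODIFY | NFS_ACCESS_EXTEND;
    int mask = 0;

    if (access & NFS_ACCESS_READ)
        mask |= MAY_READ;
    if (is_dir) {
        /* Writing a directory means modifying, adding and deleting entries. */
        if ((access & dir_write) == dir_write)
            mask |= MAY_WRITE;
        /* For directories, MAY_EXEC means permission to look up entries. */
        if (access & NFS_ACCESS_LOOKUP)
            mask |= MAY_EXEC;
    } else {
        /* Writing a file means modifying and extending it. */
        if ((access & file_write) == file_write)
            mask |= MAY_WRITE;
        if (access & NFS_ACCESS_EXECUTE) /* assumption, see lead-in */
            mask |= MAY_EXEC;
    }
    return mask;
}
```

Note that MAY_WRITE is only granted when all of the required bits are present, so a directory on which the server allows MODIFY and EXTEND but not DELETE is not reported as writable.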
|
#
bd8b2441 |
| 11-Jul-2017 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Store the raw NFS access mask in the inode's access cache
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
15d4b73a |
| 11-Jul-2017 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Refactor NFS access to kernel access mask calculation
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
Revision tags: v4.12 |
|
#
774d9513 |
| 29-Jun-2017 |
Peng Tao <tao.peng@primarydata.com> |
nfs: replace d_add with d_splice_alias in atomic_open
It's a trivial change, but it follows the knfsd export documentation, which asks for d_splice_alias during lookup.
Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
eaa2b82c |
| 03-Jul-2017 |
NeilBrown <neilb@suse.com> |
NFS: guard against confused server in nfs_atomic_open()
A confused server could return a filehandle for an NFSv4 OPEN request, which it previously returned for a directory. So the inode returned by ->open_context() in nfs_atomic_open() could conceivably be a directory inode.
This has particular implications for the call to nfs_file_set_open_context() in nfs_finish_open(). If that is called on a directory inode, then the nfs_open_context that gets stored in the filp->private_data will be linked to nfs_inode->open_files.
When the directory is closed, nfs_closedir() will (ultimately) free the ->private_data, but not unlink it from nfs_inode->open_files (because it doesn't expect an nfs_open_context there).
Subsequently the memory could get used for something else and eventually if the ->open_files list is walked, the walker will fall off the end and crash.
So: change nfs_finish_open() to only call nfs_file_set_open_context() for regular-file inodes.
This failure mode has been seen in a production setting (unknown NFS server implementation). The kernel was v3.0 and the specific sequence seen would not affect more recent kernels, but I think a risk is still present, and caution is wise.
Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
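The guard can be illustrated with a small userspace model. All types and names here are hypothetical stand-ins for the kernel structures mentioned above; the only point shown is that the open context is attached to the file solely when the inode is a regular file.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects named in the message. */
enum inode_kind { KIND_REGULAR, KIND_DIRECTORY, KIND_OTHER };

struct open_ctx_stub {
    int id;
};

struct file_stub {
    enum inode_kind kind;               /* type of the inode behind the file */
    struct open_ctx_stub *private_data; /* models filp->private_data */
};

/*
 * Attach the open context only for regular files. For a directory
 * (for example one returned by a confused server) nothing is stored,
 * so the directory close path never meets an open context that it
 * does not know how to unlink.
 */
bool finish_open_attach(struct file_stub *filp, struct open_ctx_stub *ctx)
{
    if (filp->kind != KIND_REGULAR)
        return false;
    filp->private_data = ctx;
    return true;
}
```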
|
#
cc89684c |
| 04-Jul-2017 |
NeilBrown <neilb@suse.com> |
NFS: only invalidate dentrys that are clearly invalid.
Since commit bafc9b754f75 ("vfs: More precise tests in d_invalidate") in v3.18, a return of '0' from ->d_revalidate() will cause the dentry to be invalidated even if it has filesystems mounted on it or on a descendant. The mounted filesystem is unmounted.
This means we need to be careful not to return 0 unless the directory referred to truly is invalid. So -ESTALE or -ENOENT should invalidate the directory. Other errors such as -EPERM or -ERESTARTSYS should be returned from ->d_revalidate() so they are propagated to the caller.
A particular problem can be demonstrated by:
1/ mount an NFS filesystem using NFSv3 on /mnt
2/ mount any other filesystem on /mnt/foo
3/ ls /mnt/foo
4/ turn off network, or otherwise make the server unable to respond
5/ ls /mnt/foo &
6/ cat /proc/$!/stack # note that nfs_lookup_revalidate is in the call stack
7/ kill -9 $! # this results in -ERESTARTSYS being returned
8/ observe that /mnt/foo has been unmounted.
This patch changes nfs_lookup_revalidate() to only treat -ESTALE from nfs_lookup_verify_inode() and -ESTALE or -ENOENT from ->lookup() as indicating an invalid inode. Other errors are returned.
Also nfs_check_inode_attributes() is changed to return -ESTALE rather than -EIO. This is consistent with the error returned in similar circumstances from nfs_update_inode().
As this bug allows any user to unmount a filesystem mounted on an NFS filesystem, this fix is suitable for stable kernels.
Fixes: bafc9b754f75 ("vfs: More precise tests in d_invalidate") Cc: stable@vger.kernel.org (v3.18+) Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
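The error classification described above can be sketched as a small standalone function. The names are hypothetical; `SKETCH_ERESTARTSYS` is a local stand-in because ERESTARTSYS is a kernel-internal code (512) that userspace `<errno.h>` does not define. The return convention follows the message: 1 for a valid dentry, 0 for an invalid one, and a negative errno for anything that should be propagated to the caller.

```c
#include <errno.h>

/* Kernel-internal error code, defined locally for this sketch. */
#define SKETCH_ERESTARTSYS 512

/* Where the error came from (hypothetical labels for the two call sites). */
enum reval_source { FROM_VERIFY_INODE, FROM_LOOKUP };

/*
 * Returns 1 if the dentry is still valid, 0 if it is clearly invalid,
 * or a negative errno that the caller should see unchanged.
 */
int classify_revalidate_error(enum reval_source src, int err)
{
    if (err == 0)
        return 1;
    if (err == -ESTALE)
        return 0; /* stale from either source: truly invalid */
    if (src == FROM_LOOKUP && err == -ENOENT)
        return 0; /* the name no longer exists on the server */
    return err;   /* e.g. -EPERM or -ERESTARTSYS: do not invalidate */
}
```

With this shape, the interrupted `ls` in step 7 yields -ERESTARTSYS to the caller instead of a 0 that would trigger the unmount of /mnt/foo.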
|
#
818a8dbe |
| 16-Jun-2017 |
Benjamin Coddington <bcodding@redhat.com> |
NFS: nfs_rename() - revalidate directories on -ERESTARTSYS
An interrupted rename will leave the old dentry behind if the rename succeeds. Fix this by forcing a lookup the next time through ->d_revalidate.
A previous attempt at solving this problem took the approach of completing the work of the rename asynchronously. However, that approach was wrong, since it would allow the d_move() to occur after the directory's i_mutex had been dropped by the original process.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
a7a3b1e9 |
| 20-Jun-2017 |
Benjamin Coddington <bcodding@redhat.com> |
NFS: convert flags to bool
NFS uses some int, and unsigned int :1, and bool as flags in structs and args. Assert the preference for uniformly replacing these with the bool type.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
d9f29500 |
| 16-Jun-2017 |
Benjamin Coddington <bcodding@redhat.com> |
Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"
This reverts commit 920b4530fb80430ff30ef83efe21ba1fa5623731 which could call d_move() without holding the directory's i_mutex, and reverts commit d4ea7e3c5c0e341c15b073016dbf3ab6c65f12f3 "NFS: Fix old dentry rehash after move", which was a follow-up fix.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 920b4530fb80 ("NFS: nfs_rename() handle -ERESTARTSYS dentry left behind") Cc: stable@vger.kernel.org # v4.10+ Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
Revision tags: v4.10.17, v4.10.16, v4.10.15 |
|
#
0795bf83 |
| 03-May-2017 |
Fabian Frederick <fabf@skynet.be> |
nfs: use kmap/kunmap directly
This patch removes the useless nfs_readdir_get_array() and nfs_readdir_release_array(), as suggested by Trond Myklebust.
nfs_readdir() calls nfs_revalidate_mapping() before readdir_search_pagecache(), nfs_do_filldir() and uncached_readdir(), so the mapping should be correct.
Since kmap() can't fail, all subsequent error checks were removed, as well as unused labels.
Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
Revision tags: v4.10.14, v4.10.13, v4.10.12, v4.10.11, v4.10.10, v4.10.9, v4.10.8, v4.10.7, v4.10.6, v4.10.5, v4.10.4, v4.10.3, v4.10.2 |
|
#
b044f645 |
| 10-Mar-2017 |
Benjamin Coddington <bcodding@redhat.com> |
NFS: switch back to to ->iterate()
NFS has some optimizations for readdir to choose between using READDIR or READDIRPLUS based on workload, and which NFS operation to use is determined by subsequent interactions with lookup, d_revalidate, and getattr.
Concurrent use of nfs_readdir() via ->iterate_shared() can cause those optimizations to repeatedly invalidate the pagecache used to store directory entries during readdir(), which causes some very bad performance for directories with many entries (more than about 10000).
There's a couple ways to fix this in NFS, but no fix would be as simple as going back to ->iterate() to serialize nfs_readdir(), and neither fix I tested performed as well as going back to ->iterate().
The first required taking the directory's i_lock for each entry, with the result of terrible contention.
The second way adds another flag to the nfs_inode, and so keeps the optimizations working for large directories. The difference from using ->iterate() here is that much more memory is consumed for a given workload without any performance gain.
The workings of nfs_readdir() are such that concurrent users are serialized within read_cache_page() waiting to retrieve pages of entries from the server. By serializing this work in iterate_dir() instead, contention for cache pages is reduced. Waiting processes can have an uncontended pass at the entirety of the directory's pagecache once previous processes have completed filling it.
v2 - Keep the bits needed for parallel lookup
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
d4ea7e3c |
| 15-Mar-2017 |
Benjamin Coddington <bcodding@redhat.com> |
NFS: Fix old dentry rehash after move
Now that nfs_rename()'s d_move has moved within the RPC task's rpc_call_done callback, rehashing new_dentry will actually rehash the old dentry's name in nfs_rename(). d_move() is going to rehash the new dentry for us anyway, so doing it again here is unnecessary.
Reported-by: Chuck Lever <chuck.lever@oracle.com> Fixes: 920b4530fb80 ("NFS: nfs_rename() handle -ERESTARTSYS dentry left behind") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
Revision tags: v4.10.1, v4.10 |
|
#
920b4530 |
| 31-Jan-2017 |
Benjamin Coddington <bcodding@redhat.com> |
NFS: nfs_rename() handle -ERESTARTSYS dentry left behind
An interrupted rename will leave the old dentry behind if the rename succeeds. Fix this by moving the final local work of the rename to rpc_call_done so that the results of the RENAME can always be handled, even if the original process has already returned with -ERESTARTSYS.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
21c3ba7e |
| 16-Dec-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Fix and clean up the access cache validity checking
The access cache needs to check whether or not the mode bits, ownership, or ACL has changed or the cache has timed out.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
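As a rough illustration of the check described above, the following sketch treats a cached access result as usable only if none of the attributes that decide permissions is known to be stale and the cache has not timed out. The flag names, the struct, and the timing fields are all hypothetical; the kernel tracks this state in its own inode cache-validity flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical staleness flags for the attributes that decide access. */
#define SKETCH_INVALID_MODE  0x1u
#define SKETCH_INVALID_OWNER 0x2u
#define SKETCH_INVALID_ACL   0x4u

struct access_cache_state {
    unsigned int invalid; /* which attributes are known to be stale */
    uint64_t read_time;   /* when the attributes were last refreshed */
    uint64_t timeout;     /* how long they may be trusted afterwards */
};

/* Returns true if a cached access result may still be used at time 'now'. */
bool access_cache_usable(const struct access_cache_state *s, uint64_t now)
{
    const unsigned int relevant =
        SKETCH_INVALID_MODE | SKETCH_INVALID_OWNER | SKETCH_INVALID_ACL;

    if (s->invalid & relevant)
        return false; /* mode bits, ownership or ACL may have changed */
    if (now - s->read_time > s->timeout)
        return false; /* the cache has timed out */
    return true;
}
```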
|
#
9cdd1d3f |
| 16-Dec-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Only look at the change attribute cache state in nfs_weak_revalidate()
Just like in nfs_check_verifier(), we want to use nfs_mapping_need_revalidate_inode() to check that our knowledge of the change attribute is up to date.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
Revision tags: v4.9 |
|
#
dff25ddb |
| 02-Dec-2016 |
Andreas Gruenbacher <agruenba@redhat.com> |
nfs: add support for the umask attribute
Clients can set the umask attribute when creating files to cause the server to apply it always except when inheriting permissions from the parent directory. That way, the new files will end up with the same permissions as files created locally.
See https://tools.ietf.org/html/draft-ietf-nfsv4-umask-02 for more details.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
1cd9cb05 |
| 04-Dec-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Only look at the change attribute cache state in nfs_check_verifier
When looking at whether or not our dcache is valid, we really don't care about the general state of the directory attribute cache. Instead, we only care about the state of the change attribute.
This fixes a performance issue when the client is responsible for changing the directory contents; a number of NFSv4 operations will atomically update the directory change attribute, but may not return all the other attributes.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
1bcf4c5c |
| 02-Dec-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Allow getattr to also report readdirplus cache hits
If the user called stat() on an 'ls -l' workload, and the attribute cache was successfully revalidated by READDIRPLUS, then we want to report that back so that the readdir code continues to use readdirplus.
Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
Revision tags: openbmc-4.4-20161121-1 |
|
#
63519fbc |
| 19-Nov-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Be more targeted about readdirplus use when doing lookup/revalidation
There is little point in setting NFS_INO_ADVISE_RDPLUS in nfs_lookup and nfs_lookup_revalidate() unless a process is actually doing readdir on the parent directory. Furthermore, there is little point in using readdirplus if we're trying to revalidate a negative dentry.
Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
79f687a3 |
| 19-Nov-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Fix a performance regression in readdir
Ben Coddington reports that commit 311324ad1713, by adding the function nfs_dir_mapping_need_revalidate(), which checks page cache validity on each call to nfs_readdir(), causes a performance regression when the directory is being modified.
If the directory is changing while we're iterating through the directory, POSIX does not require us to invalidate the page cache unless the user calls rewinddir(). However, we still do want to ensure that we use readdirplus in order to avoid a load of stat() calls when the user is doing an 'ls -l' workload.
The fix should be to invalidate the page cache immediately when we're setting the NFS_INO_ADVISE_RDPLUS bit.
Reported-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 311324ad1713 ("NFS: Be more aggressive in using readdirplus...") Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
Revision tags: v4.4.33, v4.4.32, v4.4.31, v4.4.30, v4.4.29, v4.4.28, v4.4.27, v4.7.10, openbmc-4.4-20161021-1, v4.7.9, v4.4.26, v4.7.8, v4.4.25 |
|
#
532d4def |
| 12-Oct-2016 |
NeilBrown <neilb@suse.com> |
NFSv4: add flock_owner to open context
An open file description (struct file) in a given process can be associated with two different lock owners.
It can have a Posix lock owner which will be different in each process that has a fd on the file. It can have a Flock owner which will be the same in all processes.
When searching for a lock stateid to use, we need to consider both of these owners.
So add a new "flock_owner" to the "nfs_open_context" (of which there is one for each open file description).
This flock_owner does not need to be reference-counted as there is a 1-1 relation between 'struct file' and nfs open contexts, and it will never be part of a list of contexts. So there is no need for a 'flock_context' - just the owner is enough.
The io_count included in the (Posix) lock_context provides no guarantee that all read-aheads that could use the state have completed, so not supporting it for flock locks is not a serious problem. Synchronization between flock and read-ahead can be added later if needed.
When creating an open_context for a non-opening create call, we don't have a 'struct file' to pass in, so the lock context gets initialized with a NULL owner, but this will never be used.
The flock_owner is not used at all in this patch; that will come later.
Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
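The two kinds of owner can be modelled with a few lines of C. Everything here is a hypothetical illustration: `owner_t`, the struct, and the helper are not the kernel's definitions. The sketch only shows the property described above, namely that the flock owner is tied to the open file description (and is therefore shared by every process holding it), while the Posix owner is tied to the process.

```c
#include <stddef.h>

/* Opaque owner identity; only compared for equality in this sketch. */
typedef const void *owner_t;

/* Hypothetical model of an open context: one per open file description. */
struct open_context_sketch {
    owner_t flock_owner; /* fixed when the file is opened */
};

enum owner_kind { OWNER_POSIX, OWNER_FLOCK };

/*
 * Pick the owner to search lock state for. 'process_owner' stands for
 * whatever identifies the calling process (an assumption of this sketch).
 */
owner_t lock_owner_for(const struct open_context_sketch *ctx,
                       owner_t process_owner, enum owner_kind kind)
{
    return kind == OWNER_FLOCK ? ctx->flock_owner : process_owner;
}
```

Because the flock owner lives directly in the per-file context, no separate reference-counted object is needed for it, which matches the reasoning given in the message.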
|
Revision tags: v4.4.24, v4.7.7, v4.8, v4.4.23, v4.7.6, v4.7.5, v4.4.22 |
|
#
7dc72d5f |
| 22-Sep-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Fix inode corruption in nfs_prime_dcache()
Due to inode number reuse in filesystems, we can end up corrupting the inode on our client if we apply the file attributes without ensuring that the filehandle matches. Typical symptoms include spurious "mode changed" reports in the syslog.
We still do want to ensure that we don't invalidate the dentry if the inode number matches, but we don't have a filehandle.
Fixes: fa9233699cc1 ("NFS: Don't require a filehandle to refresh...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.0+ Tested-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
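The decision described above can be sketched as a three-way choice. The types, the enum, and the function names are hypothetical; the sketch assumes a filehandle is an opaque byte string with a length, and that a zero length means the directory entry carried no filehandle.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_FH_MAXSIZE 128

/* Opaque filehandle: 'size' bytes of 'data'; size 0 means "not supplied". */
struct fh_sketch {
    unsigned short size;
    unsigned char data[SKETCH_FH_MAXSIZE];
};

struct cached_inode { uint64_t fileid; struct fh_sketch fh; };
struct dir_entry    { uint64_t fileid; struct fh_sketch fh; };

enum prime_action {
    PRIME_REPLACE, /* different file: do not touch the cached inode */
    PRIME_KEEP,    /* same inode number, no filehandle: keep the dentry */
    PRIME_REFRESH  /* same file: safe to apply the new attributes */
};

static bool fh_equal(const struct fh_sketch *a, const struct fh_sketch *b)
{
    return a->size == b->size && memcmp(a->data, b->data, a->size) == 0;
}

enum prime_action prime_decision(const struct cached_inode *inode,
                                 const struct dir_entry *entry)
{
    if (entry->fileid != inode->fileid)
        return PRIME_REPLACE;
    if (entry->fh.size == 0)
        return PRIME_KEEP; /* cannot prove identity, so apply nothing */
    if (!fh_equal(&entry->fh, &inode->fh))
        return PRIME_REPLACE; /* inode number was reused by another file */
    return PRIME_REFRESH;
}
```

The middle case is the one the second paragraph of the message is about: a matching inode number without a filehandle is not enough to refresh attributes, but it is also no reason to invalidate the dentry.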
|