#
f98a128a |
| 16-Apr-2014 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: update inode fields according to issued caps

Cap message and request reply from non-auth MDS may carry stale information (corresponding locks are in LOCK states) even they have the newest inode version. So client should update inode fields according to issued caps.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
0a8a70f9 |
| 14-Apr-2014 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: clear directory's completeness when creating file

When creating a file, ceph_set_dentry_offset() puts the new dentry at the end of directory's d_subdirs, then set the dentry's offset based on directory's max offset. The offset does not reflect the real postion of the dentry in directory. Later readdir reply from MDS may change the dentry's position/offset. This inconsistency can cause missing/duplicate entries in readdir result if readdir is partly satisfied by dcache_readdir().

The fix is clear directory's completeness after creating/renaming file. It prevents later readdir from using dcache_readdir().

Fixes: http://tracker.ceph.com/issues/8025
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
Revision tags: v3.15-rc1 |
|
#
48193012 |
| 01-Apr-2014 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: don't grabs open file reference for aborted request

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
Revision tags: v3.14, v3.14-rc8 |
|
#
5f75ce57 |
| 21-Mar-2014 |
Fabian Frederick <fabf@skynet.be> |
ceph: Remove get/set acl on symlinks

Remove unsupported symlink operations.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
|
Revision tags: v3.14-rc7, v3.14-rc6 |
|
#
8c93cd61 |
| 08-Mar-2014 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: update i_max_size even if inode version does not change

handle following sequence of events:
- client releases a inode with i_max_size > 0. The release message is queued. (is not sent to the auth MDS)
- a 'lookup' request reply from non-auth MDS returns the same inode.
- client opens the inode in write mode. The version of inode trace in 'open' request reply is equal to the cached inode's version.
- client requests new max size. The MDS ignores the request because it does not affect client's write range

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
#
19913b4e |
| 06-Mar-2014 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: add get_name() NFS export callback

Use the newly introduced LOOKUPNAME MDS request to connect child inode to its parent directory.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
Revision tags: v3.14-rc5, v3.14-rc4, v3.14-rc3, v3.14-rc2, v3.14-rc1, v3.13, v3.13-rc8, v3.13-rc7, v3.13-rc6, v3.13-rc5, v3.13-rc4, v3.13-rc3, v3.13-rc2, v3.13-rc1, v3.12, v3.12-rc7, v3.12-rc6, v3.12-rc5, v3.12-rc4, v3.12-rc3, v3.12-rc2, v3.12-rc1, v3.11, v3.11-rc7, v3.11-rc6, v3.11-rc5, v3.11-rc4, v3.11-rc3, v3.11-rc2, v3.11-rc1, v3.10, v3.10-rc7, v3.10-rc6, v3.10-rc5, v3.10-rc4, v3.10-rc3, v3.10-rc2, v3.10-rc1, v3.9, v3.9-rc8, v3.9-rc7, v3.9-rc6, v3.9-rc5, v3.9-rc4, v3.9-rc3, v3.9-rc2, v3.9-rc1, v3.8, v3.8-rc7 |
|
#
752c8bdc |
| 05-Feb-2013 |
Sage Weil <sage@inktank.com> |
ceph: do not chain inode updates to parent fsync

The fsync(dirfd) only covers namespace operations, not inode updates. We do not need to cover setattr variants or O_TRUNC.

Reported-by: Al Viro <viro@xeniv.linux.org.uk>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
72466d0b |
| 29-Jan-2014 |
Sage Weil <sage@inktank.com> |
ceph: fix posix ACL hooks

The merge of commit 7221fe4c2ed7 ("ceph: add acl for cephfs") raced with upstream changes in the generic POSIX ACL code (eg commit 2aeccbe957d0 "fs: add generic xattr_acl handlers" and others).

Some of the fallout was fixed in commit 4db658ea0ca ("ceph: Fix up after semantic merge conflict"), but it was incomplete: the set_acl inode_operation wasn't getting set, and the prototype needed to be adjusted a bit (it doesn't take a dentry anymore).

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
4db658ea |
| 28-Jan-2014 |
Linus Torvalds <torvalds@linux-foundation.org> |
ceph: Fix up after semantic merge conflict

The previous ceph-client merge resulted in ceph not even building, because there was a merge conflict that wasn't visible as an actual data conflict: commit 7221fe4c2ed7 ("ceph: add acl for cephfs") added support for POSIX ACL's into Ceph, but unluckily we also had the VFS tree change a lot of the POSIX ACL helper functions to be much more helpful to filesystems (see for example commits 2aeccbe957d0 "fs: add generic xattr_acl handlers", 5bf3258fd2ac "fs: make posix_acl_chmod more useful" and 37bc15392a23 "fs: make posix_acl_create more useful")

The reason this conflict wasn't obvious was many-fold: because it was a semantic conflict rather than a data conflict, it wasn't visible in the git merge as a conflict. And because the VFS tree hadn't been in linux-next, people hadn't become aware of it that way. And because I was at jury duty this morning, I was using my laptop and as a result not doing constant "allmodconfig" builds.

Anyway, this fixes the build and generally removes a fair chunk of the Ceph POSIX ACL support code, since the improved helpers seem to match really well for Ceph too.

But I don't actually have any way to *test* the end result, and I was really hoping for some ACK's for this. Oh, well. Not compiling certainly doesn't make things easier to test, so I'm committing this without the acks after having waited for four hours... Plus it's what I would have done for the merge had I noticed the semantic conflict..

Reported-by: Dave Jones <davej@redhat.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Guangliang Zhao <lucienchao@gmail.com>
Cc: Li Wang <li.wang@ubuntykylin.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
11df2dfb |
| 24-Nov-2013 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: add imported caps when handling cap export message

Version 3 cap export message includes information about the imported caps. It allows us to add the imported caps if the corresponding cap import message still hasn't been received. This allow us to handle situation that the importer MDS crashes and the cap import message is missing.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
9563f88c |
| 21-Nov-2013 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: fix cache revoke race

handle following sequence of events:
- non-auth MDS revokes Fc cap. queue invalidate work
- auth MDS issues Fc cap through request reply. i_rdcache_gen gets increased.
- invalidate work runs. it finds i_rdcache_revoking != i_rdcache_gen, so it does nothing.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
7221fe4c |
| 11-Nov-2013 |
Guangliang Zhao <lucienchao@gmail.com> |
ceph: add acl for cephfs

Signed-off-by: Guangliang Zhao <lucienchao@gmail.com>
Reviewed-by: Li Wang <li.wang@ubuntykylin.com>
Reviewed-by: Zheng Yan <zheng.z.yan@intel.com>
|
#
9f12bd11 |
| 20-Sep-2013 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: drop unconnected inodes

Positve dentry and corresponding inode are always accompanied in MDS reply. So no need to keep inode in the cache after dropping all its aliases.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
#
86b58d13 |
| 04-Dec-2013 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: initialize inode before instantiating dentry

commit b18825a7c8 (Put a small type field into struct dentry::d_flags) put a type field into struct dentry::d_flags. __d_instantiate() set the field by checking inode->i_mode. So we should initialize inode before instantiating dentry when handling mds reply.

Fixes: http://tracker.ceph.com/issues/6930
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
#
81c6aea5 |
| 17-Sep-2013 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: handle frag mismatch between readdir request and reply

If client has outdated directory fragments information, it may request readdir an non-existent directory fragment. In this case, the MDS finds an approximate directory fragment and sends its contents back to the client. When receiving a reply with fragment that is different than the requested one, the client need to reset the 'readdir offset'.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
#
53e879a4 |
| 17-Sep-2013 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: remove outdated frag information

If directory fragments change, fill_inode() inserts new frags into the fragtree, but it does not remove outdated frags from the fragtree. This patch fixes it.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
#
ed284c49 |
| 02-Sep-2013 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: remove ceph_lookup_inode()

commit 6f60f889 (ceph: fix freeing inode vs removing session caps race) introduced ceph_lookup_inode(). But there is already a ceph_find_inode() which provides similar function. So remove ceph_lookup_inode(), use ceph_find_inode() instead.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Alex Elder <alex.elder@linary.org>
Reviewed-by: Sage Weil <sage@inktank.com>
|
#
99ccbd22 |
| 21-Aug-2013 |
Milosz Tanski <milosz@adfin.com> |
ceph: use fscache as a local presisent cache

Adding support for fscache to the Ceph filesystem. This would bring it to on par with some of the other network filesystems in Linux (like NFS, AFS, etc...)

In order to mount the filesystem with fscache the 'fsc' mount option must be passed.

Signed-off-by: Milosz Tanski <milosz@adfin.com>
Signed-off-by: Sage Weil <sage@inktank.com>
|
#
b0d7c223 |
| 12-Aug-2013 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: introduce i_truncate_mutex

I encountered below deadlock when running fsstress

wmtruncate work         truncate            MDS
---------------         ------------------  --------------------------
                        lock i_mutex
                                            <- truncate file
lock i_mutex (blocked)
                                            <- revoking Fcb (filelock to MIX)
                        send request ->
                                            handle request (xlock filelock)

At the initial time, there are some dirty pages in the page cache. When the kclient receives the truncate message, it reduces inode size and creates some 'out of i_size' dirty pages. wmtruncate work can't truncate these dirty pages because it's blocked by the i_mutex. Later when the kclient receives the cap message that revokes Fcb caps, It can't flush all dirty pages because writepages() only flushes dirty pages within the inode size.

When the MDS handles the 'truncate' request from kclient, it waits for the filelock to become stable. But the filelock is stuck in unstable state because it can't finish revoking kclient's Fcb caps.

The truncate pagecache locking has already caused lots of trouble for use. I think it's time simplify it by introducing a new mutex. We use the new mutex to prevent concurrent truncate_inode_pages(). There is no need to worry about race between buffered write and truncate_inode_pages(), because our "get caps" mechanism prevents them from concurrent execution.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
ee3e542f |
| 15-Aug-2013 |
Sage Weil <sage@inktank.com> |
Merge remote-tracking branch 'linus/master' into testing
|
#
6f60f889 |
| 23-Jul-2013 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: fix freeing inode vs removing session caps race

remove_session_caps() uses iterate_session_caps() to remove caps, but iterate_session_caps() skips inodes that are being deleted. So session->s_nr_caps can be non-zero after iterate_session_caps() return.

We can fix the issue by waiting until deletions are complete. __wait_on_freeing_inode() is designed for the job, but it is not exported, so we use lookup inode function to access it.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
85ce127a |
| 21-Jul-2013 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: wake up writer if vmtruncate work get blocked

To write data, the writer first acquires the i_mutex, then try getting caps. The writer may sleep while holding the i_mutex. If the MDS revokes Fb cap in this case, vmtruncate work can't do its job because i_mutex is locked. We should wake up the writer and let it truncate the pages.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
#
9a5889ae |
| 09-Jul-2013 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

Pull Ceph updates from Sage Weil:
 "There is some follow-on RBD cleanup after the last window's code drop, a series from Yan fixing multi-mds behavior in cephfs, and then a sprinkling of bug fixes all around. Some warnings, sleeping while atomic, a null dereference, and cleanups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (36 commits)
  libceph: fix invalid unsigned->signed conversion for timespec encoding
  libceph: call r_unsafe_callback when unsafe reply is received
  ceph: fix race between cap issue and revoke
  ceph: fix cap revoke race
  ceph: fix pending vmtruncate race
  ceph: avoid accessing invalid memory
  libceph: Fix NULL pointer dereference in auth client code
  ceph: Reconstruct the func ceph_reserve_caps.
  ceph: Free mdsc if alloc mdsc->mdsmap failed.
  ceph: remove sb_start/end_write in ceph_aio_write.
  ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.
  ceph: fix sleeping function called from invalid context.
  ceph: move inode to proper flushing list when auth MDS changes
  rbd: fix a couple warnings
  ceph: clear migrate seq when MDS restarts
  ceph: check migrate seq before changing auth cap
  ceph: fix race between page writeback and truncate
  ceph: reset iov_len when discarding cap release messages
  ceph: fix cap release race
  libceph: fix truncate size calculation
  ...
|
#
84d08fa8 |
| 05-Jul-2013 |
Al Viro <viro@zeniv.linux.org.uk> |
helper for reading ->d_count

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
b415bf4f |
| 01-Jul-2013 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: fix pending vmtruncate race

The locking order for pending vmtruncate is wrong, it can lead to following race:

        write                   wmtruncate work
------------------------    ----------------------
lock i_mutex
check i_truncate_pending    check i_truncate_pending
truncate_inode_pages()      lock i_mutex (blocked)
copy data to page cache
unlock i_mutex
                            truncate_inode_pages()

The fix is take i_mutex before calling __ceph_do_pending_vmtruncate()

Fixes: http://tracker.ceph.com/issues/5453
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|