#
95eaefbd |
| 09-Feb-2013 |
Theodore Ts'o <tytso@mit.edu> |
ext4: fix the number of credits needed for acl ops with inline data
Operations which modify extended attributes may need extra journal credits if inline data is used, since there is a chance that so
ext4: fix the number of credits needed for acl ops with inline data
Operations which modify extended attributes may need extra journal credits if inline data is used, since there is a chance that some extended attributes may need to get pushed to an external attribute block.
Changes to reflect this was made in xattr.c, but they were missed in fs/ext4/acl.c. To fix this, abstract the calculation of the number of credits needed for xattr operations to an inline function defined in ext4_jbd2.h, and use it in acl.c and xattr.c.
Also move the function declarations used in inline.c from xattr.h (where they are non-obviously hidden, and caused problems since ext4_jbd2.h needs to use the function ext4_has_inline_data), and move them to ext4.h.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Tao Ma <boyu.mt@taobao.com> Reviewed-by: Jan Kara <jack@suse.cz>
show more ...
|
#
9924a92a |
| 08-Feb-2013 |
Theodore Ts'o <tytso@mit.edu> |
ext4: pass context information to jbd2__journal_start()
So we can better understand what bits of ext4 are responsible for long-running jbd2 handles, use jbd2__journal_start() so we can pass context
ext4: pass context information to jbd2__journal_start()
So we can better understand what bits of ext4 are responsible for long-running jbd2 handles, use jbd2__journal_start() so we can pass context information for logging purposes.
The recommended way for finding the longer-running handles is:
T=/sys/kernel/debug/tracing EVENT=$T/events/jbd2/jbd2_handle_stats echo "interval > 5" > $EVENT/filter echo 1 > $EVENT/enable
./run-my-fs-benchmark
cat $T/trace > /tmp/problem-handles
This will list handles that were active for longer than 20ms. Having longer-running handles is bad, because a commit started at the wrong time could stall for those 20+ milliseconds, which could delay an fsync() or an O_SYNC operation. Here is an example line from the trace file describing a handle which lived on for 311 jiffies, or over 1.2 seconds:
postmark-2917 [000] .... 196.435786: jbd2_handle_stats: dev 254,32 tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1 dirtied_blocks 0
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v3.8-rc7, v3.8-rc6, v3.8-rc5, v3.8-rc4 |
|
#
aebf0243 |
| 12-Jan-2013 |
Wang Shilong <wangsl-fnst@cn.fujitsu.com> |
ext4: use unlikely to improve the efficiency of the kernel
Because the function 'sb_getblk' seldomly fails to return NULL value,it will be better to use 'unlikely' to optimize it.
Signed-off-by: Wa
ext4: use unlikely to improve the efficiency of the kernel
Because the function 'sb_getblk' seldomly fails to return NULL value,it will be better to use 'unlikely' to optimize it.
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
860d21e2 |
| 12-Jan-2013 |
Theodore Ts'o <tytso@mit.edu> |
ext4: return ENOMEM if sb_getblk() fails
The only reason for sb_getblk() failing is if it can't allocate the buffer_head. So ENOMEM is more appropriate than EIO. In addition, make sure that the fi
ext4: return ENOMEM if sb_getblk() fails
The only reason for sb_getblk() failing is if it can't allocate the buffer_head. So ENOMEM is more appropriate than EIO. In addition, make sure that the file system is marked as being inconsistent if sb_getblk() fails.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
show more ...
|
Revision tags: v3.8-rc3, v3.8-rc2, v3.8-rc1 |
|
#
bd9926e8 |
| 11-Dec-2012 |
Theodore Ts'o <tytso@mit.edu> |
ext4: zero out inline data using memset() instead of empty_zero_page
Not all architectures (in particular, sparc64) have empty_zero_page. So instead of copying from empty_zero_page, use memset to cl
ext4: zero out inline data using memset() instead of empty_zero_page
Not all architectures (in particular, sparc64) have empty_zero_page. So instead of copying from empty_zero_page, use memset to clear the inline data by signalling to ext4_xattr_set_entry() via a magic pointer value, EXT4_ZERO_ATTR_VALUE, which is defined by casting -1 to a pointer.
This fixes a build failure on sparc64, and the memset() should be more efficient than using memcpy() anyway.
Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v3.7 |
|
#
0d812f77 |
| 10-Dec-2012 |
Tao Ma <boyu.mt@taobao.com> |
ext4: evict inline data out if we need to strore xattr in inode
Now we that store data in the inode, in case we need to store some xattrs and inode doesn't have enough space, Andreas suggested that
ext4: evict inline data out if we need to strore xattr in inode
Now we that store data in the inode, in case we need to store some xattrs and inode doesn't have enough space, Andreas suggested that we should keep the xattr(metadata) in and data should be pushed out. So this patch does the work.
Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
879b3825 |
| 05-Dec-2012 |
Tao Ma <boyu.mt@taobao.com> |
ext4: export inline xattr functions
The inline data feature will need some inline xattr functions, so export them from fs/ext4/xattr.c so that inline.c can use them.
Signed-off-by: Tao Ma <boyu.mt@
ext4: export inline xattr functions
The inline data feature will need some inline xattr functions, so export them from fs/ext4/xattr.c so that inline.c can use them.
Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v3.7-rc8, v3.7-rc7, v3.7-rc6, v3.7-rc5 |
|
#
37be2f59 |
| 08-Nov-2012 |
Eric Sandeen <sandeen@redhat.com> |
ext4: remove ext4_handle_release_buffer()
ext4_handle_release_buffer() was intended to remove journal write access from a buffer, but it doesn't actually do anything at all other than add a BUFFER_T
ext4: remove ext4_handle_release_buffer()
ext4_handle_release_buffer() was intended to remove journal write access from a buffer, but it doesn't actually do anything at all other than add a BUFFER_TRACE point, but it's not reliably used for that either. Remove all the associated dead code.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
show more ...
|
Revision tags: v3.7-rc4, v3.7-rc3, v3.7-rc2, v3.7-rc1, v3.6, v3.6-rc7, v3.6-rc6, v3.6-rc5, v3.6-rc4, v3.6-rc3, v3.6-rc2, v3.6-rc1, v3.5, v3.5-rc7 |
|
#
41eb70dd |
| 09-Jul-2012 |
Tao Ma <boyu.mt@taobao.com> |
ext4: use s_csum_seed instead of i_csum_seed for xattr block
In xattr block operation, we use h_refcount to indicate whether the xattr block is shared among many inodes. And xattr block csum uses s_
ext4: use s_csum_seed instead of i_csum_seed for xattr block
In xattr block operation, we use h_refcount to indicate whether the xattr block is shared among many inodes. And xattr block csum uses s_csum_seed if it is shared and i_csum_seed if it belongs to one inode. But this has a problem. So consider the block is shared first bewteen inode A and B, and B has some xattr update and CoW the xattr block. When it updates the *old* xattr block(because of the h_refcount change) and calls ext4_xattr_release_block, we has no idea that inode A is the real owner of the *old* xattr block and we can't use the i_csum_seed of inode A either in xattr block csum calculation. And I don't think we have an easy way to find inode A.
So this patch just removes the tricky i_csum_seed and we now uses s_csum_seed every time for the xattr block csum. The corresponding patch for the e2fsprogs will be sent in another patch.
This is spotted by xfstests 117.
Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Darrick J. Wong <djwong@us.ibm.com>
show more ...
|
Revision tags: v3.5-rc6, v3.5-rc5, v3.5-rc4, v3.5-rc3, v3.5-rc2, v3.5-rc1, v3.4, v3.4-rc7, v3.4-rc6 |
|
#
cc8e94fd |
| 29-Apr-2012 |
Darrick J. Wong <djwong@us.ibm.com> |
ext4: Calculate and verify checksums of extended attribute blocks
Calculate and verify the checksums of extended attribute blocks. This only applies to separate EA blocks that are pointed to by ino
ext4: Calculate and verify checksums of extended attribute blocks
Calculate and verify the checksums of extended attribute blocks. This only applies to separate EA blocks that are pointed to by inode->i_file_acl (i.e. external EA blocks); the checksum lives in the EA header.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v3.4-rc5, v3.4-rc4, v3.4-rc3, v3.4-rc2, v3.4-rc1 |
|
#
ace36ad4 |
| 19-Mar-2012 |
Joe Perches <joe@perches.com> |
ext4: add no_printk argument validation, fix fallout
Add argument validation to debug functions. Use ##__VA_ARGS__.
Fix format and argument mismatches.
Signed-off-by: Joe Perches <joe@perches.com>
ext4: add no_printk argument validation, fix fallout
Add argument validation to debug functions. Use ##__VA_ARGS__.
Fix format and argument mismatches.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v3.3, v3.3-rc7, v3.3-rc6, v3.3-rc5 |
|
#
c1bb05a6 |
| 20-Feb-2012 |
Eric Sandeen <sandeen@redhat.com> |
ext4: avoid deadlock on sync-mounted FS w/o journal
Processes hang forever on a sync-mounted ext2 file system that is mounted with the ext4 module (default in Fedora 16).
I can reproduce this relia
ext4: avoid deadlock on sync-mounted FS w/o journal
Processes hang forever on a sync-mounted ext2 file system that is mounted with the ext4 module (default in Fedora 16).
I can reproduce this reliably by mounting an ext2 partition with "-o sync" and opening a new file an that partition with vim. vim will hang in "D" state forever. The same happens on ext4 without a journal.
I am attaching a small patch here that solves this issue for me. In the sync mounted case without a journal, ext4_handle_dirty_metadata() may call sync_dirty_buffer(), which can't be called with buffer lock held.
Also move mb_cache_entry_release inside lock to avoid race fixed previously by 8a2bfdcb ext[34]: EA block reference count racing fix Note too that ext2 fixed this same problem in 2006 with b2f49033 [PATCH] fix deadlock in ext2
Signed-off-by: Martin.Wilck@ts.fujitsu.com [sandeen@redhat.com: move mb_cache_entry_release before unlock, edit commit msg] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
f1b3a2a7 |
| 20-Feb-2012 |
Zheng Liu <wenqing.lz@taobao.com> |
ext4: remove unneeded variable in ext4_xattr_check_block()
We could return directly from ext4_xattr_check_block(). Thus, we shouldn't need to define a 'error' variable.
Signed-off-by: Zheng Liu <we
ext4: remove unneeded variable in ext4_xattr_check_block()
We could return directly from ext4_xattr_check_block(). Thus, we shouldn't need to define a 'error' variable.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v3.3-rc4, v3.3-rc3, v3.3-rc2, v3.3-rc1, v3.2, v3.2-rc7, v3.2-rc6, v3.2-rc5, v3.2-rc4, v3.2-rc3, v3.2-rc2, v3.2-rc1 |
|
#
6d6a4351 |
| 29-Oct-2011 |
Eric Sandeen <sandeen@redhat.com> |
ext4: fix race in xattr block allocation path
Ceph users reported that when using Ceph on ext4, the filesystem would often become corrupted, containing inodes with incorrect i_blocks counters.
I ma
ext4: fix race in xattr block allocation path
Ceph users reported that when using Ceph on ext4, the filesystem would often become corrupted, containing inodes with incorrect i_blocks counters.
I managed to reproduce this with a very hacked-up "streamtest" binary from the Ceph tree.
Ceph is doing a lot of xattr writes, to out-of-inode blocks. There is also another thread which does sync_file_range and close, of the same files. The problem appears to happen due to this race:
sync/flush thread xattr-set thread ----------------- ----------------
do_writepages ext4_xattr_set ext4_da_writepages ext4_xattr_set_handle mpage_da_map_blocks ext4_xattr_block_set set DELALLOC_RESERVE ext4_new_meta_blocks ext4_mb_new_blocks if (!i_delalloc_reserved_flag) vfs_dq_alloc_block ext4_get_blocks down_write(i_data_sem) set i_delalloc_reserved_flag ... up_write(i_data_sem) if (i_delalloc_reserved_flag) vfs_dq_alloc_block_nofail
In other words, the sync/flush thread pops in and sets i_delalloc_reserved_flag on the inode, which makes the xattr thread think that it's in a delalloc path in ext4_new_meta_blocks(), and add the block for a second time, after already having added it once in the !i_delalloc_reserved_flag case in ext4_mb_new_blocks
The real problem is that we shouldn't be using the DELALLOC_RESERVED state flag, and instead we should be passing EXT4_GET_BLOCKS_DELALLOC_RESERVE down to ext4_map_blocks() instead of using an inode state flag. We'll fix this for now with using i_data_sem to prevent this race, but this is really not the right way to fix things.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
show more ...
|
#
66543617 |
| 26-Oct-2011 |
Eric Sandeen <sandeen@redhat.com> |
ext4: use ext4_reserve_inode_write in ext4_xattr_set_handle
ext4_mark_iloc_dirty() says:
* The caller must have previously called ext4_reserve_inode_write(). * Give this, we know that the caller
ext4: use ext4_reserve_inode_write in ext4_xattr_set_handle
ext4_mark_iloc_dirty() says:
* The caller must have previously called ext4_reserve_inode_write(). * Give this, we know that the caller already has write access to iloc->bh.
ext4_xattr_set_handle, however, just open-codes it. May as well use the helper function for consistency.
No bug here, just tidiness.
(Note: on cleanup path, ext4_reserve_inode_write sets the bh to NULL if it returns an error, and brelse() of a null bh is handled gracefully).
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v3.1, v3.1-rc10, v3.1-rc9, v3.1-rc8, v3.1-rc7, v3.1-rc6, v3.1-rc5, v3.1-rc4, v3.1-rc3, v3.1-rc2, v3.1-rc1, v3.0, v3.0-rc7, v3.0-rc6, v3.0-rc5, v3.0-rc4, v3.0-rc3, v3.0-rc2, v3.0-rc1 |
|
#
55f020db |
| 25-May-2011 |
Allison Henderson <achender@linux.vnet.ibm.com> |
ext4: add flag to ext4_has_free_blocks
This patch adds an allocation request flag to the ext4_has_free_blocks function which enables the use of reserved blocks. This will allow a punch hole to proc
ext4: add flag to ext4_has_free_blocks
This patch adds an allocation request flag to the ext4_has_free_blocks function which enables the use of reserved blocks. This will allow a punch hole to proceed even if the disk is full. Punching a hole may require additional blocks to first split the extents.
Because ext4_has_free_blocks is a low level function, the flag needs to be passed down through several functions listed below:
ext4_ext_insert_extent ext4_ext_create_new_leaf ext4_ext_grow_indepth ext4_ext_split ext4_ext_new_meta_block ext4_mb_new_blocks ext4_claim_free_blocks ext4_has_free_blocks
[ext4 punch hole patch series 1/5 v7]
Signed-off-by: Allison Henderson <achender@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Mingming Cao <cmm@us.ibm.com>
show more ...
|
Revision tags: v2.6.39, v2.6.39-rc7, v2.6.39-rc6, v2.6.39-rc5, v2.6.39-rc4, v2.6.39-rc3, v2.6.39-rc2, v2.6.39-rc1 |
|
#
537a0310 |
| 20-Mar-2011 |
Amir Goldstein <amir73il@gmail.com> |
ext4: unify the ext4_handle_release_buffer() api
There are two wrapper functions which do exactly the same thing: ext4_journal_release_buffer(), and ext4_handle_release_buffer(). In addition, ext4_
ext4: unify the ext4_handle_release_buffer() api
There are two wrapper functions which do exactly the same thing: ext4_journal_release_buffer(), and ext4_handle_release_buffer(). In addition, ext4_xattr_block_set() calls jbd2_journal_release_buffer() directly.
Unify all of the code to use ext4_handle_release_buffer(), and get rid of ext4_journal_release_buffer().
Signed-off-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.38, v2.6.38-rc8, v2.6.38-rc7 |
|
#
7dc57615 |
| 21-Feb-2011 |
Peter Huewe <peterhuewe@gmx.de> |
ext4: Fix sparse warning: Using plain integer as NULL pointer
This patch fixes the warning "Using plain integer as NULL pointer", generated by sparse, by replacing the offending 0s with NULL.
Signe
ext4: Fix sparse warning: Using plain integer as NULL pointer
This patch fixes the warning "Using plain integer as NULL pointer", generated by sparse, by replacing the offending 0s with NULL.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.38-rc6, v2.6.38-rc5, v2.6.38-rc4, v2.6.38-rc3, v2.6.38-rc2, v2.6.38-rc1 |
|
#
6e9510b0 |
| 10-Jan-2011 |
Wang Sheng-Hui <crosslonelyover@gmail.com> |
ext2,ext3,ext4: clarify comment for extN_xattr_set_handle
Signed-off-by: Wang Sheng-Hui <crosslonelyover@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
#
eaeef867 |
| 10-Jan-2011 |
Theodore Ts'o <tytso@mit.edu> |
ext4: clean up ext4_xattr_list()'s error code checking and return strategy
Any time you see code that tries to add error codes together, you should want to claw your eyes out...
Signed-off-by: "The
ext4: clean up ext4_xattr_list()'s error code checking and return strategy
Any time you see code that tries to add error codes together, you should want to claw your eyes out...
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.37, v2.6.37-rc8, v2.6.37-rc7, v2.6.37-rc6, v2.6.37-rc5, v2.6.37-rc4, v2.6.37-rc3, v2.6.37-rc2, v2.6.37-rc1 |
|
#
5dabfc78 |
| 27-Oct-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: rename {exit,init}_ext4_*() to ext4_{exit,init}_*()
This is a cleanup to avoid namespace leaks out of fs/ext4
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
Revision tags: v2.6.36, v2.6.36-rc8, v2.6.36-rc7, v2.6.36-rc6, v2.6.36-rc5, v2.6.36-rc4, v2.6.36-rc3, v2.6.36-rc2, v2.6.36-rc1, v2.6.35, v2.6.35-rc6 |
|
#
2aec7c52 |
| 19-Jul-2010 |
Andreas Gruenbacher <agruen@suse.de> |
mbcache: Remove unused features
The mbcache code was written to support a variable number of indexes, but all the existing users use exactly one index. Simplify to code to support only that case.
mbcache: Remove unused features
The mbcache code was written to support a variable number of indexes, but all the existing users use exactly one index. Simplify to code to support only that case.
There are also no users of the cache entry free operation, and none of the users keep extra data in cache entries. Remove those features as well.
Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
Revision tags: v2.6.35-rc5, v2.6.35-rc4 |
|
#
a0375156 |
| 11-Jun-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Clean up s_dirt handling
We don't need to set s_dirt in most of the ext4 code when journaling is enabled. In ext3/4 some of the summary statistics for # of free inodes, blocks, and directorie
ext4: Clean up s_dirt handling
We don't need to set s_dirt in most of the ext4 code when journaling is enabled. In ext3/4 some of the summary statistics for # of free inodes, blocks, and directories are calculated from the per-block group statistics when the file system is mounted or unmounted. As a result the superblock doesn't have to be updated, either via the journal or by setting s_dirt. There are a few exceptions, most notably when resizing the file system, where the superblock needs to be modified --- and in that case it should be done as a journalled operation if possible, and s_dirt set only in no-journal mode.
This patch will optimize out some unneeded disk writes when using ext4 with a journal.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.35-rc3, v2.6.35-rc2, v2.6.35-rc1, v2.6.34 |
|
#
11e27528 |
| 13-May-2010 |
Stephen Hemminger <shemminger@vyatta.com> |
ext4: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
12e9b892 |
| 16-May-2010 |
Dmitry Monakhov <dmonakhov@openvz.org> |
ext4: Use bitops to read/modify i_flags in struct ext4_inode_info
At several places we modify EXT4_I(inode)->i_flags without holding i_mutex (ext4_do_update_inode, ...). These modifications are racy
ext4: Use bitops to read/modify i_flags in struct ext4_inode_info
At several places we modify EXT4_I(inode)->i_flags without holding i_mutex (ext4_do_update_inode, ...). These modifications are racy and we can lose updates to i_flags. So convert handling of i_flags to use bitops which are atomic.
https://bugzilla.kernel.org/show_bug.cgi?id=15792
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|