#
1f109d5a |
| 27-Oct-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: make various ext4 functions be static
These functions have no need to be exported beyond file context.
No functions needed to be moved for this commit; just some function declarations changed
ext4: make various ext4 functions be static
These functions have no need to be exported beyond file context.
No functions needed to be moved for this commit; just some function declarations changed to be static and removed from header files.
(A similar patch was submitted by Eric Sandeen, but I wanted to handle code movement in separate patches to make sure code changes didn't accidentally get dropped.)
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
5dabfc78 |
| 27-Oct-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: rename {exit,init}_ext4_*() to ext4_{exit,init}_*()
This is a cleanup to avoid namespace leaks out of fs/ext4
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
#
7360d173 |
| 27-Oct-2010 |
Lukas Czerner <lczerner@redhat.com> |
ext4: Add batched discard support for ext4
Walk through allocation groups and trim all free extents. It can be invoked through FITRIM ioctl on the file system. The main idea is to provide a way to t
ext4: Add batched discard support for ext4
Walk through allocation groups and trim all free extents. It can be invoked through FITRIM ioctl on the file system. The main idea is to provide a way to trim the whole file system if needed, since some SSD's may suffer from performance loss after the whole device was filled (it does not mean that fs is full!).
It search for free extents in allocation groups specified by Byte range start -> start+len. When the free extent is within this range, blocks are marked as used and then trimmed. Afterwards these blocks are marked as free in per-group bitmap.
Since fstrim is a long operation it is good to have an ability to interrupt it by a signal. This was added by Dmitry Monakhov. Thanks Dimitry.
Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
bd2d0210 |
| 27-Oct-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: use bio layer instead of buffer layer in mpage_da_submit_io
Call the block I/O layer directly instad of going through the buffer layer. This should give us much better performance and scalabi
ext4: use bio layer instead of buffer layer in mpage_da_submit_io
Call the block I/O layer directly instad of going through the buffer layer. This should give us much better performance and scalability, as well as lowering our CPU utilization when doing buffered writeback.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
640e9396 |
| 27-Oct-2010 |
Eric Sandeen <sandeen@redhat.com> |
ext4: remove unused ext4_sb_info members
Not that these take up a lot of room, but the structure is long enough as it is, and there's no need to confuse people with these various undocumented & unus
ext4: remove unused ext4_sb_info members
Not that these take up a lot of room, but the structure is long enough as it is, and there's no need to confuse people with these various undocumented & unused structure members...
Signed-off-by: Eric Sandeen <sandeen@redaht.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
e0d10bfa |
| 27-Oct-2010 |
Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> |
ext4: improve llseek error handling for overly large seek offsets
The llseek system call should return EINVAL if passed a seek offset which results in a write error. What this maximum offset should
ext4: improve llseek error handling for overly large seek offsets
The llseek system call should return EINVAL if passed a seek offset which results in a write error. What this maximum offset should be depends on whether or not the huge_file file system feature is set, and whether or not the file is extent based or not.
If the file has no "EXT4_EXTENTS_FL" flag, the maximum size which can be written (write systemcall) is different from the maximum size which can be sought (lseek systemcall).
For example, the following 2 cases demonstrates the differences between the maximum size which can be written, versus the seek offset allowed by the llseek system call:
#1: mkfs.ext3 <dev>; mount -t ext4 <dev> #2: mkfs.ext3 <dev>; tune2fs -Oextent,huge_file <dev>; mount -t ext4 <dev>
Table. the max file size which we can write or seek at each filesystem feature tuning and file flag setting +============+===============================+===============================+ | \ File flag| | | | \ | !EXT4_EXTENTS_FL | EXT4_EXTETNS_FL | |case \| | | +------------+-------------------------------+-------------------------------+ | #1 | write: 2194719883264 | write: -------------- | | | seek: 2199023251456 | seek: -------------- | +------------+-------------------------------+-------------------------------+ | #2 | write: 4402345721856 | write: 17592186044415 | | | seek: 17592186044415 | seek: 17592186044415 | +------------+-------------------------------+-------------------------------+
The differences exist because ext4 has 2 maxbytes which are sb->s_maxbytes (= extent-mapped maxbytes) and EXT4_SB(sb)->s_bitmap_maxbytes (= block-mapped maxbytes). Although generic_file_llseek uses only extent-mapped maxbytes. (llseek of ext4_file_operations is generic_file_llseek which uses sb->s_maxbytes.)
Therefore we create ext4 llseek function which uses 2 maxbytes.
The new own function originates from generic_file_llseek(). If the file flag, "EXT4_EXTENTS_FL" is not set, the function alters inode->i_sb->s_maxbytes into EXT4_SB(inode->i_sb)->s_bitmap_maxbytes.
Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca>
show more ...
|
#
857ac889 |
| 27-Oct-2010 |
Lukas Czerner <lczerner@redhat.com> |
ext4: add interface to advertise ext4 features in sysfs
User-space should have the opportunity to check what features doest ext4 support in each particular copy. This adds easy interface by creating
ext4: add interface to advertise ext4 features in sysfs
User-space should have the opportunity to check what features doest ext4 support in each particular copy. This adds easy interface by creating new "features" directory in sys/fs/ext4/. In that directory files advertising feature names can be created.
Add lazy_itable_init to the feature list.
Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
bfff6873 |
| 27-Oct-2010 |
Lukas Czerner <lczerner@redhat.com> |
ext4: add support for lazy inode table initialization
When the lazy_itable_init extended option is passed to mke2fs, it considerably speeds up filesystem creation because inode tables are not zeroed
ext4: add support for lazy inode table initialization
When the lazy_itable_init extended option is passed to mke2fs, it considerably speeds up filesystem creation because inode tables are not zeroed out. The fact that parts of the inode table are uninitialized is not a problem so long as the block group descriptors, which contain information regarding how much of the inode table has been initialized, has not been corrupted However, if the block group checksums are not valid, e2fsck must scan the entire inode table, and the the old, uninitialized data could potentially cause e2fsck to report false problems.
Hence, it is important for the inode tables to be initialized as soon as possble. This commit adds this feature so that mke2fs can safely use the lazy inode table initialization feature to speed up formatting file systems.
This is done via a new new kernel thread called ext4lazyinit, which is created on demand and destroyed, when it is no longer needed. There is only one thread for all ext4 filesystems in the system. When the first filesystem with inititable mount option is mounted, ext4lazyinit thread is created, then the filesystem can register its request in the request list.
This thread then walks through the list of requests picking up scheduled requests and invoking ext4_init_inode_table(). Next schedule time for the request is computed by multiplying the time it took to zero out last inode table with wait multiplier, which can be set with the (init_itable=n) mount option (default is 10). We are doing this so we do not take the whole I/O bandwidth. When the thread is no longer necessary (request list is empty) it frees the appropriate structures and exits (and can be created later later by another filesystem).
We do not disturb regular inode allocations in any way, it just do not care whether the inode table is, or is not zeroed. But when zeroing, we have to skip used inodes, obviously. Also we should prevent new inode allocations from the group, while zeroing is on the way. For that we take write alloc_sem lock in ext4_init_inode_table() and read alloc_sem in the ext4_claim_inode, so when we are unlucky and allocator hits the group which is currently being zeroed, it just has to wait.
This can be suppresed using the mount option no_init_itable.
Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
fb1813f4 |
| 27-Oct-2010 |
Curt Wohlgemuth <curtw@google.com> |
ext4: use dedicated slab caches for group_info structures
ext4_group_info structures are currently allocated with kmalloc(). With a typical 4K block size, these are 136 bytes each -- meaning they'll
ext4: use dedicated slab caches for group_info structures
ext4_group_info structures are currently allocated with kmalloc(). With a typical 4K block size, these are 136 bytes each -- meaning they'll each consume a 256-byte slab object. On a system with many ext4 large partitions, that's a lot of wasted kernel slab space. (E.g., a single 1TB partition will have about 8000 block groups, using about 2MB of slab, of which nearly 1MB is wasted.)
This patch creates an array of slab pointers created as needed -- depending on the superblock block size -- and uses these slabs to allocate the group info objects.
Google-Bug-Id: 2980809
Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.36, v2.6.36-rc8, v2.6.36-rc7, v2.6.36-rc6, v2.6.36-rc5, v2.6.36-rc4, v2.6.36-rc3, v2.6.36-rc2, v2.6.36-rc1, v2.6.35, v2.6.35-rc6, v2.6.35-rc5, v2.6.35-rc4, v2.6.35-rc3 |
|
#
0930fcc1 |
| 07-Jun-2010 |
Al Viro <viro@zeniv.linux.org.uk> |
convert ext4 to ->evict_inode()
pretty much brute-force...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
0cfc9255 |
| 05-Aug-2010 |
Eric Sandeen <sandeen@redhat.com> |
ext4: re-inline ext4_rec_len_(to|from)_disk functions
commit 3d0518f4, "ext4: New rec_len encoding for very large blocksizes" made several changes to this path, but from a perf perspective, un-inlin
ext4: re-inline ext4_rec_len_(to|from)_disk functions
commit 3d0518f4, "ext4: New rec_len encoding for very large blocksizes" made several changes to this path, but from a perf perspective, un-inlining ext4_rec_len_from_disk() seems most significant. This function is called from ext4_check_dir_entry(), which on a file-creation workload is called extremely often.
I tested this with bonnie:
# bonnie++ -u root -s 0 -f -x 200 -d /mnt/test -n 32
(this does 200 iterations) and got this for the file creations:
ext4 stock: Average = 21206.8 files/s ext4 inlined: Average = 22346.7 files/s (+5%)
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
8b67f04a |
| 01-Aug-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Add mount options in superblock
Allow mount options to be stored in the superblock. Also add default mount option bits for nobarrier, block_validity, discard, and nodelalloc.
Signed-off-by:
ext4: Add mount options in superblock
Allow mount options to be stored in the superblock. Also add default mount option bits for nobarrier, block_validity, discard, and nodelalloc.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
79e83036 |
| 27-Jul-2010 |
Eric Sandeen <sandeen@redhat.com> |
ext4: fix ext4_get_blocks references
ext4_get_blocks got renamed to ext4_map_blocks, but left stale comments and a prototype littered around.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed
ext4: fix ext4_get_blocks references
ext4_get_blocks got renamed to ext4_map_blocks, but left stale comments and a prototype littered around.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
5b3ff237 |
| 27-Jul-2010 |
jiayingz@google.com (Jiaying Zhang) <> |
ext4: move aio completion after unwritten extent conversion
This patch is to be applied upon Christoph's "direct-io: move aio_complete into ->end_io" patch. It adds iocb and result fields to struct
ext4: move aio completion after unwritten extent conversion
This patch is to be applied upon Christoph's "direct-io: move aio_complete into ->end_io" patch. It adds iocb and result fields to struct ext4_io_end_t, so that we can call aio_complete from ext4_end_io_nolock() after the extent conversion has finished.
I have verified with Christoph's aio-dio test that used to fail after a few runs on an original kernel but now succeeds on the patched kernel.
See http://thread.gmane.org/gmane.comp.file-systems.ext4/19659 for details.
Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
89eeddf0 |
| 27-Jul-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Define s_jnl_backup_type in superblock
This has been in use by e2fsprogs for a while; define it to keep the super block fields in sync.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
#
66e61a9e |
| 27-Jul-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Once a day, printk file system error information to dmesg
This allows us to grab any file system error messages by scraping /var/log/messages. This will make it easy for us to do error analys
ext4: Once a day, printk file system error information to dmesg
This allows us to grab any file system error messages by scraping /var/log/messages. This will make it easy for us to do error analysis across the very large number of machines as we deploy ext4 across the fleet.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
1c13d5c0 |
| 27-Jul-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Save error information to the superblock for analysis
Save number of file system errors, and the time function name, line number, block number, and inode number of the first and most recent er
ext4: Save error information to the superblock for analysis
Save number of file system errors, and the time function name, line number, block number, and inode number of the first and most recent errors reported on the file system in the superblock.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
c398eda0 |
| 27-Jul-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Pass line numbers to ext4_error() and friends
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
#
60fd4da3 |
| 27-Jul-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Cleanup ext4_check_dir_entry so __func__ is now implicit Also start passing the line number to ext4_check_dir since we're going to need it in upcoming patch. Signed-off-by: "Theodore
ext4: Cleanup ext4_check_dir_entry so __func__ is now implicit Also start passing the line number to ext4_check_dir since we're going to need it in upcoming patch. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
e29136f8 |
| 29-Jun-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Enhance ext4_grp_locked_error() to take block and function numbers
Also use a macro definition so that __func__ and __LINE__ is implicit.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
#
c67d859e |
| 29-Jun-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: clean up ext4_abort() so __func__ is now implicit
Use a macro definition for ext4_abort() to clean up the .c files a wee bit.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
#
4a9cdec7 |
| 29-Jun-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Add new superblock fields reserved for the Next3 snapshot feature
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
#
206f7ab4 |
| 14-Jun-2010 |
Christoph Hellwig <hch@lst.de> |
ext4: remove vestiges of nobh support
The nobh option was only supported for writeback mode, but given that all write paths actually create buffer heads it effectively was a no-op already.
Signed-o
ext4: remove vestiges of nobh support
The nobh option was only supported for writeback mode, but given that all write paths actually create buffer heads it effectively was a no-op already.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
a0375156 |
| 11-Jun-2010 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Clean up s_dirt handling
We don't need to set s_dirt in most of the ext4 code when journaling is enabled. In ext3/4 some of the summary statistics for # of free inodes, blocks, and directorie
ext4: Clean up s_dirt handling
We don't need to set s_dirt in most of the ext4 code when journaling is enabled. In ext3/4 some of the summary statistics for # of free inodes, blocks, and directories are calculated from the per-block group statistics when the file system is mounted or unmounted. As a result the superblock doesn't have to be updated, either via the journal or by setting s_dirt. There are a few exceptions, most notably when resizing the file system, where the superblock needs to be modified --- and in that case it should be done as a journalled operation if possible, and s_dirt set only in no-journal mode.
This patch will optimize out some unneeded disk writes when using ext4 with a journal.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.35-rc2, v2.6.35-rc1 |
|
#
7ea80859 |
| 26-May-2010 |
Christoph Hellwig <hch@lst.de> |
drop unused dentry argument to ->fsync
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|