#
8c3bf8a0 |
| 27-Oct-2008 |
Eric Sandeen <sandeen@redhat.com> |
merge ext4_claim_free_blocks & ext4_has_free_blocks
Mingming pointed out that ext4_claim_free_blocks & ext4_has_free_blocks are largely cut & pasted; they can be collapsed/merged as follows.
Signed
merge ext4_claim_free_blocks & ext4_has_free_blocks
Mingming pointed out that ext4_claim_free_blocks & ext4_has_free_blocks are largely cut & pasted; they can be collapsed/merged as follows.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.28-rc2, v2.6.28-rc1 |
|
#
01436ef2 |
| 17-Oct-2008 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Remove unused mount options: nomballoc, mballoc, nocheck
These mount options don't actually do anything any more, so remove them.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
#
5bf5683a |
| 10-Oct-2008 |
Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> |
ext4: add an option to control error handling on file data
If the journal doesn't abort when it gets an IO error in file data blocks, the file data corruption will spread silently. Because most of
ext4: add an option to control error handling on file data
If the journal doesn't abort when it gets an IO error in file data blocks, the file data corruption will spread silently. Because most of applications and commands do buffered writes without fsync(), they don't notice the IO error. It's scary for mission critical systems. On the other hand, if the journal aborts whenever it gets an IO error in file data blocks, the system will easily become inoperable. So this patch introduces a filesystem option to determine whether it aborts the journal or just call printk() when it gets an IO error in file data.
If you mount an ext4 fs with data_err=abort option, it aborts on file data write error. If you mount it with data_err=ignore, it doesn't abort, just call printk(). data_err=ignore is the default.
Here is the corresponding patch of the ext3 version: http://kerneltrap.org/mailarchive/linux-kernel/2008/9/9/3239374
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.27 |
|
#
6873fa0d |
| 06-Oct-2008 |
Eric Sandeen <sandeen@redhat.com> |
Hook ext4 to the vfs fiemap interface.
ext4_ext_walk_space() was reinstated to be used for iterating over file extents with a callback; it is used by the ext4 fiemap implementation.
Signed-off-by:
Hook ext4 to the vfs fiemap interface.
ext4_ext_walk_space() was reinstated to be used for iterating over file extents with a callback; it is used by the ext4 fiemap implementation.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org
show more ...
|
#
c2ea3fde |
| 10-Oct-2008 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Remove old legacy block allocator
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
#
240799cd |
| 09-Oct-2008 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Use readahead when reading an inode from the inode table
With modern hard drives, reading 64k takes roughly the same time as reading a 4k block. So request readahead for adjacent inode table
ext4: Use readahead when reading an inode from the inode table
With modern hard drives, reading 64k takes roughly the same time as reading a 4k block. So request readahead for adjacent inode table blocks to reduce the time it takes when iterating over directories (especially when doing this in htree sort order) in a cold cache case. With this patch, the time it takes to run "git status" on a kernel tree after flushing the caches via "echo 3 > /proc/sys/vm/drop_caches" is reduced by 21%.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.27-rc9, v2.6.27-rc8 |
|
#
5e8814f2 |
| 23-Sep-2008 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Combine proc file handling into a single set of functions
Previously mballoc created a separate set of functions for each proc file. This combines the tunables into a single set of functions
ext4: Combine proc file handling into a single set of functions
Previously mballoc created a separate set of functions for each proc file. This combines the tunables into a single set of functions which gets used for all of the per-superblock proc files, saving approximately 2k of compiled object code.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
9f6200bb |
| 23-Sep-2008 |
Theodore Ts'o <tytso@mit.edu> |
ext4: move /proc setup and teardown out of mballoc.c
...and into the core setup/teardown code in fs/ext4/super.c so that other parts of ext4 can define tuning parameters.
Signed-off-by: "Theodore T
ext4: move /proc setup and teardown out of mballoc.c
...and into the core setup/teardown code in fs/ext4/super.c so that other parts of ext4 can define tuning parameters.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.27-rc7 |
|
#
8eea80d5 |
| 13-Sep-2008 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Renumber EXT4_IOC_MIGRATE
Pick an ioctl number for EXT4_IOC_MIGRATE that won't conflict with other ext4 ioctl's. Since there haven't been any major userspace users of this ioctl, we can affor
ext4: Renumber EXT4_IOC_MIGRATE
Pick an ioctl number for EXT4_IOC_MIGRATE that won't conflict with other ext4 ioctl's. Since there haven't been any major userspace users of this ioctl, we can afford to change this now, to avoid potential problems later.
Also, reorder the ioctl numbers in ext4.h to avoid this sort of mistake in the future.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
4db46fc2 |
| 08-Oct-2008 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
ext4: hook the ext3 migration interface to the EXT4_IOC_SETFLAGS ioctl
This patch hooks the ext3 to ext4 migrate interface to EXT4_IOC_SETFLAGS ioctl. The userspace interface is via chattr +e. We o
ext4: hook the ext3 migration interface to the EXT4_IOC_SETFLAGS ioctl
This patch hooks the ext3 to ext4 migrate interface to EXT4_IOC_SETFLAGS ioctl. The userspace interface is via chattr +e. We only allow setting extent flags. Clearing extent flag (migrating from ext4 to ext3) is not supported.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
2a43a878 |
| 13-Sep-2008 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
ext4: elevate write count for migrate ioctl
The migrate ioctl writes to the filsystem, so we need to elevate the write count.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signe
ext4: elevate write count for migrate ioctl
The migrate ioctl writes to the filsystem, so we need to elevate the write count.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
cf17fea6 |
| 13-Sep-2008 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
ext4: Properly update i_disksize.
With delayed allocation we use i_data_sem to update i_disksize. We need to update i_disksize only if the new size specified is greater than the current value and w
ext4: Properly update i_disksize.
With delayed allocation we use i_data_sem to update i_disksize. We need to update i_disksize only if the new size specified is greater than the current value and we need to make sure we don't race with other i_disksize update. With delayed allocation we will switch to the write_begin function for non-delayed allocation if we are low on free blocks. This means the write_begin function for non-delayed allocation also needs to use the same locking.
We also need to check and update i_disksize even if the new size is less that inode.i_size because of delayed allocation.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
5c791616 |
| 08-Oct-2008 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
ext4: Signed arithmetic fix
This patch converts some usage of ext4_fsblk_t to s64. This is needed so that some of the sign conversion works as expected in if loops.
Signed-off-by: Aneesh Kumar K.V
ext4: Signed arithmetic fix
This patch converts some usage of ext4_fsblk_t to s64. This is needed so that some of the sign conversion works as expected in if loops.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
a30d542a |
| 09-Oct-2008 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
ext4: Make sure all the block allocation paths reserve blocks
With delayed allocation we need to make sure block are reserved before we attempt to allocate them. Otherwise we get block allocation fa
ext4: Make sure all the block allocation paths reserve blocks
With delayed allocation we need to make sure block are reserved before we attempt to allocate them. Otherwise we get block allocation failure (ENOSPC) during writepages which cannot be handled. This would mean silent data loss (We do a printk stating data will be lost). This patch updates the DIO and fallocate code path to do block reservation before block allocation. This is needed to make sure parallel DIO and fallocate request doesn't take block out of delayed reserve space.
When free blocks count go below a threshold we switch to a slow patch which looks at other CPU's accumulated percpu counter values.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.27-rc6 |
|
#
af5bc92d |
| 08-Sep-2008 |
Theodore Ts'o <tytso@mit.edu> |
ext4: Fix whitespace checkpatch warnings/errors
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
Revision tags: v2.6.27-rc5, v2.6.27-rc4 |
|
#
f3bd1f3f |
| 19-Aug-2008 |
Mingming Cao <cmm@us.ibm.com> |
ext4: journal credits reservation fixes for DIO, fallocate
DIO and fallocate credit calculation is different than writepage, as they do start a new journal right for each call to ext4_get_blocks_wra
ext4: journal credits reservation fixes for DIO, fallocate
DIO and fallocate credit calculation is different than writepage, as they do start a new journal right for each call to ext4_get_blocks_wrap(). This patch uses the helper function in DIO and fallocate case, passing a flag indicating that the modified data are contigous thus could account less indirect/index blocks.
This patch also fixed the journal credit reservation for direct I/O (DIO). Previously the estimated credits for DIO only was calculated for non-extent files, which was not enough if the file is extent-based.
Also fixed was fallocate double-counting credits for modifying the the superblock.
Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
a02908f1 |
| 19-Aug-2008 |
Mingming Cao <cmm@us.ibm.com> |
ext4: journal credits calulation cleanup and fix for non-extent writepage
When considering how many journal credits are needed for modifying a chunk of data, we need to account for the super block,
ext4: journal credits calulation cleanup and fix for non-extent writepage
When considering how many journal credits are needed for modifying a chunk of data, we need to account for the super block, inode block, quota blocks and xattr block, indirect/index blocks, also, group bitmap and group descriptor blocks for new allocation (including data and indirect/index blocks). There are many places in ext4 do the calculation on their own and often missed one or two meta blocks, and often they assume single block allocation, and did not considering the multile chunk of allocation case.
This patch is trying to cleanup current journal credit code, provides some common helper funtion to calculate the journal credits, to be used for writepage, writepages, DIO, fallocate, migration, defrag, and for both nonextent and extent files.
This patch modified the writepage/write_begin credit caculation for nonextent files, to use the new helper function. It also fixed the problem that writepage on nonextent files did not consider the case blocksize <pagesize, thus could possibelly need multiple block allocation in a single transaction.
Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.27-rc3, v2.6.27-rc2, v2.6.27-rc1 |
|
#
12219aea |
| 17-Jul-2008 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
ext4: Cleanup the block reservation code path
The truncate patch should not use the i_allocated_meta_blocks value. So add seperate functions to be used in the truncate and alloc path. We also need t
ext4: Cleanup the block reservation code path
The truncate patch should not use the i_allocated_meta_blocks value. So add seperate functions to be used in the truncate and alloc path. We also need to release the meta-data block that we reserved for the blocks that we are truncating.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
Revision tags: v2.6.26 |
|
#
3e3398a0 |
| 11-Jul-2008 |
Mingming Cao <cmm@us.ibm.com> |
ext4: delayed allocation i_blocks fix for stat
Right now i_blocks is not getting updated until the blocks are actually allocaed on disk. This means with delayed allocation, right after files are co
ext4: delayed allocation i_blocks fix for stat
Right now i_blocks is not getting updated until the blocks are actually allocaed on disk. This means with delayed allocation, right after files are copied, "ls -sF" shoes the file as taking 0 blocks on disk. "du" also shows the files taking zero space, which is highly confusing to the user.
Since delayed allocation already keeps track of per-inode total number of blocks that are subject to delayed allocation, this patch fix this by using that to adjust the value returned by stat(2). When real block allocation is done, the i_blocks will get updated. Since the reserved blocks for delayed allocation will be decreased, this will be keep value returned by stat(2) consistent.
Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
d2a17637 |
| 14-Jul-2008 |
Mingming Cao <cmm@us.ibm.com> |
ext4: delayed allocation ENOSPC handling
This patch does block reservation for delayed allocation, to avoid ENOSPC later at page flush time.
Blocks(data and metadata) are reserved at da_write_begin
ext4: delayed allocation ENOSPC handling
This patch does block reservation for delayed allocation, to avoid ENOSPC later at page flush time.
Blocks(data and metadata) are reserved at da_write_begin() time, the freeblocks counter is updated by then, and the number of reserved blocks is store in per inode counter. At the writepage time, the unused reserved meta blocks are returned back. At unlink/truncate time, reserved blocks are properly released.
Updated fix from Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> to fix the oldallocator block reservation accounting with delalloc, added lock to guard the counters and also fix the reservation for meta blocks.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
show more ...
|
#
64769240 |
| 11-Jul-2008 |
Alex Tomas <alex@clusterfs.com> |
ext4: Add delayed allocation support in data=writeback mode
Updated with fixes from Mingming Cao <cmm@us.ibm.com> to unlock and release the page from page cache if the delalloc write_begin failed, a
ext4: Add delayed allocation support in data=writeback mode
Updated with fixes from Mingming Cao <cmm@us.ibm.com> to unlock and release the page from page cache if the delalloc write_begin failed, and properly handle preallocated blocks. Also added a fix to clear buffer_delay in block_write_full_page() after allocating a delayed buffer.
Updated with fixes from Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> to update i_disksize properly and to add bmap support for delayed allocation.
Updated with a fix from Valerie Clement <valerie.clement@bull.net> to avoid filesystem corruption when the filesystem is mounted with the delalloc option and blocksize < pagesize.
Signed-off-by: Alex Tomas <alex@clusterfs.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
show more ...
|
#
cf108bca |
| 11-Jul-2008 |
Jan Kara <jack@suse.cz> |
ext4: Invert the locking order of page_lock and transaction start
This changes are needed to support data=ordered mode handling via inodes. This enables us to get rid of the journal heads and buffe
ext4: Invert the locking order of page_lock and transaction start
This changes are needed to support data=ordered mode handling via inodes. This enables us to get rid of the journal heads and buffer heads for data buffers in the ordered mode. With the changes, during tranasaction commit we writeout the inode pages using the writepages()/writepage(). That implies we take page lock during transaction commit. This can cause a deadlock with the locking order page_lock -> jbd2_journal_start, since the jbd2_journal_start can wait for the journal_commit to happen and the journal_commit now needs to take the page lock. To avoid this dead lock reverse the locking order.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
2e9ee850 |
| 11-Jul-2008 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
ext4: Use page_mkwrite vma_operations to get mmap write notification.
We would like to get notified when we are doing a write on mmap section. This is needed with respect to preallocated area. We sp
ext4: Use page_mkwrite vma_operations to get mmap write notification.
We would like to get notified when we are doing a write on mmap section. This is needed with respect to preallocated area. We split the preallocated area into initialzed extent and uninitialzed extent in the call back. This let us handle ENOSPC better. Otherwise we get ENOSPC in the writepage and that would result in data loss. The changes are also needed to handle ENOSPC when writing to an mmap section of files with holes.
Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|
#
5f21b0e6 |
| 11-Jul-2008 |
Frederic Bohe <frederic.bohe@bull.net> |
ext4: fix online resize with mballoc
Update group infos when updating a group's descriptor. Add group infos when adding a group's descriptor. Refresh cache pages used by mb_alloc when changes occur.
ext4: fix online resize with mballoc
Update group infos when updating a group's descriptor. Add group infos when adding a group's descriptor. Refresh cache pages used by mb_alloc when changes occur. This will probably need modifications when META_BG resizing will be allowed.
Signed-off-by: Frederic Bohe <frederic.bohe@bull.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com>
show more ...
|
#
07031431 |
| 11-Jul-2008 |
Mingming Cao <cmm@us.ibm.com> |
ext4: mballoc avoid use root reserved blocks for non root allocation
mballoc allocation missed check for blocks reserved for root users. Add ext4_has_free_blocks() check before allocation. Also modi
ext4: mballoc avoid use root reserved blocks for non root allocation
mballoc allocation missed check for blocks reserved for root users. Add ext4_has_free_blocks() check before allocation. Also modified ext4_has_free_blocks() to support multiple block allocation request.
Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
show more ...
|