#
d6a079e8 |
| 11-Mar-2011 |
Dave Chinner <dchinner@redhat.com> |
GFS2: introduce AIL lock
The log lock is currently used to protect the AIL lists and the movements of buffers into and out of them. The lists are self contained and no log specific items outside the
GFS2: introduce AIL lock
The log lock is currently used to protect the AIL lists and the movements of buffers into and out of them. The lists are self contained and no log specific items outside the lists are accessed when starting or emptying the AIL lists.
Hence the operation of the AIL does not require the protection of the log lock so split them out into a new AIL specific lock to reduce the amount of traffic on the log lock. This will also reduce the amount of serialisation that occurs when the gfs2_logd pushes on the AIL to move it forward.
This reduces the impact of log pushing on sequential write throughput.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
#
721a9602 |
| 09-Mar-2011 |
Jens Axboe <jaxboe@fusionio.com> |
block: kill off REQ_UNPLUG
With the plugging now being explicitly controlled by the submitter, callers need not pass down unplugging hints to the block layer. If they want to unplug, it's because th
block: kill off REQ_UNPLUG
With the plugging now being explicitly controlled by the submitter, callers need not pass down unplugging hints to the block layer. If they want to unplug, it's because they manually plugged on their own - in which case, they should just unplug at will.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
show more ...
|
Revision tags: v2.6.38-rc8, v2.6.38-rc7, v2.6.38-rc6, v2.6.38-rc5, v2.6.38-rc4, v2.6.38-rc3, v2.6.38-rc2 |
|
#
bc015cb8 |
| 19-Jan-2011 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Use RCU for glock hash table
This has a number of advantages:
- Reduces contention on the hash table lock - Makes the code smaller and simpler - Should speed up glock dumps when under load
GFS2: Use RCU for glock hash table
This has a number of advantages:
- Reduces contention on the hash table lock - Makes the code smaller and simpler - Should speed up glock dumps when under load - Removes ref count changing in examine_bucket - No longer need hash chain lock in glock_put() in common case
There are some further changes which this enables and which we may do in the future. One is to look at using SLAB_RCU, and another is to look at using a per-cpu counter for the per-sb glock counter, since that is touched twice in the lifetime of each glock (but only used at umount time).
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
show more ...
|
Revision tags: v2.6.38-rc1, v2.6.37, v2.6.37-rc8, v2.6.37-rc7, v2.6.37-rc6, v2.6.37-rc5, v2.6.37-rc4, v2.6.37-rc3, v2.6.37-rc2, v2.6.37-rc1, v2.6.36, v2.6.36-rc8, v2.6.36-rc7, v2.6.36-rc6, v2.6.36-rc5, v2.6.36-rc4, v2.6.36-rc3, v2.6.36-rc2, v2.6.36-rc1, v2.6.35, v2.6.35-rc6, v2.6.35-rc5, v2.6.35-rc4, v2.6.35-rc3, v2.6.35-rc2, v2.6.35-rc1, v2.6.34, v2.6.34-rc7 |
|
#
5e687eac |
| 04-May-2010 |
Benjamin Marzinski <bmarzins@redhat.com> |
GFS2: Various gfs2_logd improvements
This patch contains various tweaks to how log flushes and active item writeback work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reseve now waits for
GFS2: Various gfs2_logd improvements
This patch contains various tweaks to how log flushes and active item writeback work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reseve now waits for gfs2_logd to do the log flushing. Multiple functions were rewritten to remove the need to call gfs2_log_lock(). Instead of using one test to see if gfs2_logd had work to do, there are now seperate tests to check if there are two many buffers in the incore log or if there are two many items on the active items list.
This patch is a port of a patch Steve Whitehouse wrote about a year ago, with some minor changes. Since gfs2_ail1_start always submits all the active items, it no longer needs to keep track of the first ai submitted, so this has been removed. In gfs2_log_reserve(), the order of the calls to prepare_to_wait_exclusive() and wake_up() when firing off the logd thread has been switched. If it called wake_up first there was a small window for a race, where logd could run and return before gfs2_log_reserve was ready to get woken up. If gfs2_logd ran, but did not free up enough blocks, gfs2_log_reserve() would be left waiting for gfs2_logd to eventualy run because it timed out. Finally, gt_logd_secs, which controls how long to wait before gfs2_logd times out, and flushes the log, can now be set on mount with ar_commit.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.34-rc6, v2.6.34-rc5, v2.6.34-rc4, v2.6.34-rc3, v2.6.34-rc2, v2.6.34-rc1, v2.6.33, v2.6.33-rc8, v2.6.33-rc7 |
|
#
e5884636 |
| 04-Feb-2010 |
Dave Chinner <dchinner@redhat.com> |
GFS2: ordered writes are backwards
When we queue data buffers for ordered write, the buffers are added to the head of the ordered write list. When the log needs to push these buffers to disk, it als
GFS2: ordered writes are backwards
When we queue data buffers for ordered write, the buffers are added to the head of the ordered write list. When the log needs to push these buffers to disk, it also walks the list from the head. The result is that the the ordered buffers are submitted to disk in reverse order.
For large writes, this means that whenever the log flushes large streams of reverse sequential order buffers are pushed down into the block layers. The elevators don't handle this particularly well, so IO rates tend to be significantly lower than if the IO was issued in ascending block order.
Queue new ordered buffers to the tail of the ordered buffer list to ensure that IO is dispatched in the order it was submitted. This should significantly improve large sequential write speeds. On a disk capable of 85MB/s, speeds increase from 50MB/s to 65MB/s for noop and from 38MB/s to 50MB/s for cfq.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.33-rc6, v2.6.33-rc5, v2.6.33-rc4, v2.6.33-rc3, v2.6.33-rc2, v2.6.33-rc1, v2.6.32, v2.6.32-rc8, v2.6.32-rc7 |
|
#
0ab7d13f |
| 06-Nov-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Tag all metadata with jid
There are two spare field in the header common to all GFS2 metadata. One is just the right size to fit a journal id in it, and this patch updates the journal code so
GFS2: Tag all metadata with jid
There are two spare field in the header common to all GFS2 metadata. One is just the right size to fit a journal id in it, and this patch updates the journal code so that each time a metadata block is modified, we tag it with the journal id of the node which is performing the modification.
The reason for this is that it should make it much easier to debug issues which arise if we can tell which node was the last to modify a particular metadata block.
Since the field is updated before the block is written into the journal, each journal should only contain metadata which is tagged with its own journal id. The one exception to this is the journal header block, which might have a different node's id in it, if that journal was recovered by another node in the cluster.
Thus each journal will contain a record of which nodes recovered it, via the journal header.
The other field in the metadata header could potentially be used to hold information about what kind of operation was performed, but for the time being we just zero it on each transaction so that if we use it for that in future, we'll know that the information (where it exists) is reliable.
I did consider using the other field to hold the journal sequence number, however since in GFS2's journaling we write the modified data into the journal and not the original data, this gives no information as to what action caused the modification, so I think we can probably come up with a better use for those 64 bits in the future.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.32-rc6, v2.6.32-rc5, v2.6.32-rc4, v2.6.32-rc3, v2.6.32-rc1, v2.6.32-rc2, v2.6.31, v2.6.31-rc9, v2.6.31-rc8, v2.6.31-rc7, v2.6.31-rc6, v2.6.31-rc5, v2.6.31-rc4, v2.6.31-rc3, v2.6.31-rc2, v2.6.31-rc1 |
|
#
63997775 |
| 12-Jun-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Add tracepoints
This patch adds the ability to trace various aspects of the GFS2 filesystem. The trace points are divided into three groups, glocks, logging and bmap. These points have been ch
GFS2: Add tracepoints
This patch adds the ability to trace various aspects of the GFS2 filesystem. The trace points are divided into three groups, glocks, logging and bmap. These points have been chosen because they allow inspection of the major internal functions of GFS2 and they are also generic enough that they are unlikely to need any major changes as the filesystem evolves.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.30, v2.6.30-rc8, v2.6.30-rc7, v2.6.30-rc6, v2.6.30-rc5, v2.6.30-rc4, v2.6.30-rc3, v2.6.30-rc2, v2.6.30-rc1 |
|
#
c969f58c |
| 07-Apr-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Update the rw flags
After Jens recent updates: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a1f242524c3c1f5d40f1c9c343427e34d1aadd6e et al. this is a patch t
GFS2: Update the rw flags
After Jens recent updates: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a1f242524c3c1f5d40f1c9c343427e34d1aadd6e et al. this is a patch to bring gfs2 uptodate with the core code. Also I've managed to squash another call to ll_rw_block() along the way.
There is still one part of the GFS2 I/O paths which are not correctly annotated and that is due to the sharing of the writeback code between the data and metadata address spaces. I would like to change that too, but this patch is still worth doing on its own, I think.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.29, v2.6.29-rc8, v2.6.29-rc7, v2.6.29-rc6, v2.6.29-rc5, v2.6.29-rc4, v2.6.29-rc3, v2.6.29-rc2 |
|
#
f057f6cd |
| 12-Jan-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Merge lock_dlm module into GFS2
This is the big patch that I've been working on for some time now. There are many reasons for wanting to make this change such as: o Reducing overhead by elimi
GFS2: Merge lock_dlm module into GFS2
This is the big patch that I've been working on for some time now. There are many reasons for wanting to make this change such as: o Reducing overhead by eliminating duplicated fields between structures o Simplifcation of the code (reduces the code size by a fair bit) o The locking interface is now the DLM interface itself as proposed some time ago. o Fewer lookups of glocks when processing replies from the DLM o Fewer memory allocations/deallocations for each glock o Scope to do further optimisations in the future (but this patch is more than big enough for now!)
Please note that (a) this patch relates to the lock_dlm module and not the DLM itself, that is still a separate module; and (b) that we retain the ability to build GFS2 as a standalone single node filesystem with out requiring the DLM.
This patch needs a lot of testing, hence my keeping it I restarted my -git tree after the last merge window. That way, this has the maximum exposure before its merged. This is (modulo a few minor bug fixes) the same patch that I've been posting on and off the the last three months and its passed a number of different tests so far.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.29-rc1, v2.6.28, v2.6.28-rc9, v2.6.28-rc8, v2.6.28-rc7, v2.6.28-rc6, v2.6.28-rc5, v2.6.28-rc4, v2.6.28-rc3, v2.6.28-rc2, v2.6.28-rc1, v2.6.27, v2.6.27-rc9, v2.6.27-rc8, v2.6.27-rc7, v2.6.27-rc6, v2.6.27-rc5, v2.6.27-rc4, v2.6.27-rc3, v2.6.27-rc2, v2.6.27-rc1, v2.6.26, v2.6.26-rc9, v2.6.26-rc8, v2.6.26-rc7, v2.6.26-rc6, v2.6.26-rc5, v2.6.26-rc4, v2.6.26-rc3, v2.6.26-rc2, v2.6.26-rc1, v2.6.25, v2.6.25-rc9, v2.6.25-rc8, v2.6.25-rc7, v2.6.25-rc6, v2.6.25-rc5, v2.6.25-rc4, v2.6.25-rc3, v2.6.25-rc2, v2.6.25-rc1 |
|
#
3ad62e87 |
| 28-Jan-2008 |
Bob Peterson <rpeterso@redhat.com> |
[GFS2] Plug an unlikely leak
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
d0109bfa |
| 28-Jan-2008 |
Bob Peterson <rpeterso@redhat.com> |
[GFS2] Only do lo_incore_commit once
This patch is performance related. When we're doing a log flush, I noticed we were calling buf_lo_incore_commit twice: once for data bufs and once for metadata
[GFS2] Only do lo_incore_commit once
This patch is performance related. When we're doing a log flush, I noticed we were calling buf_lo_incore_commit twice: once for data bufs and once for metadata bufs. Since this is the same function and does the same thing in both cases, there should be no reason to call it twice. Since we only need to call it once, we can also make it faster by removing it from the generic "lops" code and making it a stand-along static function.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.24, v2.6.24-rc8, v2.6.24-rc7, v2.6.24-rc6, v2.6.24-rc5, v2.6.24-rc4, v2.6.24-rc3 |
|
#
2bcd610d |
| 08-Nov-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Don't add glocks to the journal
The only reason for adding glocks to the journal was to keep track of which locks required a log flush prior to release. We add a flag to the glock to allow th
[GFS2] Don't add glocks to the journal
The only reason for adding glocks to the journal was to keep track of which locks required a log flush prior to release. We add a flag to the glock to allow this check to be made in a simpler way.
This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64) and means that we can avoid extra work during the journal flush.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.24-rc2, v2.6.24-rc1, v2.6.23, v2.6.23-rc9 |
|
#
9ff8ec32 |
| 28-Sep-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Split gfs2_writepage into three cases
This patch splits gfs2_writepage into separate functions for each of the three cases: writeback, ordered and journalled. As a result it becomes a lot eas
[GFS2] Split gfs2_writepage into three cases
This patch splits gfs2_writepage into separate functions for each of the three cases: writeback, ordered and journalled. As a result it becomes a lot easier to see what each one is doing. The common code is moved into gfs2_writepage_common.
This fixes a performance bug where we were doing more work than strictly required in the ordered write case.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.23-rc8, v2.6.23-rc7 |
|
#
16615be1 |
| 17-Sep-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Clean up journaled data writing
This patch cleans up the code for writing journaled data into the log. It also removes the need to allocate a small "tag" structure for each block written into
[GFS2] Clean up journaled data writing
This patch cleans up the code for writing journaled data into the log. It also removes the need to allocate a small "tag" structure for each block written into the log. Instead we just keep count of the outstanding I/O so that we can be sure that its all been written at the correct time. Another result of this patch is that a number of ll_rw_block() calls have become submit_bh() calls, closing some races at the same time.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.23-rc6 |
|
#
1ad38c43 |
| 03-Sep-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Clean up gfs2_trans_add_revoke()
The following alters gfs2_trans_add_revoke() to take a struct gfs2_bufdata as an argument. This eliminates the memory allocation which was previously required
[GFS2] Clean up gfs2_trans_add_revoke()
The following alters gfs2_trans_add_revoke() to take a struct gfs2_bufdata as an argument. This eliminates the memory allocation which was previously required by making use of the already existing struct gfs2_bufdata. It makes some sanity checks to ensure that the gfs2_bufdata has been removed from all the lists before its recycled as a revoke structure. This saves one memory allocation and one free per revoke structure.
Also as a result, and to simplify the locking, since there is no longer any blocking code in gfs2_trans_add_revoke() we must hold the log lock whenever this function is called. This reduces the amount of times we take and unlock the log lock.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
#
0820ab51 |
| 02-Sep-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Use slab operations for all gfs2_bufdata allocations
The old revoke structure was allocated using kalloc/kfree but there is a slab cache for gfs2_bufdata, so we should use that now that the s
[GFS2] Use slab operations for all gfs2_bufdata allocations
The old revoke structure was allocated using kalloc/kfree but there is a slab cache for gfs2_bufdata, so we should use that now that the structures have been converted.
This is part two of the patch series to merge the revoke and gfs2_bufdata structures.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
#
82e86087 |
| 02-Sep-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Replace revoke structure with bufdata structure
Both the revoke structure and the bufdata structure are quite similar. They are basically small tags which are put on lists. In addition to whi
[GFS2] Replace revoke structure with bufdata structure
Both the revoke structure and the bufdata structure are quite similar. They are basically small tags which are put on lists. In addition to which the revoke structure is always allocated when there is a bufdata structure which is (or can be) freed. As such it should be possible to reduce the number of frees and allocations by using the same structure for both purposes.
This patch is the first step along that path. It replaces existing uses of the revoke structure with the bufdata structure.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
#
d7b616e2 |
| 02-Sep-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Clean up ordered write code
The following patch removes the ordered write processing from databuf_lo_before_commit() and moves it to log.c. This has the effect of greatly simplyfying databuf_
[GFS2] Clean up ordered write code
The following patch removes the ordered write processing from databuf_lo_before_commit() and moves it to log.c. This has the effect of greatly simplyfying databuf_lo_before_commit() and well as potentially making the ordered write code more efficient.
As a side effect of this, its now possible to remove ordered buffers from the ordered buffer list at any time, so we now make use of this in invalidatepage and releasepage to ensure timely release of these buffers.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.23-rc5, v2.6.23-rc4 |
|
#
9b9107a5 |
| 27-Aug-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Move pin/unpin into lops.c, clean up locking
gfs2_pin and gfs2_unpin are only used in lops.c, despite being defined in meta_io.c, so this patch moves them into lops.c and makes them static. A
[GFS2] Move pin/unpin into lops.c, clean up locking
gfs2_pin and gfs2_unpin are only used in lops.c, despite being defined in meta_io.c, so this patch moves them into lops.c and makes them static. At the same time, its possible to clean up the locking in the buf and databuf _lo_add() functions so that we only need to grab the spinlock once. Also we have to move lock_buffer() around the _lo_add() functions since we can't do that in gfs2_pin() any more since we hold the spinlock for the duration of that function.
As a result, the code shrinks by 12 lines and we do far fewer operations when adding buffers to the log. It also makes the code somewhat easier to read & understand.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
#
ec217e0e |
| 22-Aug-2007 |
Bob Peterson <rpeterso@redhat.com> |
[GFS2] Patch to protect sd_log_num_jdata
This is a patch to GFS2 to protect sd_log_num_jdata with the gfs2_log_lock. Without this patch, there is a timing window where you can get hit the following
[GFS2] Patch to protect sd_log_num_jdata
This is a patch to GFS2 to protect sd_log_num_jdata with the gfs2_log_lock. Without this patch, there is a timing window where you can get hit the following assert from function gfs2_log_flush():
gfs2_assert_withdraw(sdp, sdp->sd_log_num_buf + sdp->sd_log_num_jdata == sdp->sd_log_commited_buf + sdp->sd_log_commited_databuf);
I've tested it on my roth cluster and it fixes the problem.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.23-rc3, v2.6.23-rc2 |
|
#
905d2aef |
| 24-Jul-2007 |
Bob Peterson <rpeterso@redhat.com> |
[GFS2] Move some code inside the log lock
This is the first of five patches for bug #248176:
There were still some critical variables being manipulated outside the log_lock spinlock. That usually
[GFS2] Move some code inside the log lock
This is the first of five patches for bug #248176:
There were still some critical variables being manipulated outside the log_lock spinlock. That usually resulted in a hang.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.23-rc1 |
|
#
bdcb8856 |
| 11-Jul-2007 |
Bob Peterson <rpeterso@redhat.com> |
[GFS2] soft lockup detected in databuf_lo_before_commit
This is part 2 of the patch for bug #245832, part 1 of which is already in the git tree.
The problem was that sdp->sd_log_num_databuf was not
[GFS2] soft lockup detected in databuf_lo_before_commit
This is part 2 of the patch for bug #245832, part 1 of which is already in the git tree.
The problem was that sdp->sd_log_num_databuf was not always being protected by the gfs2_log_lock spinlock, but the sd_log_le_databuf (which it is supposed to reflect) was protected. That meant there was a timing window during which gfs2_log_flush called databuf_lo_before_commit and the count didn't match what was really on the linked list in that window. So when it ran out of items on the linked list, it decremented total_dbuf from 0 to -1 and thus never left the "while(total_dbuf)" loop.
The solution is to protect the variable sdp->sd_log_num_databuf so that the value will always match the contents of the linked list, and therefore the number will never go negative, and therefore, the loop will be exited properly.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.22, v2.6.22-rc7, v2.6.22-rc6 |
|
#
773ed1a0 |
| 20-Jun-2007 |
Robert Peterson <rpeterso@redhat.com> |
[GFS2] Addendum to the journaled file/unmount patch
This patch is an addendum to the previous journaled file/unmount patch. It fixes a problem discovered during testing.
Signed-off-by: Bob Peterson
[GFS2] Addendum to the journaled file/unmount patch
This patch is an addendum to the previous journaled file/unmount patch. It fixes a problem discovered during testing.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
#
2332c443 |
| 18-Jun-2007 |
Robert Peterson <rpeterso@redhat.com> |
[GFS2] assertion failure after writing to journaled file, umount
This patch passes all my nasty tests that were causing the code to fail under one circumstance or another. Here is a complete summar
[GFS2] assertion failure after writing to journaled file, umount
This patch passes all my nasty tests that were causing the code to fail under one circumstance or another. Here is a complete summary of all changes from today's git tree, in order of appearance:
1. There are now separate variables for metadata buffer accounting. 2. Variable sd_log_num_hdrs is no longer needed, since the header accounting is taken care of by the reserve/refund sequence. 3. Fixed a tiny grammatical problem in a comment. 4. Added a new function "calc_reserved" to calculate the reserved log space. This isn't entirely necessary, but it has two benefits: First, it simplifies the gfs2_log_refund function greatly. Second, it allows for easier debugging because I could sprinkle the code with calls to this function to make sure the accounting is proper (by adding asserts and printks) at strategic point of the code. 5. In log_pull_tail there apparently was a kludge to fix up the accounting based on a "pull" parameter. The buffer accounting is now done properly, so the kludge was removed. 6. File sync operations were making a call to gfs2_log_flush that writes another journal header. Since that header was unplanned for (reserved) by the reserve/refund sequence, the free space had to be decremented so that when log_pull_tail gets called, the free space is be adjusted properly. (Did I hear you call that a kludge? well, maybe, but a lot more justifiable than the one I removed). 7. In the gfs2_log_shutdown code, it optionally syncs the log by specifying the PULL parameter to log_write_header. I'm not sure this is necessary anymore. It just seems to me there could be cases where shutdown is called while there are outstanding log buffers. 8. In the (data)buf_lo_before_commit functions, I changed some offset values from being calculated on the fly to being constants. That simplified some code and we might as well let the compiler do the calculation once rather than redoing those cycles at run time. 9. This version has my rewritten databuf_lo_add function. This version is much more like its predecessor, buf_lo_add, which makes it easier to understand. Again, this might not be necessary, but it seems as if this one works as well as the previous one, maybe even better, so I decided to leave it in. 10. In databuf_lo_before_commit, a previous data corruption problem was caused by going off the end of the buffer. The proper solution is to have the proper limit in place, rather than stopping earlier. (Thus my previous attempt to fix it is wrong). If you don't wrap the buffer, you're stopping too early and that causes more log buffer accounting problems. 11. In lops.h there are two new (previously mentioned) constants for figuring out the data offset for the journal buffers. 12. There are also two new functions, buf_limit and databuf_limit to calculate how many entries will fit in the buffer. 13. In function gfs2_meta_wipe, it needs to distinguish between pinned metadata buffers and journaled data buffers for proper journal buffer accounting. It can't use the JDATA gfs2_inode flag because it's sometimes passed the "real" inode and sometimes the "metadata inode" and the inode flags will be random bits in a metadata gfs2_inode. It needs to base its decision on which was passed in.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|
Revision tags: v2.6.22-rc5 |
|
#
8fb68595 |
| 12-Jun-2007 |
Robert Peterson <rpeterso@redhat.com> |
[GFS2] Journaled file write/unstuff bug
This patch is for bugzilla bug 283162, which uncovered a number of bugs pertaining to writing to files that have the journaled bit on. These bugs happen most
[GFS2] Journaled file write/unstuff bug
This patch is for bugzilla bug 283162, which uncovered a number of bugs pertaining to writing to files that have the journaled bit on. These bugs happen most often when writing to the meta_fs because the files are always journaled. So operations like gfs2_grow were particularly vulnerable, although many of the problems could be recreated with normal files after setting the journaled bit on. The problems fixed are:
-GFS2 wasn't ever writing unstuffed journaled data blocks to their in-place location on disk. Now it does.
-If you unmounted too quickly after doing IO to a journaled file, GFS2 was crashing because you would discard a buffer whose bufdata was still on the active items list. GFS2 now deals with this gracefully.
-GFS2 was losing track of the bufdata for journaled data blocks, and it wasn't getting freed, causing an error when you tried to unmount the module. GFS2 now frees all the bufdata structures.
-There was a memory corruption occurring because GFS2 wrote twice as many log entries for journaled buffers.
-It was occasionally trying to write journal headers in buffers that weren't currently mapped.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
show more ...
|