#
4670c46d |
| 01-Feb-2008 |
Joel Becker <joel.becker@oracle.com> |
ocfs2: Introduce the new ocfs2_cluster_connect/disconnect() API.
This step introduces a cluster stack agnostic API for initializing and exiting. fs/ocfs2/dlmglue.c no longer uses o2cb/o2dlm knowled
ocfs2: Introduce the new ocfs2_cluster_connect/disconnect() API.
This step introduces a cluster stack agnostic API for initializing and exiting. fs/ocfs2/dlmglue.c no longer uses o2cb/o2dlm knowledge to connect to the stack. It is all handled in stackglue.c.
heartbeat.c no longer needs to know how it gets called. ocfs2_do_node_down() is now a clean recovery trigger.
The big gotcha is the ordering of initializations and de-initializations done underneath ocfs2_cluster_connect(). ocfs2_dlm_init() used to do all o2dlm initialization in one block. Thus, the o2dlm functionality of ocfs2_cluster_connect() is very straightforward. ocfs2_dlm_shutdown(), however, did a few things between de-registration of the eviction callback and actually shutting down the domain. Now de-registration and shutdown of the domain are wrapped within the single ocfs2_cluster_disconnect() call. I've checked the code paths to make sure we can safely tear down things in ocfs2_dlm_shutdown() before calling ocfs2_cluster_disconnect(). The filesystem has already set itself to ignore the callback.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
show more ...
|
#
8f2c9c1b |
| 01-Feb-2008 |
Joel Becker <joel.becker@oracle.com> |
ocfs2: Create the lock status block union.
Wrap the lock status block (lksb) in a union. Later we will add a union element for the fs/dlm lksb. Create accessors for the status and lvb fields.
Oth
ocfs2: Create the lock status block union.
Wrap the lock status block (lksb) in a union. Later we will add a union element for the fs/dlm lksb. Create accessors for the status and lvb fields.
Other than a debugging function, dlmglue.c does not directly reference the o2dlm locking path anymore.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
show more ...
|
#
386a2ef8 |
| 01-Feb-2008 |
Joel Becker <joel.becker@oracle.com> |
ocfs2: New slot map format
The old slot map had a few limitations:
- It was limited to one block, so the maximum slot count was 255. - Each slot was signed 16bits, limiting node numbers to INT16_MA
ocfs2: New slot map format
The old slot map had a few limitations:
- It was limited to one block, so the maximum slot count was 255. - Each slot was signed 16bits, limiting node numbers to INT16_MAX. - An empty slot was marked by the magic 0xFFFF (-1).
The new slot map format provides 32bit node numbers (UINT32_MAX), a separate space to mark a slot in use, and extra room to grow. The slot map is now bounded by i_size, not a block.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
show more ...
|
#
fc881fa0 |
| 01-Feb-2008 |
Joel Becker <joel.becker@oracle.com> |
ocfs2: De-magic the in-memory slot map.
The in-memory slot map uses the same magic as the on-disk one. There is a special value to mark a slot as invalid. It relies on the size of certain types an
ocfs2: De-magic the in-memory slot map.
The in-memory slot map uses the same magic as the on-disk one. There is a special value to mark a slot as invalid. It relies on the size of certain types and so on.
Write a new in-memory map that keeps validity as a separate field. Outside of the I/O functions, OCFS2_INVALID_SLOT now means what it is supposed to. It also is no longer tied to the type size.
This also means that only the I/O functions refer to 16bit quantities.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
show more ...
|
#
553abd04 |
| 01-Feb-2008 |
Joel Becker <joel.becker@oracle.com> |
ocfs2: Change the recovery map to an array of node numbers.
The old recovery map was a bitmap of node numbers. This was sufficient for the maximum node number of 254. Going forward, we want node n
ocfs2: Change the recovery map to an array of node numbers.
The old recovery map was a bitmap of node numbers. This was sufficient for the maximum node number of 254. Going forward, we want node numbers to be UINT32. Thus, we need a new recovery map.
Note that we can't keep track of slots here. We must write down the node number to recovery *before* we get the locks needed to convert a node number into a slot number.
The recovery map is now an array of unsigned ints, max_slots in size. It moves to journal.c with the rest of recovery.
Because it needs to be initialized, we move all of recovery initialization into a new function, ocfs2_recovery_init(). This actually cleans up ocfs2_initialize_super() a little as well. Following on, recovery cleaup becomes part of ocfs2_recovery_exit().
A number of node map functions are rendered obsolete and are removed.
Finally, waiting on recovery is wrapped in a function rather than naked checks on the recovery_event. This is a cleanup from Mark.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
show more ...
|
#
d85b20e4 |
| 01-Feb-2008 |
Joel Becker <joel.becker@oracle.com> |
ocfs2: Make ocfs2_slot_info private.
Just use osb_lock around the ocfs2_slot_info data. This allows us to take the ocfs2_slot_info structure private in slot_info.c. All access is now via accessors
ocfs2: Make ocfs2_slot_info private.
Just use osb_lock around the ocfs2_slot_info data. This allows us to take the ocfs2_slot_info structure private in slot_info.c. All access is now via accessors.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
show more ...
|
#
8b5f6883 |
| 08-Feb-2008 |
Marcin Slusarz <marcin.slusarz@gmail.com> |
byteorder: move le32_add_cpu & friends from OCFS2 to core
This patchset moves le*_add_cpu and be*_add_cpu functions from OCFS2 to core header (1st), converts ext3 filesystem to this API (2nd) and re
byteorder: move le32_add_cpu & friends from OCFS2 to core
This patchset moves le*_add_cpu and be*_add_cpu functions from OCFS2 to core header (1st), converts ext3 filesystem to this API (2nd) and replaces XFS different named functions with new ones (3rd).
There are many places where these functions will be useful. Just look at: grep -r 'cpu_to_[ble12346]*([ble12346]*_to_cpu.*[-+]' linux-src/ Patch for ext3 is an example how conversions will probably look like.
This patch:
- move inline functions which add native byte order variable to little/big endian variable to core header * le16_add_cpu(__le16 *var, u16 val) * le32_add_cpu(__le32 *var, u32 val) * le64_add_cpu(__le64 *var, u64 val) * be32_add_cpu(__be32 *var, u32 val) - add for completeness: * be16_add_cpu(__be16 *var, u16 val) * be64_add_cpu(__be64 *var, u64 val)
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
d24fbcda |
| 25-Jan-2008 |
Joel Becker <Joel.Becker@oracle.com> |
ocfs2: Negotiate locking protocol versions.
Currently, when ocfs2 nodes connect via TCP, they advertise their compatibility level. If the versions do not match, two nodes cannot speak to each other
ocfs2: Negotiate locking protocol versions.
Currently, when ocfs2 nodes connect via TCP, they advertise their compatibility level. If the versions do not match, two nodes cannot speak to each other and they disconnect. As a result, this provides no forward or backwards compatibility.
This patch implements a simple protocol negotiation at the dlm level by introducing a major/minor version number scheme for entities that communicate. Specifically, o2dlm has a major/minor version for interaction with o2dlm on other nodes, and ocfs2 itself has a major/minor version for interacting with the filesystem on other nodes.
This will allow rolling upgrades of ocfs2 clusters when changes to the locking or network protocols can be done in a backwards compatible manner. In those cases, only the minor number is changed and the negotatied protocol minor is returned from dlm join. In the far less likely event that a required protocol change makes backwards compatibility impossible, we simply bump the major number.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
Revision tags: v2.6.24 |
|
#
7ec373cf |
| 23-Jan-2008 |
Mark Fasheh <mark.fasheh@oracle.com> |
ocfs2: document access rules for blocked_lock_list
ocfs2_super->blocked_lock_list and ocfs2_super->blocked_lock_count have some usage restrictions which aren't immediately obvious to anyone reading
ocfs2: document access rules for blocked_lock_list
ocfs2_super->blocked_lock_list and ocfs2_super->blocked_lock_count have some usage restrictions which aren't immediately obvious to anyone reading the code. It's a good idea to document this so that we avoid making costly mistakes in the future.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
Revision tags: v2.6.24-rc8, v2.6.24-rc7, v2.6.24-rc6 |
|
#
53fc622b |
| 20-Dec-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
[PATCH 2/2] ocfs2: cluster aware flock()
Hook up ocfs2_flock(), using the new flock lock type in dlmglue.c. A new mount option, "localflocks" is added so that users can revert to old functionality a
[PATCH 2/2] ocfs2: cluster aware flock()
Hook up ocfs2_flock(), using the new flock lock type in dlmglue.c. A new mount option, "localflocks" is added so that users can revert to old functionality as need be.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
#
cf8e06f1 |
| 20-Dec-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
[PATCH 1/2] ocfs2: add flock lock type
This adds a new dlmglue lock type which is intended to back flock() requests.
Since these locks are driven from userspace, usage rules are much more liberal t
[PATCH 1/2] ocfs2: add flock lock type
This adds a new dlmglue lock type which is intended to back flock() requests.
Since these locks are driven from userspace, usage rules are much more liberal than the typical Ocfs2 internal cluster lock. As a result, we can't make use of most dlmglue features - lock caching and lock level optimizations in particular. Additionally, userspace is free to deadlock itself, so we have to deal with that in the same way as the rest of the kernel - by allowing a signal to abort a lock request.
In order to keep ocfs2_cluster_lock() complexity down, ocfs2_file_lock() does it's own dlm coordination. We still use the same helper functions though, so duplicated code is kept to a minimum.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
#
2fbe8d1e |
| 20-Dec-2007 |
Sunil Mushran <sunil.mushran@oracle.com> |
ocfs2: Local alloc window size changeable via mount option
Local alloc is a performance optimization in ocfs2 in which a node takes a window of bits from the global bitmap and then uses that for all
ocfs2: Local alloc window size changeable via mount option
Local alloc is a performance optimization in ocfs2 in which a node takes a window of bits from the global bitmap and then uses that for all small local allocations. This window size is fixed to 8MB currently. This patch allows users to specify the window size in MB including disabling it by passing in 0. If the number specified is too large, the fs will use the default value of 8MB.
mount -o localalloc=X /dev/sdX /mntpoint
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
Revision tags: v2.6.24-rc5, v2.6.24-rc4, v2.6.24-rc3 |
|
#
d147b3d6 |
| 07-Nov-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
ocfs2: Support commit= mount option
Mostly taken from ext3. This allows the user to set the jbd commit interval, in seconds. The default of 5 seconds stays the same, but now users can easily increas
ocfs2: Support commit= mount option
Mostly taken from ext3. This allows the user to set the jbd commit interval, in seconds. The default of 5 seconds stays the same, but now users can easily increase the commit interval. Typically, this would be increased in order to benefit performance at the expense of data-safety.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
Revision tags: v2.6.24-rc2, v2.6.24-rc1, v2.6.23, v2.6.23-rc9, v2.6.23-rc8 |
|
#
34d024f8 |
| 24-Sep-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
ocfs2: Remove mount/unmount votes
The node maps that are set/unset by these votes are no longer relevant, thus we can remove the mount and umount votes. Since those are the last two remaining votes,
ocfs2: Remove mount/unmount votes
The node maps that are set/unset by these votes are no longer relevant, thus we can remove the mount and umount votes. Since those are the last two remaining votes, we can also remove the entire vote infrastructure.
The vote thread has been renamed to the downconvert thread, and the small amount of functionality related to managing it has been moved into fs/ocfs2/dlmglue.c. All references to votes have been removed or updated.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
Revision tags: v2.6.23-rc7, v2.6.23-rc6 |
|
#
15b1e36b |
| 07-Sep-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
ocfs2: Structure updates for inline data
Add the disk, network and memory structures needed to support data in inode.
Struct ocfs2_inline_data is defined and embedded in ocfs2_dinode for storing in
ocfs2: Structure updates for inline data
Add the disk, network and memory structures needed to support data in inode.
Struct ocfs2_inline_data is defined and embedded in ocfs2_dinode for storing inline data.
A new inode field, i_dyn_features, is added to facilitate tracking of dynamic inode state. Since it will be used often, we want to mirror it on ocfs2_inode_info, and transfer it via the meta data lvb.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>
show more ...
|
Revision tags: v2.6.23-rc5, v2.6.23-rc4, v2.6.23-rc3, v2.6.23-rc2, v2.6.23-rc1 |
|
#
7c08d70c |
| 20-Jul-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
ocfs2: Fix some casting errors related to file writes
ocfs2_align_clusters_to_page_index() needs to cast the clusters shift to pgoff_t and ocfs2_file_buffered_write() needs loff_t when calculating d
ocfs2: Fix some casting errors related to file writes
ocfs2_align_clusters_to_page_index() needs to cast the clusters shift to pgoff_t and ocfs2_file_buffered_write() needs loff_t when calculating destination start for memcpy.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
Revision tags: v2.6.22, v2.6.22-rc7, v2.6.22-rc6 |
|
#
328d5752 |
| 18-Jun-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
ocfs2: btree changes for unwritten extents
Writes to a region marked as unwritten might result in a record split or merge. We can support splits by making minor changes to the existing insert code.
ocfs2: btree changes for unwritten extents
Writes to a region marked as unwritten might result in a record split or merge. We can support splits by making minor changes to the existing insert code. Merges require left rotations which mostly re-use right rotation support functions.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
#
baf4661a |
| 18-Jun-2007 |
Sunil Mushran <Sunil.Mushran@oracle.com> |
ocfs2: Add "preferred slot" mount option
ocfs2 will attempt to assign the node the slot# provided in the mount option. Failure to assign the preferred slot is not an error. This small feature can be
ocfs2: Add "preferred slot" mount option
ocfs2 will attempt to assign the node the slot# provided in the mount option. Failure to assign the preferred slot is not an error. This small feature can be useful for automated testing.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
Revision tags: v2.6.22-rc5, v2.6.22-rc4, v2.6.22-rc3, v2.6.22-rc2, v2.6.22-rc1 |
|
#
1ca1a111 |
| 27-Apr-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
ocfs2: fix sparse warnings in fs/ocfs2
None of these are actually harmful, but the noise makes looking for real problems difficult.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
|
Revision tags: v2.6.21, v2.6.21-rc7, v2.6.21-rc6, v2.6.21-rc5, v2.6.21-rc4, v2.6.21-rc3, v2.6.21-rc2, v2.6.21-rc1 |
|
#
60b11392 |
| 16-Feb-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
ocfs2: zero tail of sparse files on truncate
Since we don't zero on extend anymore, truncate needs to be fixed up to zero the part of a file between i_size and and end of it's cluster. Otherwise a s
ocfs2: zero tail of sparse files on truncate
Since we don't zero on extend anymore, truncate needs to be fixed up to zero the part of a file between i_size and and end of it's cluster. Otherwise a subsequent extend could expose bad data.
This introduced a new helper, which can be used in ocfs2_write().
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
#
9517bac6 |
| 09-Feb-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
ocfs2: teach ocfs2_file_aio_write() about sparse files
Unfortunately, ocfs2 can no longer make use of generic_file_aio_write_nlock() because allocating writes will require zeroing of pages adjacent
ocfs2: teach ocfs2_file_aio_write() about sparse files
Unfortunately, ocfs2 can no longer make use of generic_file_aio_write_nlock() because allocating writes will require zeroing of pages adjacent to the I/O for cluster sizes greater than page size.
Implement a custom file write here, which can order page locks for zeroing. This also has the advantage that cluster locks can easily be ordered outside of the page locks.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
Revision tags: v2.6.20, v2.6.20-rc7, v2.6.20-rc6 |
|
#
363041a5 |
| 17-Jan-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
ocfs2: temporarily remove extent map caching
The code in extent_map.c is not prepared to deal with a subtree being rotated between lookups. This can happen when filling holes in sparse files. Instea
ocfs2: temporarily remove extent map caching
The code in extent_map.c is not prepared to deal with a subtree being rotated between lookups. This can happen when filling holes in sparse files. Instead of a lengthy patch to update the code (which would likely lose the benefit of caching subtree roots), we remove most of the algorithms and implement a simple path based lookup. A less ambitious extent caching scheme will be added in a later patch.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
#
dcd0538f |
| 16-Jan-2007 |
Mark Fasheh <mark.fasheh@oracle.com> |
ocfs2: sparse b-tree support
Introduce tree rotations into the b-tree code. This will allow ocfs2 to support sparse files. Much of the added code is designed to be generic (in the ocfs2 sense) so th
ocfs2: sparse b-tree support
Introduce tree rotations into the b-tree code. This will allow ocfs2 to support sparse files. Much of the added code is designed to be generic (in the ocfs2 sense) so that it can later be re-used to implement large extended attributes.
This patch only adds the rotation code and does minimal updates to callers of the extent api.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
Revision tags: v2.6.20-rc5, v2.6.20-rc4, v2.6.20-rc3, v2.6.20-rc2, v2.6.20-rc1 |
|
#
c271c5c2 |
| 05-Dec-2006 |
Sunil Mushran <Sunil.Mushran@oracle.com> |
ocfs2: local mounts
This allows users to format an ocfs2 file system with a special flag, OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT. When the file system sees this flag, it will not use any cluster service
ocfs2: local mounts
This allows users to format an ocfs2 file system with a special flag, OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT. When the file system sees this flag, it will not use any cluster services, nor will it require a cluster configuration, thus acting like a 'local' file system.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|
Revision tags: v2.6.19, v2.6.19-rc6 |
|
#
7f1a37e3 |
| 15-Nov-2006 |
Tiger Yang <tiger.yang@oracle.com> |
ocfs2: core atime update functions
This patch adds the core routines for updating atime in ocfs2.
Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.co
ocfs2: core atime update functions
This patch adds the core routines for updating atime in ocfs2.
Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
show more ...
|