5 The cluster MD is a shared-device RAID for a cluster; it supports
9 1. On-disk format
12 Separate write-intent-bitmaps are used for each cluster node.
14 and may not yet have finished. The on-disk layout is::
17 -------------------------------------------------------------------
26 - set the appropriate bit (if not already set)
27 - commit the write to all mirrors
28 - schedule the bit to be cleared after a timeout.
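
The three steps above can be sketched in a few lines of Python. This is only a toy in-memory model, not the md implementation: the class name `WriteIntentBitmap`, the chunk size, the clear delay and the dict-based "mirrors" are all assumptions made for illustration.

```python
import time


class WriteIntentBitmap:
    """Toy write-intent bitmap: one bit per fixed-size chunk (hypothetical layout)."""

    def __init__(self, chunk_sectors=1024, clear_delay=5.0):
        self.chunk_sectors = chunk_sectors   # sectors covered by one bit (assumed)
        self.clear_delay = clear_delay       # seconds before a bit may be cleared (assumed)
        self.bits = set()                    # chunk indices whose bit is currently set
        self.clear_at = {}                   # chunk index -> earliest time to clear it

    def write(self, sector, mirrors, data, now=None):
        now = time.monotonic() if now is None else now
        chunk = sector // self.chunk_sectors
        self.bits.add(chunk)                 # set the bit (no-op if already set)
        for mirror in mirrors:               # commit the write to all mirrors
            mirror[sector] = data
        self.clear_at[chunk] = now + self.clear_delay   # schedule the clear

    def expire(self, now=None):
        """Clear every bit whose timeout has passed."""
        now = time.monotonic() if now is None else now
        for chunk, deadline in list(self.clear_at.items()):
            if now >= deadline:
                self.bits.discard(chunk)
                del self.clear_at[chunk]
```

A later write to the same chunk simply pushes the deadline back, so a busy region keeps its bit set without being rewritten on every request.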
35 2. DLM Locks for management
38 There are three groups of locks for managing the device:
41 -------------------------------------
44 the form bitmap000 for node 1, bitmap001 for node 2 and so on. When a
52 The LVB of the bitmap lock for a particular node records the range
53 of sectors that are being re-synced by that node. No other
58 -------------------------
61 resync, and for metadata superblock updates. This communication is
62 managed through three locks: "token", "message", and "ack", together
65 2.3 new-device management
66 -------------------------
68 A single lock, "no-new-dev", is used to coordinate the addition of
69 new devices - this must be synchronized across the array.
70 Normally all nodes hold this lock in concurrent-read (CR) mode.
75 Messages can be broadcast to all nodes, and the sender waits for all
80 -----------------
88 been updated, and the node must re-read the md superblock. This is
99 time per-node.
105 the array. Message contains an identifier for that device. See
106 below for further details.
112 array. The slot-number of the device is included in the message.
116 A failed device is being re-activated - the assumption
126 ---------------------------
129 are three resources used for the purpose:
141 3.2.3 ack
151 1. receive status - all nodes hold a concurrent-read lock on "ack"::
154 "ack":CR "ack":CR "ack":CR
160 "token":EX "ack":CR "ack":CR
162 "ack":CR
165 received or other events that happened while waiting for the
170 sender down-converts "message" from EX to CW
172 sender tries to get EX of "ack"
176 [ wait until all receivers have *processed* the "message" ]
178 [ triggered by bast of "ack" ]
182 [ wait finishes ]
183 receiver releases "ack"
189 "ack":EX
191 4. triggered by grant of EX on "ack" (indicating all receivers
194 sender down-converts "ack" from EX to CR
203 receiver gets CR of "ack"
207 "ack":CR "ack":CR "ack":CR
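
The "ack" part of the exchange shown above can be modelled with a small simulation. This is a hedged sketch, not the DLM API: `AckLock` and `broadcast` are hypothetical names, only the "ack" resource is modelled ("token" and "message" are left out), and it assumes the sender up-converts its own CR lock to EX.

```python
class AckLock:
    """Toy model of the "ack" resource: tracks holders by lock mode."""

    def __init__(self, nodes):
        self.cr_holders = set(nodes)   # receive status: every node holds CR
        self.ex_holder = None

    def release_cr(self, node):
        self.cr_holders.discard(node)

    def try_ex(self, node):
        # EX conflicts with every other holder, so all other CR locks must be gone.
        if self.cr_holders - {node}:
            return False
        self.cr_holders.discard(node)
        self.ex_holder = node
        return True

    def down_convert_to_cr(self, node):
        assert self.ex_holder == node
        self.ex_holder = None
        self.cr_holders.add(node)

    def acquire_cr(self, node):
        assert self.ex_holder is None
        self.cr_holders.add(node)


def broadcast(sender, receivers, payload):
    ack = AckLock([sender, *receivers])
    processed = []
    assert not ack.try_ex(sender)        # blocked while receivers still hold CR
    for r in receivers:
        processed.append((r, payload))   # receiver processes the message ...
        ack.release_cr(r)                # ... and only then releases "ack"
    granted = ack.try_ex(sender)         # EX grant means everyone has processed it
    ack.down_convert_to_cr(sender)       # sender goes back to CR
    for r in receivers:
        ack.acquire_cr(r)                # receivers re-take CR: receive status again
    return granted, processed, ack.cr_holders
```

The point of the design is that the sender needs no explicit replies: the EX grant on "ack" is itself the acknowledgement from every receiver.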
214 ----------------
220 - acquires the bitmap<number> lock of the failed node
221 - opens the bitmap
222 - reads the bitmap of the failed node
223 - copies the set bits to the local node's bitmap
224 - cleans the bitmap of the failed node
225 - releases bitmap<number> lock of the failed node
226 - initiates resync of the bitmap on the current node
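
The recovery steps listed above can be sketched as follows. This is an illustration only: bitmaps are modelled as Python sets of dirty chunk indices, `threading.Lock` stands in for the DLM bitmap<number> lock, and `recover_failed_node` is a hypothetical name rather than a kernel function.

```python
import threading


def recover_failed_node(failed_slot, local_slot, bitmaps, locks):
    """Merge a failed node's dirty bits into the local bitmap.

    bitmaps: dict slot -> set of dirty chunk indices (toy stand-in for on-disk bitmaps)
    locks:   dict slot -> threading.Lock (toy stand-in for the bitmap<number> lock)
    Returns the sorted chunk indices the local node now has to resync.
    """
    with locks[failed_slot]:                      # acquire the failed node's bitmap lock
        failed_bits = set(bitmaps[failed_slot])   # open and read the failed node's bitmap
        bitmaps[local_slot] |= failed_bits        # copy the set bits to the local node
        bitmaps[failed_slot].clear()              # clean the failed node's bitmap
    # the lock is released on leaving the "with" block
    return sorted(failed_bits)                    # resync is then initiated for these chunks
```

Copying before clearing, both under the lock, means the dirty bits are always recorded in at least one bitmap.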
228 then md_check_recovery -> metadata_update_start/finish,
244 A helper function, ->area_resyncing() can be used to check if a
256 ----------------------
258 For adding a new device, it is necessary that all nodes "see" the new
259 device to be added. For this, the following algorithm is used:
261 1. Node 1 issues mdadm --manage /dev/mdX --add /dev/sdYY which issues
266 4. In userspace, the node searches for the disk, perhaps
267 using blkid -t SUB_UUID=""
273 6. Other nodes drop lock on "no-new-dev" (CR) if device is found
274 7. Node 1 attempts EX lock on "no-new-dev"
277 9. If not (i.e. the "no-new-dev" lock is not obtained), it fails the operation and sends
285 There are 17 call-backs which the md core can make to the cluster
290 ---------------------------
298 -----------------
301 Range is from 0 to nodes-1.
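
As a small illustration, the zero-based slot range can be tied to the bitmap lock names given earlier (bitmap000 for node 1, bitmap001 for node 2). The helper names are hypothetical, and the sketch assumes that the slot number is the index used in the lock name and that names use the three-digit form shown in this document.

```python
def bitmap_lock_name(slot, width=3):
    """Return the bitmap lock name for a zero-based slot (assumed mapping)."""
    if slot < 0:
        raise ValueError("slot numbers start at 0")
    return f"bitmap{slot:0{width}d}"


def all_bitmap_lock_names(nodes):
    """Lock names for every slot in the range 0 .. nodes-1."""
    return [bitmap_lock_name(slot) for slot in range(nodes)]
```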
304 ------------------------
308 end point is always the end of the array.
312 -----------------------------------
323 -------------------------------------------------------------------------------
332 --------------------
338 then the caller will avoid writing or read-balancing in that
342 all areas are resyncing for READ requests. This avoids races
343 between the cluster-filesystem and the cluster-RAID handling
347 ---------------------------------------------------------------
349 These are used to manage the new-disk protocol described above.
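
The lock side of the new-disk protocol can be sketched as a tiny model. It covers only the steps visible in this document (nodes that find the device drop their CR lock on "no-new-dev", then the initiating node attempts EX); the function name `add_device` and the assumption that the initiator converts its own CR lock are illustrative, not taken from the md code.

```python
def add_device(initiator, nodes_that_found_disk, all_nodes):
    """Return True if the initiator would be granted EX on "no-new-dev"."""
    cr_holders = set(all_nodes)            # normally every node holds CR
    for node in all_nodes:
        if node != initiator and node in nodes_that_found_disk:
            cr_holders.discard(node)       # node found the device: drop CR
    cr_holders.discard(initiator)          # initiator converts its own CR (assumed)
    return not cr_holders                  # EX succeeds only if no CR holder remains
```

For example, `add_device(1, {2, 3}, [1, 2, 3])` succeeds, while `add_device(1, {2}, [1, 2, 3])` fails because node 3 still holds its CR lock.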
359 -----------------
365 --------------------
369 bitmap is then used to recover the re-added device.
372 ------------------------------------------------
385 - change array_sectors.