==============================
Device-mapper snapshot support
==============================

Device-mapper allows you, without massive data copying:

-  To create snapshots of any block device i.e. mountable, saved states of
   the block device which are also writable without interfering with the
   original content;
-  To create device "forks", i.e. multiple different versions of the
   same data stream.
-  To merge a snapshot of a block device back into the snapshot's origin
   device.

In the first two cases, dm copies only the chunks of data that get
changed and uses a separate copy-on-write (COW) block device for
storage.

For snapshot merge the contents of the COW storage are merged back into
the origin device.


There are three dm targets available:
snapshot, snapshot-origin, and snapshot-merge.

-  snapshot-origin <origin>

which will normally have one or more snapshots based on it.
Reads will be mapped directly to the backing device.
For each write, the
original data will be saved in the <COW device> of each snapshot to keep
its visible content unchanged, at least until the <COW device> fills up.


-  snapshot <origin> <COW device> <persistent?> <chunksize>
   [<# feature args> [<arg>]*]

A snapshot of the <origin> block device is created. Changed chunks of
<chunksize> sectors will be stored on the <COW device>. Writes will
only go to the <COW device>. Reads will come from the <COW device> or
from <origin> for unchanged data. <COW device> will often be
smaller than the origin and if it fills up the snapshot will become
useless and be disabled, returning errors. So it is important to monitor
the amount of free space and expand the <COW device> before it fills up.

<persistent?> is P (Persistent) or N (Not persistent - will not survive
after reboot). O (Overflow) can be added as a persistent store option
to allow userspace to advertise its support for seeing "Overflow" in the
snapshot status. So supported store types are "P", "PO" and "N".

The difference between persistent and transient snapshots is that
transient snapshots need less metadata saved on disk - it can be kept in
memory by the kernel.
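
As an illustration of the table syntax above, the following shell sketch
assembles a snapshot table line and rejects store types other than "P",
"PO" and "N". The function name ``snapshot_table`` is hypothetical, and the
sketch assumes a table that starts at sector 0 and carries no feature
arguments.

```shell
# Hypothetical helper: print a one-line "snapshot" table.
# Arguments: <length in sectors> <origin> <COW device> <store type> <chunksize>
snapshot_table() {
    if [ "$#" -ne 5 ]; then
        echo "usage: snapshot_table <sectors> <origin> <cow> <P|PO|N> <chunksize>" >&2
        return 2
    fi
    # Only the store types named in the text are accepted.
    case "$4" in
        P|PO|N) ;;
        *) echo "unsupported store type: $4" >&2; return 1 ;;
    esac
    # Assumption: the mapping starts at sector 0 of the new device.
    printf '0 %s snapshot %s %s %s %s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example with made-up device numbers and a 16-sector chunk size:
snapshot_table 2097152 254:11 254:12 P 16
```

The resulting line has the form that ``dmsetup create <name> --table`` expects;
whether a given origin and COW device are safe to use this way depends on the
setup, so treat the call as a template rather than a recipe.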

When loading or unloading the snapshot target, the corresponding
snapshot-origin or snapshot-merge target must be suspended. A failure to
suspend the origin target could result in data corruption.

Optional features:

   discard_zeroes_cow - a discard issued to the snapshot device that
   maps to entire chunks will zero the corresponding exception(s) in
   the snapshot's exception store.

   discard_passdown_origin - a discard to the snapshot device is passed
   down to the snapshot-origin's underlying device. This doesn't cause
   copy-out to the snapshot exception store because the snapshot-origin
   target is bypassed.

   The discard_passdown_origin feature depends on the discard_zeroes_cow
   feature being enabled.


-  snapshot-merge <origin> <COW device> <persistent> <chunksize>
   [<# feature args> [<arg>]*]

takes the same table arguments as the snapshot target except it only
works with persistent snapshots. This target assumes the role of the
"snapshot-origin" target and must not be loaded if the "snapshot-origin"
is still present for <origin>.
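
Because the two targets share their table arguments, a snapshot-merge table
can be derived mechanically from a snapshot table. The sketch below does
only that text transformation and refuses transient ("N") stores; the
function name ``merge_table_from_snapshot`` is hypothetical, and the sketch
does not check whether a "snapshot-origin" is still loaded for <origin>.

```shell
# Hypothetical helper: turn a "snapshot" table line into the matching
# "snapshot-merge" line. Fails (exit status 1) for anything that is not
# a persistent snapshot table.
merge_table_from_snapshot() {
    printf '%s\n' "$1" | awk '
        # Field 3 is the target name, field 6 the store type.
        $3 == "snapshot" && ($6 == "P" || $6 == "PO") {
            $3 = "snapshot-merge"
            print
            ok = 1
        }
        END { exit !ok }'
}

# Example with made-up device numbers:
merge_table_from_snapshot "0 2097152 snapshot 254:11 254:12 P 16"
```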

Creates a merging snapshot that takes control of the changed chunks
stored in the <COW device> of an existing snapshot, through a handover
procedure, and merges these chunks back into the <origin>. Once merging
has started (in the background) the <origin> may be opened and the merge
will continue while I/O is flowing to it. Changes to the <origin> are
deferred until the merging snapshot's corresponding chunk(s) have been
merged. Once merging has started the snapshot device, associated with
the "snapshot" target, will return -EIO when accessed.


How snapshot is used by LVM2
============================
When you create the first LVM2 snapshot of a volume, four dm devices are used:

1) a device containing the original mapping table of the source volume;
2) a device used as the <COW device>;
3) a "snapshot" device, combining #1 and #2, which is the visible snapshot
   volume;
4) the "original" volume (which uses the device number used by the original
   source volume), whose table is replaced by a "snapshot-origin" mapping
   from device #1.
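
The four devices can be listed by name with a small helper. This is a
sketch, not LVM2 code: the function name ``lvm_snapshot_dm_names`` is
hypothetical, the ``-real`` and ``-cow`` suffixes are those that appear in
the ``dmsetup table`` output in this document, and the doubling of hyphens
inside volume group and volume names is an assumption about how LVM2 forms
device-mapper names.

```shell
# Hypothetical helper: print the dm names of devices 1) to 4) above.
# Arguments: <volume group> <origin LV> <snapshot LV>
lvm_snapshot_dm_names() (
    # Assumption: LVM2 doubles any "-" inside a VG or LV name so that the
    # single "-" separators in the dm name stay unambiguous.
    esc() { printf '%s\n' "$1" | sed 's/-/--/g'; }
    vg=$(esc "$1")
    base=$(esc "$2")
    snap=$(esc "$3")
    printf '%s\n' "$vg-$base-real" "$vg-$snap-cow" "$vg-$snap" "$vg-$base"
)

lvm_snapshot_dm_names volumeGroup base snap
```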

A fixed naming scheme is used, so with the following commands::

  lvcreate -L 1G -n base volumeGroup
  lvcreate -L 100M --snapshot -n snap volumeGroup/base

we'll have this situation (with volumes in above order)::

  # dmsetup table|grep volumeGroup

  volumeGroup-base-real: 0 2097152 linear 8:19 384
  volumeGroup-snap-cow: 0 204800 linear 8:19 2097536
  volumeGroup-snap: 0 2097152 snapshot 254:11 254:12 P 16
  volumeGroup-base: 0 2097152 snapshot-origin 254:11

  # ls -lL /dev/mapper/volumeGroup-*
  brw-------  1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
  brw-------  1 root root 254, 12 29 ago 18:15 /dev/mapper/volumeGroup-snap-cow
  brw-------  1 root root 254, 13 29 ago 18:15 /dev/mapper/volumeGroup-snap
  brw-------  1 root root 254, 10 29 ago 18:14 /dev/mapper/volumeGroup-base


How snapshot-merge is used by LVM2
==================================
A merging snapshot assumes the role of the "snapshot-origin" while
merging. As such the "snapshot-origin" is replaced with
"snapshot-merge".
The "-real" device is not changed and the "-cow"
device is renamed to <origin name>-cow to aid LVM2's cleanup of the
merging snapshot after it completes. The "snapshot" that hands over its
COW device to the "snapshot-merge" is deactivated (unless using lvchange
--refresh); but if it is left active it will simply return I/O errors.

A snapshot will merge into its origin with the following command::

  lvconvert --merge volumeGroup/snap

we'll now have this situation::

  # dmsetup table|grep volumeGroup

  volumeGroup-base-real: 0 2097152 linear 8:19 384
  volumeGroup-base-cow: 0 204800 linear 8:19 2097536
  volumeGroup-base: 0 2097152 snapshot-merge 254:11 254:12 P 16

  # ls -lL /dev/mapper/volumeGroup-*
  brw-------  1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
  brw-------  1 root root 254, 12 29 ago 18:16 /dev/mapper/volumeGroup-base-cow
  brw-------  1 root root 254, 10 29 ago 18:16 /dev/mapper/volumeGroup-base


How to determine when a merging is complete
===========================================
The snapshot-merge and snapshot status lines end with:

  <sectors_allocated>/<total_sectors> <metadata_sectors>

Both <sectors_allocated> and <total_sectors> include both data and metadata.
During merging, the number of sectors allocated gets smaller and
smaller. Merging has finished when the number of sectors holding data
is zero, in other words <sectors_allocated> == <metadata_sectors>.

Here is a practical example (using a hybrid of lvm and dmsetup commands)::

  # lvs
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup owi-a- 4.00g
    snap    volumeGroup swi-a- 1.00g base  18.97

  # dmsetup status volumeGroup-snap
  0 8388608 snapshot 397896/2097152 1560
                                    ^^^^ metadata sectors

  # lvconvert --merge -b volumeGroup/snap
    Merging of volume snap started.

  # lvs volumeGroup/snap
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup Owi-a- 4.00g          17.23

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 281688/2097152 1104

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 180480/2097152 712

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 16/2097152 16

Merging has finished.

::

  # lvs
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup owi-a- 4.00g
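
The completion rule above, <sectors_allocated> == <metadata_sectors>, can be
checked from a script. The following is a minimal sketch that parses a status
line passed in as a string; the function names ``cow_data_sectors`` and
``merge_done`` are hypothetical. It assumes the status line ends in the
numeric form shown above and treats any other form as "not finished".

```shell
# Hypothetical helper: print the number of COW sectors still holding data,
# i.e. <sectors_allocated> - <metadata_sectors>.
# $1: one status line as printed by "dmsetup status <device>".
cow_data_sectors() {
    printf '%s\n' "$1" | awk '
        # Field 4 is <sectors_allocated>/<total_sectors>, field 5 is
        # <metadata_sectors>; anything else is rejected.
        split($4, n, "/") == 2 && $5 ~ /^[0-9]+$/ {
            print n[1] - $5
            ok = 1
        }
        END { exit !ok }'
}

# Succeeds only when no data sectors remain in the COW device.
merge_done() {
    [ "$(cow_data_sectors "$1")" = 0 ]
}

# Status lines taken from the example above:
cow_data_sectors "0 8388608 snapshot-merge 281688/2097152 1104"
merge_done "0 8388608 snapshot-merge 16/2097152 16" && echo "merge complete"
```

A polling loop could call ``merge_done "$(dmsetup status volumeGroup-base)"``
at intervals. Since a non-numeric status never counts as finished here, such
a loop would need its own timeout or error check.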