==============================
Device-mapper snapshot support
==============================

Device-mapper allows you, without massive data copying:

-  To create snapshots of any block device, i.e. mountable, saved states of
   the block device which are also writable without interfering with the
   original content;
-  To create device "forks", i.e. multiple different versions of the
   same data stream;
-  To merge a snapshot of a block device back into the snapshot's origin
   device.

In the first two cases, dm copies only the chunks of data that get
changed and uses a separate copy-on-write (COW) block device for
storage.

For snapshot merge the contents of the COW storage are merged back into
the origin device.


There are three dm targets available:
snapshot, snapshot-origin, and snapshot-merge.

-  snapshot-origin <origin>

which will normally have one or more snapshots based on it.
Reads will be mapped directly to the backing device. For each write, the
original data will be saved in the <COW device> of each snapshot to keep
its visible content unchanged, at least until the <COW device> fills up.


-  snapshot <origin> <COW device> <persistent?> <chunksize>
   [<# feature args> [<arg>]*]

A snapshot of the <origin> block device is created. Changed chunks of
<chunksize> sectors will be stored on the <COW device>.  Writes will
only go to the <COW device>.  Reads will come from the <COW device> or
from <origin> for unchanged data.  <COW device> will often be
smaller than the origin and if it fills up the snapshot will become
useless and be disabled, returning errors.  So it is important to monitor
the amount of free space and expand the <COW device> before it fills up.
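
The read and write rules above can be illustrated with a small in-memory
model. This is only a sketch, not how the kernel implements the targets:
the ``ToyOrigin`` and ``ToySnapshot`` classes are hypothetical, every write
replaces a whole chunk, and the COW store is assumed never to fill up.

```python
class ToyOrigin:
    """Hypothetical stand-in for a snapshot-origin device."""

    def __init__(self, chunks):
        self.chunks = list(chunks)   # backing device contents, one item per chunk
        self.snapshots = []          # snapshots based on this origin

    def read(self, index):
        # Reads are mapped directly to the backing device.
        return self.chunks[index]

    def write(self, index, data):
        # Before overwriting, save the original chunk in the COW store of
        # every snapshot that does not already hold a copy of it.
        for snap in self.snapshots:
            snap.cow.setdefault(index, self.chunks[index])
        self.chunks[index] = data


class ToySnapshot:
    """Hypothetical stand-in for a snapshot device."""

    def __init__(self, origin):
        self.origin = origin
        self.cow = {}                # chunk index -> data held on the COW device
        origin.snapshots.append(self)

    def read(self, index):
        # Changed chunks come from the COW store, unchanged ones from the origin.
        if index in self.cow:
            return self.cow[index]
        return self.origin.read(index)

    def write(self, index, data):
        # Writes to the snapshot only ever go to the COW store.
        self.cow[index] = data


origin = ToyOrigin(["a", "b", "c"])
snap = ToySnapshot(origin)
origin.write(0, "A")    # the snapshot keeps seeing "a"
snap.write(2, "C")      # the origin keeps seeing "c"
print(origin.chunks)                      # ['A', 'b', 'c']
print([snap.read(i) for i in range(3)])   # ['a', 'b', 'C']
```

Only two chunks end up in the toy COW store, which is the reason the
<COW device> can usually be smaller than the origin.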

<persistent?> is P (Persistent) or N (Not persistent - will not survive
after reboot).  O (Overflow) can be added as a persistent store option
to allow userspace to advertise its support for seeing "Overflow" in the
snapshot status.  So supported store types are "P", "PO" and "N".

The difference between persistent and transient snapshots is that with
transient snapshots less metadata must be saved on disk - it can be kept
in memory by the kernel.

When loading or unloading the snapshot target, the corresponding
snapshot-origin or snapshot-merge target must be suspended. A failure to
suspend the origin target could result in data corruption.

Optional features:

   discard_zeroes_cow - a discard issued to the snapshot device that
   maps to entire chunks will zero the corresponding exception(s) in
   the snapshot's exception store.

   discard_passdown_origin - a discard to the snapshot device is passed
   down to the snapshot-origin's underlying device.  This doesn't cause
   copy-out to the snapshot exception store because the snapshot-origin
   target is bypassed.

   The discard_passdown_origin feature depends on the discard_zeroes_cow
   feature being enabled.
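
As an illustration of the argument rules described in this document, the
following sketch assembles a table line and rejects the combinations the
text rules out. The helper ``snapshot_table`` and its parameter names are
hypothetical; only the store types, the feature names and the argument
order come from this document.

```python
STORE_TYPES = ("P", "PO", "N")
KNOWN_FEATURES = ("discard_zeroes_cow", "discard_passdown_origin")


def snapshot_table(length, origin, cow, store, chunksize,
                   features=(), target="snapshot", start=0):
    """Return '<start> <length> <target> <args...>' (hypothetical helper)."""
    if target not in ("snapshot", "snapshot-merge"):
        raise ValueError(f"unsupported target: {target}")
    if store not in STORE_TYPES:
        raise ValueError(f"store type must be one of {STORE_TYPES}")
    if target == "snapshot-merge" and store == "N":
        # snapshot-merge only works with persistent snapshots.
        raise ValueError("snapshot-merge needs a persistent store")
    if chunksize <= 0:
        raise ValueError("chunksize is a positive number of sectors")
    unknown = [f for f in features if f not in KNOWN_FEATURES]
    if unknown:
        raise ValueError(f"unknown feature(s): {unknown}")
    if ("discard_passdown_origin" in features
            and "discard_zeroes_cow" not in features):
        raise ValueError("discard_passdown_origin depends on discard_zeroes_cow")

    fields = [start, length, target, origin, cow, store, chunksize]
    if features:
        # Optional part: the number of feature args, then the args themselves.
        fields += [len(features), *features]
    return " ".join(str(field) for field in fields)


print(snapshot_table(2097152, "254:11", "254:12", "P", 16))
# 0 2097152 snapshot 254:11 254:12 P 16
print(snapshot_table(2097152, "254:11", "254:12", "P", 16,
                     features=("discard_zeroes_cow",)))
# 0 2097152 snapshot 254:11 254:12 P 16 1 discard_zeroes_cow
```

The checks only mirror the constraints stated in the text; the kernel may
enforce further ones that this sketch does not know about.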


-  snapshot-merge <origin> <COW device> <persistent?> <chunksize>
   [<# feature args> [<arg>]*]

takes the same table arguments as the snapshot target except it only
works with persistent snapshots.  This target assumes the role of the
"snapshot-origin" target and must not be loaded if the "snapshot-origin"
is still present for <origin>.

Creates a merging snapshot that takes control of the changed chunks
stored in the <COW device> of an existing snapshot, through a handover
procedure, and merges these chunks back into the <origin>.  Once merging
has started (in the background) the <origin> may be opened and the merge
will continue while I/O is flowing to it.  Changes to the <origin> are
deferred until the merging snapshot's corresponding chunk(s) have been
merged.  Once merging has started the snapshot device, associated with
the "snapshot" target, will return -EIO when accessed.


How snapshot is used by LVM2
============================
When you create the first LVM2 snapshot of a volume, four dm devices are used:

1) a device containing the original mapping table of the source volume;
2) a device used as the <COW device>;
3) a "snapshot" device, combining #1 and #2, which is the visible snapshot
   volume;
4) the "original" volume (which uses the device number used by the original
   source volume), whose table is replaced by a "snapshot-origin" mapping
   from device #1.

A fixed naming scheme is used, so with the following commands::

  lvcreate -L 1G -n base volumeGroup
  lvcreate -L 100M --snapshot -n snap volumeGroup/base

we'll have this situation (with volumes in above order)::

  # dmsetup table|grep volumeGroup

  volumeGroup-base-real: 0 2097152 linear 8:19 384
  volumeGroup-snap-cow: 0 204800 linear 8:19 2097536
  volumeGroup-snap: 0 2097152 snapshot 254:11 254:12 P 16
  volumeGroup-base: 0 2097152 snapshot-origin 254:11

  # ls -lL /dev/mapper/volumeGroup-*
  brw-------  1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
  brw-------  1 root root 254, 12 29 ago 18:15 /dev/mapper/volumeGroup-snap-cow
  brw-------  1 root root 254, 13 29 ago 18:15 /dev/mapper/volumeGroup-snap
  brw-------  1 root root 254, 10 29 ago 18:14 /dev/mapper/volumeGroup-base
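
To relate the numbers in the table listing to the sizes given to lvcreate,
the lines can be split into fields and the sector counts converted to
bytes. The sketch below assumes the usual device-mapper unit of 512-byte
sectors; ``parse_table_line`` and ``mib`` are hypothetical helper names.

```python
SECTOR_BYTES = 512  # assumption: table fields count 512-byte sectors


def parse_table_line(line):
    """Split one 'name: start length target args...' line into a dict."""
    name, rest = line.split(":", 1)          # only the first colon ends the name
    start, length, target, *args = rest.split()
    return {"name": name.strip(), "start": int(start),
            "length": int(length), "target": target, "args": args}


def mib(sectors):
    """Convert a sector count to MiB."""
    return sectors * SECTOR_BYTES / 2**20


listing = """\
volumeGroup-base-real: 0 2097152 linear 8:19 384
volumeGroup-snap-cow: 0 204800 linear 8:19 2097536
volumeGroup-snap: 0 2097152 snapshot 254:11 254:12 P 16
volumeGroup-base: 0 2097152 snapshot-origin 254:11
"""

devices = [parse_table_line(line) for line in listing.splitlines()]
for dev in devices:
    print(f'{dev["name"]:<22} {dev["target"]:<16} {mib(dev["length"]):7.1f} MiB')

snap = devices[2]
chunk_bytes = int(snap["args"][3]) * SECTOR_BYTES   # <chunksize> is in sectors
print("chunk size:", chunk_bytes, "bytes")          # 16 * 512 = 8192
```

Under that assumption the 2097152-sector devices are 1024 MiB (the 1G base
volume) and the 204800-sector COW device is 100 MiB. The COW extent starts
at sector 2097536 = 384 + 2097152 of 8:19, directly after the extent used
by the base volume.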


How snapshot-merge is used by LVM2
==================================
A merging snapshot assumes the role of the "snapshot-origin" while
merging.  As such the "snapshot-origin" is replaced with
"snapshot-merge".  The "-real" device is not changed and the "-cow"
device is renamed to <origin name>-cow to aid LVM2's cleanup of the
merging snapshot after it completes.  The "snapshot" that hands over its
COW device to the "snapshot-merge" is deactivated (unless using lvchange
--refresh); but if it is left active it will simply return I/O errors.

A snapshot will merge into its origin with the following command::

  lvconvert --merge volumeGroup/snap

we'll now have this situation::

  # dmsetup table|grep volumeGroup

  volumeGroup-base-real: 0 2097152 linear 8:19 384
  volumeGroup-base-cow: 0 204800 linear 8:19 2097536
  volumeGroup-base: 0 2097152 snapshot-merge 254:11 254:12 P 16

  # ls -lL /dev/mapper/volumeGroup-*
  brw-------  1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
  brw-------  1 root root 254, 12 29 ago 18:16 /dev/mapper/volumeGroup-base-cow
  brw-------  1 root root 254, 10 29 ago 18:16 /dev/mapper/volumeGroup-base


How to determine when a merging is complete
===========================================
The snapshot-merge and snapshot status lines end with:

  <sectors_allocated>/<total_sectors> <metadata_sectors>

Both <sectors_allocated> and <total_sectors> include both data and metadata.
During merging, the number of sectors allocated gets smaller and
smaller.  Merging has finished when the number of sectors holding data
is zero, in other words <sectors_allocated> == <metadata_sectors>.
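
The completion test can be written down as a short check on a status line.
This is a sketch under the assumption that the line ends with the three
counters described above; the function names are hypothetical, and any line
without those counters (for example one reporting "Overflow") is rejected
rather than interpreted.

```python
def parse_snapshot_status(line):
    """Parse '<start> <length> <target> <allocated>/<total> <metadata>'."""
    try:
        fields = line.split()
        target = fields[2]
        allocated, total = (int(n) for n in fields[-2].split("/"))
        metadata = int(fields[-1])
    except (ValueError, IndexError):
        # Assumption: lines without usage counters are left to the caller.
        raise ValueError(f"no usage counters in status line: {line!r}") from None
    return {"target": target, "allocated": allocated,
            "total": total, "metadata": metadata}


def merge_finished(status):
    # No sectors hold data once only metadata sectors remain allocated.
    return status["allocated"] == status["metadata"]


def percent_used(status):
    return 100.0 * status["allocated"] / status["total"]


# Status lines taken from the example session in this document.
before = parse_snapshot_status("0 8388608 snapshot 397896/2097152 1560")
during = parse_snapshot_status("0 8388608 snapshot-merge 281688/2097152 1104")
after = parse_snapshot_status("0 8388608 snapshot-merge 16/2097152 16")

print(f"{percent_used(before):.2f}")   # 18.97
print(merge_finished(during))          # False
print(merge_finished(after))           # True
```

The first figure agrees with the Snap% value that lvs reports for the same
snapshot in the example session.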

Here is a practical example (using a hybrid of lvm and dmsetup commands)::

  # lvs
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup owi-a- 4.00g
    snap    volumeGroup swi-a- 1.00g base  18.97

  # dmsetup status volumeGroup-snap
  0 8388608 snapshot 397896/2097152 1560
                                    ^^^^ metadata sectors

  # lvconvert --merge -b volumeGroup/snap
    Merging of volume snap started.

  # lvs volumeGroup/snap
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup Owi-a- 4.00g          17.23

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 281688/2097152 1104

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 180480/2097152 712

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 16/2097152 16

Merging has finished.

::

  # lvs
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup owi-a- 4.00g