=======================
Qcow2 Image File Format
=======================

A ``qcow2`` image file is organized in units of constant size, which are called
(host) clusters. A cluster is the unit in which all allocations are done,
both for actual guest data and for image metadata.

Likewise, the virtual disk as seen by the guest is divided into (guest)
clusters of the same size.

All numbers in qcow2 are stored in Big Endian byte order.

Header
------

The first cluster of a qcow2 image contains the file header::

    Byte  0 -  3:   magic
                    QCOW magic string ("QFI\xfb")

          4 -  7:   version
                    Version number (valid values are 2 and 3)

          8 - 15:   backing_file_offset
                    Offset into the image file at which the backing file name
                    is stored (NB: The string is not null terminated). 0 if the
                    image doesn't have a backing file.

                    Note: backing files are incompatible with raw external data
                    files (auto-clear feature bit 1).

         16 - 19:   backing_file_size
                    Length of the backing file name in bytes. Must not be
                    longer than 1023 bytes. Undefined if the image doesn't have
                    a backing file.

         20 - 23:   cluster_bits
                    Number of bits that are used for addressing an offset
                    within a cluster (1 << cluster_bits is the cluster size).
                    Must not be less than 9 (i.e. 512 byte clusters).

                    Note: QEMU as of today has an implementation limit of 2 MB
                    as the maximum cluster size and won't be able to open images
                    with larger cluster sizes.

                    Note: if the image has Extended L2 Entries then cluster_bits
                    must be at least 14 (i.e. 16384 byte clusters).

         24 - 31:   size
                    Virtual disk size in bytes.

                    Note: QEMU has an implementation limit of 32 MB as
                    the maximum L1 table size.  With a 2 MB cluster
                    size, it is unable to populate a virtual cluster
                    beyond 2 EB (61 bits); with a 512 byte cluster
                    size, it is unable to populate a virtual size
                    larger than 128 GB (37 bits).  Meanwhile, L1/L2
                    table layouts limit an image to no more than 64 PB
                    (56 bits) of populated clusters, and an image may
                    hit other limits first (such as a file system's
                    maximum size).

         32 - 35:   crypt_method
                    0 for no encryption
                    1 for AES encryption
                    2 for LUKS encryption

         36 - 39:   l1_size
                    Number of entries in the active L1 table

         40 - 47:   l1_table_offset
                    Offset into the image file at which the active L1 table
                    starts. Must be aligned to a cluster boundary.

         48 - 55:   refcount_table_offset
                    Offset into the image file at which the refcount table
                    starts. Must be aligned to a cluster boundary.

         56 - 59:   refcount_table_clusters
                    Number of clusters that the refcount table occupies

         60 - 63:   nb_snapshots
                    Number of snapshots contained in the image

         64 - 71:   snapshots_offset
                    Offset into the image file at which the snapshot table
                    starts. Must be aligned to a cluster boundary.

For version 2, the header is exactly 72 bytes in length, and finishes here.
For version 3 or higher, the header length is at least 104 bytes, including
the next fields through ``header_length``.
::

         72 -  79:  incompatible_features
                    Bitmask of incompatible features. An implementation must
                    fail to open an image if an unknown bit is set.

                    Bit 0:      Dirty bit.  If this bit is set then refcounts
                                may be inconsistent, make sure to scan L1/L2
                                tables to repair refcounts before accessing the
                                image.

                    Bit 1:      Corrupt bit.  If this bit is set then any data
                                structure may be corrupt and the image must not
                                be written to (unless for regaining
                                consistency).

                    Bit 2:      External data file bit.  If this bit is set, an
                                external data file is used. Guest clusters are
                                then stored in the external data file. For such
                                images, clusters in the external data file are
                                not refcounted. The offset field in the
                                Standard Cluster Descriptor must match the
                                guest offset and neither compressed clusters
                                nor internal snapshots are supported.

                                An External Data File Name header extension may
                                be present if this bit is set.

                    Bit 3:      Compression type bit.  If this bit is set,
                                a non-default compression is used for compressed
                                clusters. The compression_type field must be
                                present and not zero.

                    Bit 4:      Extended L2 Entries.  If this bit is set then
                                L2 table entries use an extended format that
                                allows subcluster-based allocation. See the
                                Extended L2 Entries section for more details.

                    Bits 5-63:  Reserved (set to 0)

         80 -  87:  compatible_features
                    Bitmask of compatible features. An implementation can
                    safely ignore any unknown bits that are set.

                    Bit 0:      Lazy refcounts bit.  If this bit is set then
                                lazy refcount updates can be used.  This means
                                marking the image file dirty and postponing
                                refcount metadata updates.

                    Bits 1-63:  Reserved (set to 0)

         88 -  95:  autoclear_features
                    Bitmask of auto-clear features. An implementation may only
                    write to an image with unknown auto-clear features if it
                    clears the respective bits from this field first.

                    Bit 0:      Bitmaps extension bit
                                This bit indicates consistency for the bitmaps
                                extension data.

                                It is an error if this bit is set without the
                                bitmaps extension present.

                                If the bitmaps extension is present but this
                                bit is unset, the bitmaps extension data must be
                                considered inconsistent.

                    Bit 1:      Raw external data bit
                                If this bit is set, the external data file can
                                be read as a consistent standalone raw image
                                without looking at the qcow2 metadata.

                                Setting this bit has a performance impact for
                                some operations on the image (e.g. writing
                                zeros requires writing to the data file instead
                                of only setting the zero flag in the L2 table
                                entry) and conflicts with backing files.

                                This bit may only be set if the External Data
                                File bit (incompatible feature bit 2) is also
                                set.

                    Bits 2-63:  Reserved (set to 0)

         96 -  99:  refcount_order
                    Describes the width of a reference count block entry (width
                    in bits: refcount_bits = 1 << refcount_order). For version 2
                    images, the order is always assumed to be 4
                    (i.e. refcount_bits = 16).
                    This value may not exceed 6 (i.e. refcount_bits = 64).

        100 - 103:  header_length
                    Length of the header structure in bytes. For version 2
                    images, the length is always assumed to be 72 bytes.
                    For version 3 it's at least 104 bytes and must be a multiple
                    of 8.

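As an illustration of the layout above, the following is a minimal sketch of
reading the fixed header fields with Python's ``struct`` module. The names
``parse_header``, ``V2_FORMAT``, ``V3_FORMAT`` and ``KNOWN_INCOMPATIBLE`` are
made up for this example and are not part of QEMU. Treating the feature
bitmasks as 0 for version 2 images is an assumption of the sketch, based on
those fields not existing in a 72-byte header.

.. code:: python

    import struct

    V2_FORMAT = ">4sIQIIQIIQQIIQ"   # bytes 0-71, big endian, no padding
    V2_NAMES = ("magic", "version", "backing_file_offset", "backing_file_size",
                "cluster_bits", "size", "crypt_method", "l1_size",
                "l1_table_offset", "refcount_table_offset",
                "refcount_table_clusters", "nb_snapshots", "snapshots_offset")
    V3_FORMAT = ">QQQII"            # bytes 72-103
    V3_NAMES = ("incompatible_features", "compatible_features",
                "autoclear_features", "refcount_order", "header_length")
    KNOWN_INCOMPATIBLE = 0b11111    # bits 0-4 are the ones defined above


    def parse_header(buf):
        """Decode the fixed header fields from the start of an image."""
        hdr = dict(zip(V2_NAMES, struct.unpack_from(V2_FORMAT, buf, 0)))
        if hdr["magic"] != b"QFI\xfb":
            raise ValueError("not a qcow2 image")
        if hdr["version"] == 2:
            # Version 2 stops at byte 72; these values are implied.
            hdr.update(incompatible_features=0, compatible_features=0,
                       autoclear_features=0, refcount_order=4,
                       header_length=72)
        elif hdr["version"] == 3:
            hdr.update(zip(V3_NAMES, struct.unpack_from(V3_FORMAT, buf, 72)))
        else:
            raise ValueError("unsupported version %d" % hdr["version"])
        # An unknown incompatible bit means the image must not be opened.
        unknown = hdr["incompatible_features"] & ~KNOWN_INCOMPATIBLE
        if unknown:
            raise ValueError("unknown incompatible feature bits: %#x" % unknown)
        hdr["cluster_size"] = 1 << hdr["cluster_bits"]
        hdr["refcount_bits"] = 1 << hdr["refcount_order"]
        return hdr

Unknown compatible bits are deliberately not checked, since they can be
ignored safely. A writer would additionally have to clear unknown auto-clear
bits before modifying the image.
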

Additional fields (version 3 and higher)
----------------------------------------

In general, these fields are optional and may be safely ignored by software.
They may also be filled with zeros (which is equivalent to the field being
absent) when software needs to set field B but does not care about field A,
which precedes B. More formally, additional fields have the following
compatibility rules:

1. If the value of the additional field must not be ignored for correct
   handling of the file, it will be accompanied by a corresponding incompatible
   feature bit.

2. If there are no unrecognized incompatible feature bits set, an unknown
   additional field may be safely ignored other than preserving its value when
   rewriting the image header.

.. _ref_rules_3:

3. An explicit value of 0 will have the same behavior as when the field is not
   present*, if not altered by a specific incompatible bit.

(*) A field is considered not present when ``header_length`` is less than or equal
to the field's offset. Also, all additional fields are not present for
version 2.

::

        104:        compression_type

                    Defines the compression method used for compressed clusters.
                    All compressed clusters in an image use the same compression
                    type.

                    If the incompatible bit "Compression type" is set: the field
                    must be present and non-zero (which means non-deflate
                    compression type). Otherwise, this field must not be present
                    or must be zero (which means deflate).

                    Available compression type values:
                       - 0: deflate <https://www.ietf.org/rfc/rfc1951.txt>
                       - 1: zstd <http://github.com/facebook/zstd>

                    The deflate compression type is called "zlib"
                    <https://www.zlib.net/> in QEMU. However, clusters with the
                    deflate compression type do not have zlib headers.

        105 - 111:  Padding, contents defined below.

Header padding
--------------

``header_length`` must be a multiple of 8, which means that if the end of the last
additional field is not aligned, some padding is needed. This padding must be
zeroed, so that if some existing (or future) additional field falls into
the padding, it will be interpreted according to point `[3.] <#ref_rules_3>`_ of the previous
section, i.e. in the same manner as when this field is not present.

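The "not present" rule and the zero-padding rule can be seen working together
in the case of ``compression_type``. The sketch below is only an illustration;
the function and constant names are hypothetical. It treats an absent field
exactly like an explicit 0 and then applies the consistency check against
incompatible feature bit 3.

.. code:: python

    COMPRESSION_TYPE_OFFSET = 104
    INCOMPAT_COMPRESSION_TYPE = 1 << 3
    COMPRESSION_NAMES = {0: "deflate", 1: "zstd"}


    def read_compression_type(header, header_length, incompatible_features):
        # The field exists only if header_length is greater than its offset;
        # an absent field behaves like the value 0.
        present = header_length > COMPRESSION_TYPE_OFFSET
        value = header[COMPRESSION_TYPE_OFFSET] if present else 0
        bit_set = bool(incompatible_features & INCOMPAT_COMPRESSION_TYPE)
        if bit_set and value == 0:
            raise ValueError("compression type bit set, but field absent or 0")
        if not bit_set and value != 0:
            raise ValueError("non-zero compression_type without the feature bit")
        return value

With a 104-byte header the function returns 0 (deflate) without looking at
byte 104 at all, which is the behavior the padding rule is designed to
preserve.
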

Header extensions
-----------------

Directly after the image header, optional sections called header extensions can
be stored. Each extension has a structure like the following::

    Byte  0 -  3:   Header extension type:
                        0x00000000 - End of the header extension area
                        0xe2792aca - Backing file format name string
                        0x6803f857 - Feature name table
                        0x23852875 - Bitmaps extension
                        0x0537be77 - Full disk encryption header pointer
                        0x44415441 - External data file name string
                        other      - Unknown header extension, can be safely
                                     ignored

          4 -  7:   Length of the header extension data

          8 -  n:   Header extension data

          n -  m:   Padding to round up the header extension size to the next
                    multiple of 8.

Unless stated otherwise, each header extension type shall appear at most once
in the same image.

If the image has a backing file then the backing file name should be stored in
the remaining space between the end of the header extension area and the end of
the first cluster. It is not allowed to store other data here, so that an
implementation can safely modify the header and add extensions without harming
data of compatible features that it doesn't support. Compatible features that
need space for additional data can use a header extension.

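A reader can walk the header extension area with a simple loop. The sketch
below is an illustration under two assumptions: the extension area starts at
``header_length`` and lies entirely within the first cluster, which is passed
in as ``first_cluster``. The function name is hypothetical.

.. code:: python

    import struct


    def iter_header_extensions(first_cluster, header_length):
        """Yield (type, data) for each extension until the end marker."""
        pos = header_length
        while pos + 8 <= len(first_cluster):
            ext_type, length = struct.unpack_from(">II", first_cluster, pos)
            if ext_type == 0:
                break                      # end of the header extension area
            start = pos + 8
            if start + length > len(first_cluster):
                raise ValueError("header extension overruns the first cluster")
            yield ext_type, bytes(first_cluster[start:start + length])
            # The data is padded so that the next extension is 8-byte aligned.
            pos = start + ((length + 7) & ~7)

Unknown types are yielded like any other, so the caller can skip them, as the
format allows.
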

String header extensions
------------------------

Some header extensions (such as the backing file format name and the external
data file name) are just a single string. In this case, the header extension
length is the string length and the string is not ``\0`` terminated. (The header
extension padding can make it look like a string is ``\0`` terminated, but
neither is padding always necessary nor is there a guarantee that zero bytes
are used for padding.)


Feature name table
------------------

The feature name table is an optional header extension that contains the name
for features used by the image. It can be used by applications that don't know
the respective feature (e.g. because the feature was introduced only later) to
display a useful error message.

The number of entries in the feature name table is determined by the length of
the header extension data. Each entry looks like this::

    Byte       0:   Type of feature (select feature bitmap)
                        0: Incompatible feature
                        1: Compatible feature
                        2: Autoclear feature

               1:   Bit number within the selected feature bitmap (valid
                    values: 0-63)

          2 - 47:   Feature name (padded with zeros, but not necessarily null
                    terminated if it has full length)

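Since every entry is 48 bytes long, the table can be decoded by slicing the
extension data. The following is a small sketch; ``parse_feature_name_table``
is a hypothetical helper, and decoding the name as ASCII is an assumption of
this example.

.. code:: python

    ENTRY_SIZE = 48
    FEATURE_KINDS = {0: "incompatible", 1: "compatible", 2: "autoclear"}


    def parse_feature_name_table(data):
        """Map (feature kind, bit number) to the feature name."""
        table = {}
        for pos in range(0, len(data) - ENTRY_SIZE + 1, ENTRY_SIZE):
            kind, bit = data[pos], data[pos + 1]
            raw_name = data[pos + 2:pos + ENTRY_SIZE]
            # Zero padded, but a full-length name has no terminator.
            name = raw_name.split(b"\0", 1)[0].decode("ascii", "replace")
            table[(FEATURE_KINDS.get(kind, kind), bit)] = name
        return table

An application that refuses an image because of an unknown incompatible bit
could look the bit up in this mapping to produce a readable error message.
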

Bitmaps extension
-----------------

The bitmaps extension is an optional header extension. It provides the ability
to store bitmaps related to a virtual disk. For now, there is only one bitmap
type: the dirty tracking bitmap, which tracks virtual disk changes from some
point in time.

The data of the extension should be considered consistent only if the
corresponding auto-clear feature bit is set, see ``autoclear_features`` above.

The fields of the bitmaps extension are::

    Byte  0 -  3:  nb_bitmaps
                   The number of bitmaps contained in the image. Must be
                   greater than or equal to 1.

                   Note: QEMU currently only supports up to 65535 bitmaps per
                   image.

          4 -  7:  Reserved, must be zero.

          8 - 15:  bitmap_directory_size
                   Size of the bitmap directory in bytes. It is the cumulative
                   size of all (nb_bitmaps) bitmap directory entries.

         16 - 23:  bitmap_directory_offset
                   Offset into the image file at which the bitmap directory
                   starts. Must be aligned to a cluster boundary.

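The extension data is a fixed 24-byte record, so decoding and validating it
takes only a few lines. This is a sketch with a hypothetical function name;
the checks are the ones stated in the field descriptions above.

.. code:: python

    import struct


    def parse_bitmaps_extension(data, cluster_size):
        """Return (nb_bitmaps, directory_size, directory_offset)."""
        nb_bitmaps, reserved, dir_size, dir_offset = struct.unpack_from(
            ">IIQQ", data, 0)
        if nb_bitmaps < 1:
            raise ValueError("nb_bitmaps must be at least 1")
        if reserved != 0:
            raise ValueError("reserved field must be zero")
        if dir_offset % cluster_size:
            raise ValueError("bitmap directory is not cluster aligned")
        return nb_bitmaps, dir_size, dir_offset

The result should only be trusted if auto-clear feature bit 0 is set in the
header.
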

Full disk encryption header pointer
-----------------------------------

The full disk encryption header must be present if, and only if, the
``crypt_method`` header requires metadata. Currently this is only true
of the ``LUKS`` crypt method. The header extension must be absent for
other methods.

This header provides the offset at which the crypt method can store
its additional data, as well as the length of such data.
::

    Byte  0 -  7:   Offset into the image file at which the encryption
                    header starts in bytes. Must be aligned to a cluster
                    boundary.
    Byte  8 - 15:   Length of the written encryption header in bytes.
                    Note that the actual space allocated in the qcow2 file
                    may be larger than this value, since it is rounded up
                    to a multiple of the cluster size. Any unused bytes in
                    the allocated space will be initialized to 0.

For the LUKS crypt method, the encryption header works as follows.

The first 592 bytes of the header clusters will contain the LUKS
partition header. This is then followed by the key material data areas.
The size of the key material data areas is determined by the number of
stripes in the key slot and key size. Refer to the LUKS format
specification (``docs/on-disk-format.pdf`` in the cryptsetup source
package) for details of the LUKS partition header format.

In the LUKS partition header, the ``payload-offset`` field will be
calculated as normal for the LUKS spec, i.e. the size of the LUKS
header, plus key material regions, plus padding, relative to the
start of the LUKS header. This offset value is not required to be
qcow2 cluster aligned. Its value is currently never used in the
context of qcow2, since the qcow2 file format itself defines where
the real payload offset is, but nonetheless a valid payload offset
should always be present.

In the LUKS key slots header, the ``key-material-offset`` is relative
to the start of the LUKS header clusters in the qcow2 container,
not the start of the qcow2 file.

Logically the layout looks like
::

  +-----------------------------+
  | QCow2 header                |
  | QCow2 header extension X    |
  | QCow2 header extension FDE  |
  | QCow2 header extension ...  |
  | QCow2 header extension Z    |
  +-----------------------------+
  | ....other QCow2 tables....  |
  .                             .
  .                             .
  +-----------------------------+
  | +-------------------------+ |
  | | LUKS partition header   | |
  | +-------------------------+ |
  | | LUKS key material 1     | |
  | +-------------------------+ |
  | | LUKS key material 2     | |
  | +-------------------------+ |
  | | LUKS key material ...   | |
  | +-------------------------+ |
  | | LUKS key material 8     | |
  | +-------------------------+ |
  +-----------------------------+
  | QCow2 cluster payload       |
  .                             .
  .                             .
  .                             .
  |                             |
  +-----------------------------+

Data encryption
---------------

When an encryption method is requested in the header, the image payload
data must be encrypted/decrypted on every write/read. The image headers
and metadata are never encrypted.

The algorithms used for encryption vary depending on the method:

 - ``AES``:

   The AES cipher, in CBC mode, with 256 bit keys.

   Initialization vectors are generated using the plain64 method, with
   the virtual disk sector as the input tweak.

   This format is no longer supported in QEMU system emulators, due
   to a number of design flaws affecting its security. It is only
   supported in the command line tools for the sake of backward
   compatibility and data liberation.

 - ``LUKS``:

   The algorithms are specified in the LUKS header.

   Initialization vectors are generated using the method specified
   in the LUKS header, with the physical disk sector as the
   input tweak.

Host cluster management
-----------------------

qcow2 manages the allocation of host clusters by maintaining a reference count
for each host cluster. A refcount of 0 means that the cluster is free, 1 means
that it is used, and >= 2 means that it is used and any write access must
perform a COW (copy on write) operation.

The refcounts are managed in a two-level table. The first level is called the
refcount table and has a variable size (which is stored in the header). The
refcount table can cover multiple clusters, however it needs to be contiguous
in the image file.

It contains pointers to the second level structures which are called refcount
blocks and are exactly one cluster in size.

Although a large enough refcount table can reserve clusters past 64 PB
(56 bits) (assuming the underlying protocol can even be sized that
large), note that some qcow2 metadata such as L1/L2 tables must point
to clusters prior to that point.

.. note::
    QEMU has an implementation limit of 8 MB as the maximum refcount
    table size.  With a 2 MB cluster size and a default refcount_order of
    4, it is unable to reference host resources beyond 2 EB (61 bits); in
    the worst case, with a 512 byte cluster size and refcount_order of 6,
    it is unable to access beyond 32 GB (35 bits).

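The figures in the note follow from the table geometry: each 64-bit refcount
table entry points to one refcount block, and each block holds
``cluster_size * 8 / refcount_bits`` refcounts. A short arithmetic sketch (the
function name is illustrative, and the 8 MB default is simply the limit quoted
in the note):

.. code:: python

    def max_refcounted_bytes(cluster_bits, refcount_order, table_bytes=8 << 20):
        """Host bytes that a refcount table of table_bytes can cover."""
        cluster_size = 1 << cluster_bits
        table_entries = table_bytes // 8               # 64-bit table entries
        block_entries = (cluster_size * 8) >> refcount_order
        return table_entries * block_entries * cluster_size


    # 2 MB clusters, 16-bit refcounts: 2 EB (61 bits)
    assert max_refcounted_bytes(21, 4) == 1 << 61
    # 512 byte clusters, 64-bit refcounts: 32 GB (35 bits)
    assert max_refcounted_bytes(9, 6) == 1 << 35
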

Given an offset into the image file, the refcount of its cluster can be
obtained as follows::

    refcount_block_entries = (cluster_size * 8 / refcount_bits)

    refcount_block_index = (offset / cluster_size) % refcount_block_entries
    refcount_table_index = (offset / cluster_size) / refcount_block_entries

    refcount_block = load_cluster(refcount_table[refcount_table_index]);
    return refcount_block[refcount_block_index];

Refcount table entry::

    Bit  0 -  8:    Reserved (set to 0)

         9 - 63:    Bits 9-63 of the offset into the image file at which the
                    refcount block starts. Must be aligned to a cluster
                    boundary.

                    If this is 0, the corresponding refcount block has not yet
                    been allocated. All refcounts managed by this refcount block
                    are 0.

Refcount block entry ``(x = refcount_bits - 1)``::

    Bit  0 -  x:    Reference count of the cluster. If refcount_bits implies a
                    sub-byte width, note that bit 0 means the least significant
                    bit in this context.

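The lookup above can be sketched in Python as follows. This is an
illustration only: ``image`` is assumed to be the whole image file held in a
bytes-like object, the function name and the ``refcount_table_entries``
parameter are hypothetical, and returning 0 for clusters beyond the table is a
choice of this sketch. For sub-byte widths, the sketch places entry 0 in the
least significant bits of each byte, which is how the least-significant-bit
remark above is interpreted here.

.. code:: python

    def get_refcount(image, offset, cluster_bits, refcount_order,
                     refcount_table_offset, refcount_table_entries):
        cluster_size = 1 << cluster_bits
        refcount_bits = 1 << refcount_order
        block_entries = cluster_size * 8 // refcount_bits

        cluster_index = offset >> cluster_bits
        block_index = cluster_index % block_entries
        table_index = cluster_index // block_entries
        if table_index >= refcount_table_entries:
            return 0                      # not covered by the refcount table

        pos = refcount_table_offset + 8 * table_index
        entry = int.from_bytes(image[pos:pos + 8], "big")
        block_offset = entry & ~0x1FF     # bits 9-63, bits 0-8 are reserved
        if block_offset == 0:
            return 0                      # refcount block not allocated

        if refcount_bits >= 8:
            width = refcount_bits // 8
            pos = block_offset + block_index * width
            return int.from_bytes(image[pos:pos + width], "big")

        # Sub-byte refcounts (1, 2 or 4 bits) share a byte.
        bit_pos = block_index * refcount_bits
        byte = image[block_offset + bit_pos // 8]
        return (byte >> (bit_pos % 8)) & ((1 << refcount_bits) - 1)

A real implementation would read only the needed table entry and refcount
block from the file instead of holding the whole image in memory.
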

Cluster mapping
---------------

Just as for refcounts, qcow2 uses a two-level structure for the mapping of
guest clusters to host clusters. These are called the L1 and L2 tables.

The L1 table has a variable size (stored in the header) and may use multiple
clusters, however it must be contiguous in the image file. L2 tables are
exactly one cluster in size.

The L1 and L2 tables have implications on the maximum virtual file
size; for a given L1 table size, a larger cluster size is required for
the guest to have access to more space.  Furthermore, a virtual
cluster must currently map to a host offset below 64 PB (56 bits)
(although this limit could be relaxed by putting reserved bits into
use).  Additionally, as cluster size increases, the maximum host
offset for a compressed cluster is reduced (a 2M cluster size requires
compressed clusters to reside below 512 TB (49 bits), and this limit
cannot be relaxed without an incompatible layout change).

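To make the relationship between L1 table size, cluster size and virtual size
concrete, here is a small arithmetic sketch. The function name is
illustrative; the 32 MB default is QEMU's L1 table limit quoted in the header
section, and standard 64-bit L2 entries are assumed.

.. code:: python

    def max_virtual_size(cluster_bits, l1_table_bytes=32 << 20):
        """Guest bytes addressable through an L1 table of l1_table_bytes."""
        cluster_size = 1 << cluster_bits
        l1_entries = l1_table_bytes // 8      # 64-bit L1 entries
        l2_entries = cluster_size // 8        # 64-bit L2 entries per table
        return l1_entries * l2_entries * cluster_size


    assert max_virtual_size(21) == 1 << 61    # 2 MB clusters: 2 EB
    assert max_virtual_size(9) == 1 << 37     # 512 byte clusters: 128 GB
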

Given an offset into the virtual disk, the offset into the image file can be
obtained as follows::

    l2_entries = (cluster_size / sizeof(uint64_t))        [*]

    l2_index = (offset / cluster_size) % l2_entries
    l1_index = (offset / cluster_size) / l2_entries

    l2_table = load_cluster(l1_table[l1_index]);
    cluster_offset = l2_table[l2_index];

    return cluster_offset + (offset % cluster_size)

    [*] this changes if Extended L2 Entries are enabled, see next section

L1 table entry::

    Bit  0 -  8:    Reserved (set to 0)

         9 - 55:    Bits 9-55 of the offset into the image file at which the L2
                    table starts. Must be aligned to a cluster boundary. If the
                    offset is 0, the L2 table and all clusters described by this
                    L2 table are unallocated.

        56 - 62:    Reserved (set to 0)

             63:    0 for an L2 table that is unused or requires COW, 1 if its
                    refcount is exactly one. This information is only accurate
                    in the active L1 table.

L2 table entry::

    Bit  0 -  61:   Cluster descriptor

              62:   0 for standard clusters
                    1 for compressed clusters

              63:   0 for clusters that are unused, compressed or require COW.
                    1 for standard clusters whose refcount is exactly one.
                    This information is only accurate in L2 tables
                    that are reachable from the active L1 table.

                    With external data files, all guest clusters have an
                    implicit refcount of 1 (because of the fixed host = guest
                    mapping for guest cluster offsets), so this bit should be 1
                    for all allocated clusters.

Standard Cluster Descriptor::

    Bit       0:    If set to 1, the cluster reads as all zeros. The host
                    cluster offset can be used to describe a preallocation,
                    but it won't be used for reading data from this cluster,
                    nor is data read from the backing file if the cluster is
                    unallocated.

                    With version 2 or with extended L2 entries (see the next
                    section), this is always 0.

         1 -  8:    Reserved (set to 0)

         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
                    cluster boundary. If the offset is 0 and bit 63 is clear,
                    the cluster is unallocated. The offset may only be 0 with
                    bit 63 set (indicating a host cluster offset of 0) when an
                    external data file is used.

        56 - 61:    Reserved (set to 0)

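Putting the lookup and the entry formats together, a guest offset can be
resolved roughly as shown below. This is a sketch under stated assumptions,
not QEMU's implementation: ``image`` is the whole image file as a bytes-like
object, the image is version 3 without Extended L2 Entries, and the function
and constant names are invented for the example.

.. code:: python

    OFFSET_MASK = 0x00FFFFFFFFFFFE00    # bits 9-55 of an L1 or L2 entry
    L2E_COMPRESSED = 1 << 62
    L2E_COPIED = 1 << 63                # "refcount is exactly one" flag
    L2E_ZERO = 1 << 0


    def map_offset(image, guest_offset, cluster_bits, l1_table_offset, l1_size):
        """Return (kind, value) describing where guest_offset is stored."""
        cluster_size = 1 << cluster_bits
        l2_entries = cluster_size // 8
        cluster_index = guest_offset >> cluster_bits
        l2_index = cluster_index % l2_entries
        l1_index = cluster_index // l2_entries
        if l1_index >= l1_size:
            raise IndexError("guest offset is beyond the L1 table")

        pos = l1_table_offset + 8 * l1_index
        l2_offset = int.from_bytes(image[pos:pos + 8], "big") & OFFSET_MASK
        if l2_offset == 0:
            return ("unallocated", None)    # whole L2 table is unallocated

        pos = l2_offset + 8 * l2_index
        l2_entry = int.from_bytes(image[pos:pos + 8], "big")
        if l2_entry & L2E_COMPRESSED:
            # Bits 0-61 form a Compressed Clusters Descriptor.
            return ("compressed", l2_entry & ((1 << 62) - 1))
        if l2_entry & L2E_ZERO:
            return ("zero", None)
        host_cluster = l2_entry & OFFSET_MASK
        if host_cluster == 0 and not l2_entry & L2E_COPIED:
            return ("unallocated", None)
        return ("data", host_cluster + guest_offset % cluster_size)

For ``"unallocated"`` results the caller falls back to the backing file or to
zeros. With an external data file, the returned ``"data"`` offset refers to
that file rather than to the qcow2 file.
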

Compressed Clusters Descriptor ``(x = 62 - (cluster_bits - 8))``::

    Bit  0 - x-1:   Host cluster offset. This is usually _not_ aligned to a
                    cluster or sector boundary!  If cluster_bits is
                    small enough that this field includes bits beyond
                    55, those upper bits must be set to 0.

         x - 61:    Number of additional 512-byte sectors used for the
                    compressed data, beyond the sector containing the offset
                    in the previous field. Some of these sectors may reside
                    in the next contiguous host cluster.

                    Note that the compressed data does not necessarily occupy
                    all of the bytes in the final sector; rather, decompression
                    stops when it has produced a cluster of data.

                    Another compressed cluster may map to the tail of the final
                    sector used by this compressed cluster.

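The split point ``x`` depends on ``cluster_bits``, so decoding a compressed
descriptor takes a little bit arithmetic. The following sketch uses a
hypothetical function name. It also returns an upper bound on the number of
bytes to read, derived from the sector count described above.

.. code:: python

    def decode_compressed_descriptor(l2_entry, cluster_bits):
        """Return (host_offset, additional_sectors, max_bytes)."""
        x = 62 - (cluster_bits - 8)
        descriptor = l2_entry & ((1 << 62) - 1)    # drop flag bits 62 and 63
        host_offset = descriptor & ((1 << x) - 1)
        additional_sectors = descriptor >> x
        # The data starts inside one 512-byte sector and may extend to the
        # end of the last additional sector.
        sector_start = host_offset & ~511
        end = sector_start + (additional_sectors + 1) * 512
        return host_offset, additional_sectors, end - host_offset


    # 64 KB clusters (cluster_bits = 16) give x = 54.
    entry = (1 << 62) | (3 << 54) | 0x50123
    assert decode_compressed_descriptor(entry, 16) == (0x50123, 3, 1757)

The actual compressed payload may be shorter than ``max_bytes``; as noted
above, decompression stops once a full cluster has been produced.
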

If a cluster is unallocated, read requests shall read the data from the backing
file (except if bit 0 in the Standard Cluster Descriptor is set). If there is
no backing file or the backing file is smaller than the image, they shall read
zeros for all parts that are not covered by the backing file.

Extended L2 Entries
-------------------

An image uses Extended L2 Entries if bit 4 is set on the incompatible_features
field of the header.

In these images standard data clusters are divided into 32 subclusters of the
same size. They are contiguous and start from the beginning of the cluster.
Subclusters can be allocated independently and the L2 entry contains information
indicating the status of each one of them. Compressed data clusters don't have
subclusters so they are treated the same as in images without this feature.

The size of an extended L2 entry is 128 bits so the number of entries per table
is calculated using this formula:

.. code::

    l2_entries = (cluster_size / (2 * sizeof(uint64_t)))

The first 64 bits have the same format as the standard L2 table entry described
in the previous section, with the exception of bit 0 of the standard cluster
descriptor.

The last 64 bits contain a subcluster allocation bitmap with this format:

Subcluster Allocation Bitmap (for standard clusters)::

    Bit  0 - 31:    Allocation status (one bit per subcluster)

                    1: the subcluster is allocated. In this case the
                       host cluster offset field must contain a valid
                       offset.
                    0: the subcluster is not allocated. In this case
                       read requests shall go to the backing file or
                       return zeros if there is no backing file data.

                    Bits are assigned starting from the least significant
                    one (i.e. bit x is used for subcluster x).

        32 - 63:    Subcluster reads as zeros (one bit per subcluster)

                    1: the subcluster reads as zeros. In this case the
                       allocation status bit must be unset. The host
                       cluster offset field may or may not be set.
                    0: no effect.

                    Bits are assigned starting from the least significant
                    one (i.e. bit x is used for subcluster x - 32).

Subcluster Allocation Bitmap (for compressed clusters)::

    Bit  0 - 63:    Reserved (set to 0)
                    Compressed clusters don't have subclusters,
                    so this field is not used.

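The two halves of the bitmap can be combined into a per-subcluster status as
sketched below. The helper names are hypothetical; ``bitmap`` is the second
64-bit word of an extended L2 entry for a standard cluster.

.. code:: python

    SUBCLUSTERS = 32


    def subcluster_index(guest_offset, cluster_bits):
        """Index (0-31) of the subcluster that contains guest_offset."""
        cluster_size = 1 << cluster_bits
        return (guest_offset % cluster_size) // (cluster_size // SUBCLUSTERS)


    def subcluster_status(bitmap, index):
        """Classify one subcluster from the allocation bitmap."""
        if not 0 <= index < SUBCLUSTERS:
            raise IndexError("subcluster index out of range")
        allocated = (bitmap >> index) & 1
        reads_zero = (bitmap >> (SUBCLUSTERS + index)) & 1
        if allocated and reads_zero:
            raise ValueError("allocation bit must be unset for a zero subcluster")
        if reads_zero:
            return "zero"
        return "allocated" if allocated else "unallocated"

For example, with 64 KB clusters each subcluster covers 2 KB, so a guest
offset 10 KB into a cluster falls into subcluster 5.
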
679Snapshots
680---------
681
682qcow2 supports internal snapshots. Their basic principle of operation is to
683switch the active L1 table, so that a different set of host clusters are
684exposed to the guest.
685
686When creating a snapshot, the L1 table should be copied and the refcount of all
687L2 tables and clusters reachable from this L1 table must be increased, so that
688a write causes a COW and isn't visible in other snapshots.
689
690When loading a snapshot, bit 63 of all entries in the new active L1 table and
691all L2 tables referenced by it must be reconstructed from the refcount table
692as it doesn't need to be accurate in inactive L1 tables.
693
694A directory of all snapshots is stored in the snapshot table, a contiguous area
695in the image file, whose starting offset and length are given by the header
696fields snapshots_offset and nb_snapshots. The entries of the snapshot table
697have variable length, depending on the length of ID, name and extra data.
698
Snapshot table entry::

    Byte 0 -  7:    Offset into the image file at which the L1 table for the
                    snapshot starts. Must be aligned to a cluster boundary.

         8 - 11:    Number of entries in the L1 table of the snapshot

        12 - 13:    Length of the unique ID string describing the snapshot

        14 - 15:    Length of the name of the snapshot

        16 - 19:    Time at which the snapshot was taken in seconds since the
                    Epoch

        20 - 23:    Subsecond part of the time at which the snapshot was taken
                    in nanoseconds

        24 - 31:    Time that the guest was running until the snapshot was
                    taken in nanoseconds

        32 - 35:    Size of the VM state in bytes. 0 if no VM state is saved.
                    If there is VM state, it starts at the first cluster
                    described by the first L1 table entry that doesn't
                    describe a regular guest cluster (i.e. VM state is stored
                    like guest disk content, except that it is stored at
                    offsets that are larger than the virtual disk presented
                    to the guest)

        36 - 39:    Size of extra data in the table entry (used for future
                    extensions of the format)

        variable:   Extra data for future extensions. Unknown fields must be
                    ignored. Currently defined are (offset relative to snapshot
                    table entry):

                    Byte 40 - 47:   Size of the VM state in bytes. 0 if no VM
                                    state is saved. If this field is present,
                                    the 32-bit value in bytes 32-35 is ignored.

                    Byte 48 - 55:   Virtual disk size of the snapshot in bytes

                    Byte 56 - 63:   icount value which corresponds to
                                    the record/replay instruction count
                                    when the snapshot was taken. Set to -1
                                    if icount was disabled

                    Version 3 images must include extra data at least up to
                    byte 55.

        variable:   Unique ID string for the snapshot (not null terminated)

        variable:   Name of the snapshot (not null terminated)

        variable:   Padding to round up the snapshot table entry size to the
                    next multiple of 8.

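The layout above can be read with a few fixed-width unpack operations. The
following is a minimal sketch rather than a reference implementation: the
function name ``parse_snapshot_entry`` and the dictionary keys are made up for
illustration, no validation is performed, and the icount field is read as a
signed 64-bit value so that "-1" is returned as such (an assumption about how
a reader may want to interpret it).

.. code:: python

    import struct

    # Fixed 40-byte part of a snapshot table entry (Big Endian, no padding).
    _SNAPSHOT_FIXED = struct.Struct(">QIHHIIQII")

    def parse_snapshot_entry(buf, offset=0):
        """Parse one snapshot table entry starting at `offset` in `buf`."""
        (l1_offset, l1_entries, id_len, name_len, date_sec, date_nsec,
         vm_clock_nsec, vm_state_size, extra_size) = \
            _SNAPSHOT_FIXED.unpack_from(buf, offset)
        pos = offset + _SNAPSHOT_FIXED.size

        extra = bytes(buf[pos:pos + extra_size])
        pos += extra_size
        # Bytes 40 - 47: the 64-bit VM state size overrides bytes 32 - 35.
        if extra_size >= 8:
            vm_state_size = struct.unpack_from(">Q", extra, 0)[0]
        # Bytes 48 - 55: virtual disk size of the snapshot.
        disk_size = struct.unpack_from(">Q", extra, 8)[0] if extra_size >= 16 else None
        # Bytes 56 - 63: icount, read as signed here (assumption).
        icount = struct.unpack_from(">q", extra, 16)[0] if extra_size >= 24 else None

        snapshot_id = bytes(buf[pos:pos + id_len])
        pos += id_len
        name = bytes(buf[pos:pos + name_len])
        pos += name_len

        # The entry size is rounded up to the next multiple of 8.
        entry_size = (pos - offset + 7) & ~7
        return {
            "l1_table_offset": l1_offset, "l1_entries": l1_entries,
            "date_sec": date_sec, "date_nsec": date_nsec,
            "vm_clock_nsec": vm_clock_nsec, "vm_state_size": vm_state_size,
            "disk_size": disk_size, "icount": icount,
            "id": snapshot_id, "name": name,
            "next_offset": offset + entry_size,
        }

The returned ``next_offset`` is where the following entry would start, so a
caller can walk the table by calling the function nb_snapshots times.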

Bitmaps
-------

As mentioned above, the bitmaps extension provides the ability to store bitmaps
related to a virtual disk. This section describes how these bitmaps are stored.

All stored bitmaps are related to the virtual disk stored in the same image, so
each bitmap size is equal to the virtual disk size.

Each bit of the bitmap is responsible for a strictly defined range of the
virtual disk. For bit number bit_nr the corresponding range (in bytes) will be:

.. code::

    [bit_nr * bitmap_granularity .. (bit_nr + 1) * bitmap_granularity - 1]

Granularity is a property of the concrete bitmap, see below.

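For illustration, the range formula can be written as a small helper. The
names ``bit_range`` and ``bitmap_bits`` are hypothetical; the bit count in
``bitmap_bits`` is an inference from the statement that a bitmap covers the
whole virtual disk, not a formula given by the format.

.. code:: python

    def bit_range(bit_nr, granularity_bits):
        """Return the inclusive byte range of the virtual disk covered by a bit."""
        granularity = 1 << granularity_bits
        first = bit_nr * granularity
        return first, first + granularity - 1

    def bitmap_bits(virtual_disk_size, granularity_bits):
        """Number of bits needed to cover the disk (rounded up; an inference)."""
        granularity = 1 << granularity_bits
        return (virtual_disk_size + granularity - 1) // granularity

Note that with this rounding the range of the last bit may extend past the end
of the virtual disk when the disk size is not a multiple of the granularity.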

Bitmap directory
----------------

Each bitmap saved in the image is described in a bitmap directory entry. The
bitmap directory is a contiguous area in the image file, whose starting offset
and length are given by the header extension fields ``bitmap_directory_offset`` and
``bitmap_directory_size``. The entries of the bitmap directory have variable
length, depending on the lengths of the bitmap name and extra data.

Structure of a bitmap directory entry::

    Byte 0 -  7:    bitmap_table_offset
                    Offset into the image file at which the bitmap table
                    (described below) for the bitmap starts. Must be aligned to
                    a cluster boundary.

         8 - 11:    bitmap_table_size
                    Number of entries in the bitmap table of the bitmap.

        12 - 15:    flags
                    Bit
                      0: in_use
                         The bitmap was not saved correctly and may be
                         inconsistent. Although the bitmap metadata is still
                         well-formed from a qcow2 perspective, the metadata
                         (such as the auto flag or bitmap size) or data
                         contents may be outdated.

                      1: auto
                         The bitmap must reflect all changes of the virtual
                         disk by any application that would write to this qcow2
                         file (including writes, snapshot switching, etc.). The
                         type of this bitmap must be 'dirty tracking bitmap'.

                      2: extra_data_compatible
                         This flag is meaningful when the extra data is
                         unknown to the software (currently any extra data is
                         unknown to QEMU).
                         If it is set, the bitmap may be used as expected;
                         extra data must be left as is.
                         If it is not set, the bitmap must not be used, but
                         both it and its extra data must be left as is.

                    Bits 3 - 31 are reserved and must be 0.

             16:    type
                    This field describes the kind of bitmap.
                    Values:
                      1: Dirty tracking bitmap

                    Values 0, 2 - 255 are reserved.

             17:    granularity_bits
                    Granularity bits. Valid values: 0 - 63.

                    Note: QEMU currently supports only values 9 - 31.

                    Granularity is calculated as
                        granularity = 1 << granularity_bits

                    A bitmap's granularity is how many bytes of the image
                    account for one bit of the bitmap.

        18 - 19:    name_size
                    Size of the bitmap name. Must be non-zero.

                    Note: QEMU currently doesn't support values greater than
                    1023.

        20 - 23:    extra_data_size
                    Size of type-specific extra data.

                    For now, as no extra data is defined, extra_data_size is
                    reserved and should be zero. If it is non-zero the
                    behavior is defined by the extra_data_compatible flag.

        variable:   extra_data
                    Extra data for the bitmap, occupying extra_data_size bytes.
                    Extra data must never contain references to clusters or in
                    some other way allocate additional clusters.

        variable:   name
                    The name of the bitmap (not null terminated), occupying
                    name_size bytes. Must be unique among all bitmap names
                    within the bitmaps extension.

        variable:   Padding to round up the bitmap directory entry size to the
                    next multiple of 8. All bytes of the padding must be zero.

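A directory entry can be decoded in the same way as a snapshot table entry.
The sketch below is illustrative only: ``parse_bitmap_directory_entry``, the
``FLAG_*`` constants and the dictionary keys are hypothetical names, and the
``usable`` value reflects one reading of the extra_data_compatible rule (an
entry without extra data is treated as usable regardless of the flag).

.. code:: python

    import struct

    # Fixed 24-byte part of a bitmap directory entry (Big Endian, no padding).
    _DIR_FIXED = struct.Struct(">QIIBBHI")

    FLAG_IN_USE = 1 << 0
    FLAG_AUTO = 1 << 1
    FLAG_EXTRA_DATA_COMPATIBLE = 1 << 2

    def parse_bitmap_directory_entry(buf, offset=0):
        """Parse one bitmap directory entry starting at `offset` in `buf`."""
        (table_offset, table_size, flags, bitmap_type, granularity_bits,
         name_size, extra_size) = _DIR_FIXED.unpack_from(buf, offset)
        pos = offset + _DIR_FIXED.size

        extra_data = bytes(buf[pos:pos + extra_size])
        pos += extra_size
        name = bytes(buf[pos:pos + name_size])
        pos += name_size

        # The entry size is rounded up to the next multiple of 8.
        entry_size = (pos - offset + 7) & ~7
        return {
            "bitmap_table_offset": table_offset,
            "bitmap_table_size": table_size,
            "in_use": bool(flags & FLAG_IN_USE),
            "auto": bool(flags & FLAG_AUTO),
            # Unknown extra data forbids use unless the flag allows it.
            "usable": extra_size == 0 or bool(flags & FLAG_EXTRA_DATA_COMPATIBLE),
            "type": bitmap_type,
            "granularity": 1 << granularity_bits,
            "extra_data": extra_data,
            "name": name,
            "entry_size": entry_size,
        }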

Bitmap table
------------

Each bitmap is stored using a one-level structure (as opposed to the two-level
structures used for refcounts and guest cluster mapping) for the mapping of
bitmap data to host clusters. This structure is called the bitmap table.

Each bitmap table has a variable size (stored in the bitmap directory entry)
and may use multiple clusters; however, it must be contiguous in the image
file.

Structure of a bitmap table entry::

    Bit       0:    Reserved and must be zero if bits 9 - 55 are non-zero.
                    If bits 9 - 55 are zero:
                      0: Cluster should be read as all zeros.
                      1: Cluster should be read as all ones.

         1 -  8:    Reserved and must be zero.

         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be aligned to
                    a cluster boundary. If the offset is 0, the cluster is
                    unallocated; in that case, bit 0 determines how this
                    cluster should be treated during reads.

        56 - 63:    Reserved and must be zero.

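The rules for a single entry can be summarised in code. This is a sketch
under the assumption that the entry has already been converted from Big
Endian to a host integer; the function name and the returned tuples are
hypothetical, and raising ``ValueError`` on malformed entries is a choice made
for the example.

.. code:: python

    OFFSET_MASK = (1 << 56) - (1 << 9)          # bits 9 - 55
    RESERVED_MASK = (0xFF << 56) | (0xFF << 1)  # bits 56 - 63 and 1 - 8

    def decode_bitmap_table_entry(entry, cluster_size):
        """Return ("allocated", offset), ("all_zeros", None) or ("all_ones", None)."""
        if entry & RESERVED_MASK:
            raise ValueError("reserved bits must be zero")
        offset = entry & OFFSET_MASK
        if offset:
            if entry & 1:
                raise ValueError("bit 0 must be zero when an offset is set")
            if offset % cluster_size:
                raise ValueError("offset is not aligned to a cluster boundary")
            return ("allocated", offset)
        # Unallocated cluster: bit 0 selects the implicit content.
        return ("all_ones", None) if entry & 1 else ("all_zeros", None)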

Bitmap data
-----------

As noted above, bitmap data is stored in separate clusters, described by the
bitmap table. Given an offset (in bytes) into the bitmap data, the offset into
the image file can be obtained as follows::

    image_offset(bitmap_data_offset) =
        bitmap_table[bitmap_data_offset / cluster_size] +
            (bitmap_data_offset % cluster_size)

This offset is not defined if bits 9 - 55 of the bitmap table entry are zero
(see above).

Given an offset byte_nr into the virtual disk and the bitmap's granularity, the
bit offset into the image file to the corresponding bit of the bitmap can be
calculated like this::

    bit_offset(byte_nr) =
        image_offset(byte_nr / granularity / 8) * 8 +
            (byte_nr / granularity) % 8

If the size of the bitmap data is not a multiple of the cluster size then the
last cluster of the bitmap data contains some unused tail bits. These bits must
be zero.

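The two formulas translate directly into integer arithmetic. In the sketch
below, ``bitmap_table`` is assumed to be a list holding the host cluster
offset of each entry (bits 9 - 55 already extracted), with 0 standing for an
unallocated cluster; raising ``LookupError`` for the undefined case is a
choice made for the example.

.. code:: python

    def image_offset(bitmap_table, bitmap_data_offset, cluster_size):
        """Map a byte offset in the bitmap data to an offset in the image file."""
        host_cluster = bitmap_table[bitmap_data_offset // cluster_size]
        if host_cluster == 0:
            # The offset is not defined for unallocated bitmap clusters.
            raise LookupError("bitmap cluster is unallocated")
        return host_cluster + bitmap_data_offset % cluster_size

    def bit_offset(bitmap_table, byte_nr, granularity, cluster_size):
        """Map a byte offset in the virtual disk to a bit offset in the image."""
        bit_nr = byte_nr // granularity
        return image_offset(bitmap_table, bit_nr // 8, cluster_size) * 8 + bit_nr % 8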

Dirty tracking bitmaps
----------------------

Bitmaps with ``type`` field equal to one are dirty tracking bitmaps.

When the virtual disk is in use, a dirty tracking bitmap may be ``enabled`` or
``disabled``. While the bitmap is ``enabled``, all writes to the virtual disk
should be reflected in the bitmap. A set bit in the bitmap means that the
corresponding range of the virtual disk (see above) was written to while the
bitmap was ``enabled``. An unset bit means that this range was not written to.

The software doesn't have to sync the bitmap in the image file with its
representation in RAM after each write or metadata change. The ``in_use`` flag
should be set while the bitmap is not synced.

In the image file the ``enabled`` state is reflected by the ``auto`` flag. If this
flag is set, the software must consider the bitmap as ``enabled`` and start
tracking virtual disk changes to this bitmap from the first write to the
virtual disk. If this flag is not set then the bitmap is disabled.

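A minimal in-RAM model of an ``enabled`` dirty tracking bitmap might look as
follows. The class ``DirtyBitmap`` and its methods are hypothetical and do not
describe QEMU's implementation; the sketch also assumes that bits within a
byte are numbered from the least significant one, and it leaves out writing
the bitmap back to the image and handling of the ``in_use`` flag.

.. code:: python

    class DirtyBitmap:
        """In-RAM dirty tracking bitmap (illustrative model only)."""

        def __init__(self, disk_size, granularity_bits, enabled=True):
            self.granularity_bits = granularity_bits
            granularity = 1 << granularity_bits
            nbits = (disk_size + granularity - 1) >> granularity_bits
            self.data = bytearray((nbits + 7) // 8)
            self.enabled = enabled

        def record_write(self, offset, length):
            """Set the bit of every range touched by a guest write."""
            if not self.enabled or length <= 0:
                return
            first = offset >> self.granularity_bits
            last = (offset + length - 1) >> self.granularity_bits
            for bit in range(first, last + 1):
                self.data[bit // 8] |= 1 << (bit % 8)  # LSB first (assumption)

        def is_dirty(self, offset):
            """Report whether the range containing `offset` was written to."""
            bit = offset >> self.granularity_bits
            return bool(self.data[bit // 8] & (1 << (bit % 8)))

A two-byte write that straddles a granularity boundary sets two bits, which is
why ``record_write`` computes both the first and the last affected bit.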