xref: /openbmc/qemu/docs/interop/qcow2.txt (revision 8cf108c5)
== General ==

A qcow2 image file is organized in units of constant size, which are called
(host) clusters. A cluster is the unit in which all allocations are done,
both for actual guest data and for image metadata.

Likewise, the virtual disk as seen by the guest is divided into (guest)
clusters of the same size.

All numbers in qcow2 are stored in Big Endian byte order.


== Header ==

The first cluster of a qcow2 image contains the file header:

    Byte  0 -  3:   magic
                    QCOW magic string ("QFI\xfb")

          4 -  7:   version
                    Version number (valid values are 2 and 3)

          8 - 15:   backing_file_offset
                    Offset into the image file at which the backing file name
                    is stored (NB: The string is not null terminated). 0 if the
                    image doesn't have a backing file.

         16 - 19:   backing_file_size
                    Length of the backing file name in bytes. Must not be
                    longer than 1023 bytes. Undefined if the image doesn't have
                    a backing file.

         20 - 23:   cluster_bits
                    Number of bits that are used for addressing an offset
                    within a cluster (1 << cluster_bits is the cluster size).
                    Must not be less than 9 (i.e. 512 byte clusters).

                    Note: qemu as of today has an implementation limit of 2 MB
                    as the maximum cluster size and won't be able to open images
                    with larger cluster sizes.

         24 - 31:   size
                    Virtual disk size in bytes.

                    Note: qemu has an implementation limit of 32 MB as
                    the maximum L1 table size.  With a 2 MB cluster
                    size, it is unable to populate a virtual cluster
                    beyond 2 EB (61 bits); with a 512 byte cluster
                    size, it is unable to populate a virtual size
                    larger than 128 GB (37 bits).  Meanwhile, L1/L2
                    table layouts limit an image to no more than 64 PB
                    (56 bits) of populated clusters, and an image may
                    hit other limits first (such as a file system's
                    maximum size).

         32 - 35:   crypt_method
                    0 for no encryption
                    1 for AES encryption
                    2 for LUKS encryption

         36 - 39:   l1_size
                    Number of entries in the active L1 table

         40 - 47:   l1_table_offset
                    Offset into the image file at which the active L1 table
                    starts. Must be aligned to a cluster boundary.

         48 - 55:   refcount_table_offset
                    Offset into the image file at which the refcount table
                    starts. Must be aligned to a cluster boundary.

         56 - 59:   refcount_table_clusters
                    Number of clusters that the refcount table occupies

         60 - 63:   nb_snapshots
                    Number of snapshots contained in the image

         64 - 71:   snapshots_offset
                    Offset into the image file at which the snapshot table
                    starts. Must be aligned to a cluster boundary.

If the version is 3 or higher, the header has the following additional fields.
For version 2, the values are assumed to be zero, unless specified otherwise
in the description of a field.

         72 -  79:  incompatible_features
                    Bitmask of incompatible features. An implementation must
                    fail to open an image if an unknown bit is set.

                    Bit 0:      Dirty bit.  If this bit is set then refcounts
                                may be inconsistent; make sure to scan L1/L2
                                tables to repair refcounts before accessing the
                                image.

                    Bit 1:      Corrupt bit.  If this bit is set then any data
                                structure may be corrupt and the image must not
                                be written to (unless for regaining
                                consistency).

                    Bit 2:      External data file bit.  If this bit is set, an
                                external data file is used. Guest clusters are
                                then stored in the external data file. For such
                                images, clusters in the external data file are
                                not refcounted. The offset field in the
                                Standard Cluster Descriptor must match the
                                guest offset and neither compressed clusters
                                nor internal snapshots are supported.

                                An External Data File Name header extension may
                                be present if this bit is set.

                    Bits 3-63:  Reserved (set to 0)

         80 -  87:  compatible_features
                    Bitmask of compatible features. An implementation can
                    safely ignore any unknown bits that are set.

                    Bit 0:      Lazy refcounts bit.  If this bit is set then
                                lazy refcount updates can be used.  This means
                                marking the image file dirty and postponing
                                refcount metadata updates.

                    Bits 1-63:  Reserved (set to 0)

         88 -  95:  autoclear_features
                    Bitmask of auto-clear features. An implementation may only
                    write to an image with unknown auto-clear features if it
                    clears the respective bits from this field first.

                    Bit 0:      Bitmaps extension bit
                                This bit indicates consistency for the bitmaps
                                extension data.

                                It is an error if this bit is set without the
                                bitmaps extension present.

                                If the bitmaps extension is present but this
                                bit is unset, the bitmaps extension data must be
                                considered inconsistent.

                    Bit 1:      If this bit is set, the external data file can
                                be read as a consistent standalone raw image
                                without looking at the qcow2 metadata.

                                Setting this bit has a performance impact for
                                some operations on the image (e.g. writing
                                zeros requires writing to the data file instead
                                of only setting the zero flag in the L2 table
                                entry) and conflicts with backing files.

                                This bit may only be set if the External Data
                                File bit (incompatible feature bit 2) is also
                                set.

                    Bits 2-63:  Reserved (set to 0)

         96 -  99:  refcount_order
                    Describes the width of a reference count block entry (width
                    in bits: refcount_bits = 1 << refcount_order). For version 2
                    images, the order is always assumed to be 4
                    (i.e. refcount_bits = 16).
                    This value may not exceed 6 (i.e. refcount_bits = 64).

        100 - 103:  header_length
                    Length of the header structure in bytes. For version 2
                    images, the length is always assumed to be 72 bytes.
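
The fixed-size header fields map directly onto a big-endian struct layout. The
following Python sketch is illustrative only and not part of the format
definition; the names parse_header, V2_HEADER, V2_NAMES, V3_EXTRA and V3_NAMES
are invented for this example. It decodes the fields listed above and fills in
the version 2 defaults (zero feature masks, refcount_order 4, header_length
72):

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"

# Layouts copied from the field list above; ">" selects big endian with no
# alignment padding. All names here are hypothetical, chosen for the example.
V2_HEADER = struct.Struct(">4sIQIIQIIQQIIQ")   # bytes 0 - 71
V2_NAMES = ("magic", "version", "backing_file_offset", "backing_file_size",
            "cluster_bits", "size", "crypt_method", "l1_size",
            "l1_table_offset", "refcount_table_offset",
            "refcount_table_clusters", "nb_snapshots", "snapshots_offset")
V3_EXTRA = struct.Struct(">QQQII")             # bytes 72 - 103
V3_NAMES = ("incompatible_features", "compatible_features",
            "autoclear_features", "refcount_order", "header_length")


def parse_header(buf):
    """Decode the fixed qcow2 header fields from the start of buf."""
    hdr = dict(zip(V2_NAMES, V2_HEADER.unpack_from(buf, 0)))
    if hdr["magic"] != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    if hdr["version"] not in (2, 3):
        raise ValueError("unsupported version %d" % hdr["version"])
    if hdr["version"] == 3:
        hdr.update(zip(V3_NAMES, V3_EXTRA.unpack_from(buf, V2_HEADER.size)))
    else:
        # Version 2: feature masks read as zero, refcounts are 16 bits wide
        # and the header is 72 bytes long.
        hdr.update(zip(V3_NAMES, (0, 0, 0, 4, 72)))
    if hdr["cluster_bits"] < 9:
        raise ValueError("cluster_bits must be at least 9")
    hdr["cluster_size"] = 1 << hdr["cluster_bits"]
    return hdr
```

The sketch only decodes fields. A real reader would additionally have to
refuse images with unknown incompatible feature bits and honour the
auto-clear rules before writing.
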

Directly after the image header, optional sections called header extensions can
be stored. Each extension has a structure like the following:

    Byte  0 -  3:   Header extension type:
                        0x00000000 - End of the header extension area
                        0xE2792ACA - Backing file format name string
                        0x6803f857 - Feature name table
                        0x23852875 - Bitmaps extension
                        0x0537be77 - Full disk encryption header pointer
                        0x44415441 - External data file name string
                        other      - Unknown header extension, can be safely
                                     ignored

          4 -  7:   Length of the header extension data

          8 -  n:   Header extension data

          n -  m:   Padding to round up the header extension size to the next
                    multiple of 8.

Unless stated otherwise, each header extension type shall appear at most once
in the same image.

If the image has a backing file then the backing file name should be stored in
the remaining space between the end of the header extension area and the end of
the first cluster. It is not allowed to store other data here, so that an
implementation can safely modify the header and add extensions without harming
data of compatible features that it doesn't support. Compatible features that
need space for additional data can use a header extension.
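
The extension layout above (type, length, data, padding to a multiple of 8)
can be walked with a simple loop. This is a minimal Python sketch under the
assumption that the first cluster has already been read into memory and that
the caller passes header_length as the starting position; the function name
iter_header_extensions is hypothetical.

```python
import struct


def iter_header_extensions(buf, start):
    """Yield (type, data) for each header extension found in buf.

    start is the offset of the first extension, i.e. header_length
    (72 for version 2 images). Hypothetical helper, for illustration only.
    """
    pos = start
    while pos + 8 <= len(buf):
        ext_type, length = struct.unpack_from(">II", buf, pos)
        if ext_type == 0:
            return                      # end of the header extension area
        data_start = pos + 8
        data = bytes(buf[data_start:data_start + length])
        if len(data) < length:
            raise ValueError("truncated header extension")
        yield ext_type, data
        # The data is padded so that the next extension starts at a
        # multiple of 8 bytes.
        pos = data_start + ((length + 7) & ~7)
```

Unknown extension types are simply yielded like any other; following the
rule above, a caller can ignore the ones it does not recognise.
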


== String header extensions ==

Some header extensions (such as the backing file format name and the external
data file name) are just a single string. In this case, the header extension
length is the string length and the string is not '\0' terminated. (The header
extension padding can make it look like a string is '\0' terminated, but
neither is padding always necessary nor is there a guarantee that zero bytes
are used for padding.)


== Feature name table ==

The feature name table is an optional header extension that contains the names
of features used by the image. It can be used by applications that don't know
the respective feature (e.g. because the feature was introduced only later) to
display a useful error message.

The number of entries in the feature name table is determined by the length of
the header extension data. Each entry looks like this:

    Byte       0:   Type of feature (select feature bitmap)
                        0: Incompatible feature
                        1: Compatible feature
                        2: Autoclear feature

               1:   Bit number within the selected feature bitmap (valid
                    values: 0-63)

          2 - 47:   Feature name (padded with zeros, but not necessarily null
                    terminated if it has full length)
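
Since every entry is 48 bytes long, the table can be decoded by slicing the
extension data. The Python sketch below is only an illustration; the names
parse_feature_name_table, FEATURE_TYPES and ENTRY_SIZE are invented here, and
decoding the name as UTF-8 is an assumption, because the text above does not
specify a character encoding.

```python
FEATURE_TYPES = {0: "incompatible", 1: "compatible", 2: "autoclear"}
ENTRY_SIZE = 48  # 1 byte type + 1 byte bit number + 46 byte name


def parse_feature_name_table(data):
    """Return (type, bit, name) tuples for each complete 48-byte entry."""
    entries = []
    for pos in range(0, len(data) - ENTRY_SIZE + 1, ENTRY_SIZE):
        ftype, bit = data[pos], data[pos + 1]
        raw_name = data[pos + 2:pos + ENTRY_SIZE]
        # The name is zero padded but may fill all 46 bytes without a
        # terminator, so cut at the first zero byte if there is one.
        # UTF-8 is an assumption; the format does not name an encoding.
        name = raw_name.split(b"\0", 1)[0].decode("utf-8", "replace")
        entries.append((FEATURE_TYPES.get(ftype, "unknown"), bit, name))
    return entries
```
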


== Bitmaps extension ==

The bitmaps extension is an optional header extension. It provides the ability
to store bitmaps related to a virtual disk. For now, there is only one bitmap
type: the dirty tracking bitmap, which tracks virtual disk changes from some
point in time.

The data of the extension should be considered consistent only if the
corresponding auto-clear feature bit is set, see autoclear_features above.

The fields of the bitmaps extension are:

    Byte  0 -  3:  nb_bitmaps
                   The number of bitmaps contained in the image. Must be
                   greater than or equal to 1.

                   Note: Qemu currently only supports up to 65535 bitmaps per
                   image.

          4 -  7:  Reserved, must be zero.

          8 - 15:  bitmap_directory_size
                   Size of the bitmap directory in bytes. It is the cumulative
                   size of all (nb_bitmaps) bitmap directory entries.

         16 - 23:  bitmap_directory_offset
                   Offset into the image file at which the bitmap directory
                   starts. Must be aligned to a cluster boundary.

== Full disk encryption header pointer ==

The full disk encryption header must be present if, and only if, the
'crypt_method' header requires metadata. Currently this is only true
of the 'LUKS' crypt method. The header extension must be absent for
other methods.

This header provides the offset at which the crypt method can store
its additional data, as well as the length of such data.

    Byte  0 -  7:   Offset into the image file at which the encryption
                    header starts in bytes. Must be aligned to a cluster
                    boundary.
    Byte  8 - 15:   Length of the written encryption header in bytes.
                    Note actual space allocated in the qcow2 file may
                    be larger than this value, since it will be rounded
                    up to a multiple of the cluster size. Any
                    unused bytes in the allocated space will be initialized
                    to 0.

For the LUKS crypt method, the encryption header works as follows.

The first 592 bytes of the header clusters will contain the LUKS
partition header. This is then followed by the key material data areas.
The size of the key material data areas is determined by the number of
stripes in the key slot and key size. Refer to the LUKS format
specification ('docs/on-disk-format.pdf' in the cryptsetup source
package) for details of the LUKS partition header format.

In the LUKS partition header, the "payload-offset" field will be
calculated as normal for the LUKS spec, i.e. the size of the LUKS
header, plus key material regions, plus padding, relative to the
start of the LUKS header. This offset value is not required to be
qcow2 cluster aligned. Its value is currently never used in the
context of qcow2, since the qcow2 file format itself defines where
the real payload offset is, but nonetheless a valid payload offset
should always be present.

In the LUKS key slots header, the "key-material-offset" is relative
to the start of the LUKS header clusters in the qcow2 container,
not the start of the qcow2 file.

Logically the layout looks like this:

  +-----------------------------+
  | QCow2 header                |
  | QCow2 header extension X    |
  | QCow2 header extension FDE  |
  | QCow2 header extension ...  |
  | QCow2 header extension Z    |
  +-----------------------------+
  | ....other QCow2 tables....  |
  .                             .
  .                             .
  +-----------------------------+
  | +-------------------------+ |
  | | LUKS partition header   | |
  | +-------------------------+ |
  | | LUKS key material 1     | |
  | +-------------------------+ |
  | | LUKS key material 2     | |
  | +-------------------------+ |
  | | LUKS key material ...   | |
  | +-------------------------+ |
  | | LUKS key material 8     | |
  | +-------------------------+ |
  +-----------------------------+
  | QCow2 cluster payload       |
  .                             .
  .                             .
  .                             .
  |                             |
  +-----------------------------+

== Data encryption ==

When an encryption method is requested in the header, the image payload
data must be encrypted/decrypted on every write/read. The image headers
and metadata are never encrypted.

The algorithms used for encryption vary depending on the method:

 - AES:

   The AES cipher, in CBC mode, with 256 bit keys.

   Initialization vectors are generated using the plain64 method, with
   the virtual disk sector as the input tweak.

   This format is no longer supported in QEMU system emulators, due
   to a number of design flaws affecting its security. It is only
   supported in the command line tools for the sake of backward
   compatibility and data liberation.

 - LUKS:

   The algorithms are specified in the LUKS header.

   Initialization vectors are generated using the method specified
   in the LUKS header, with the physical disk sector as the
   input tweak.

== Host cluster management ==

qcow2 manages the allocation of host clusters by maintaining a reference count
for each host cluster. A refcount of 0 means that the cluster is free, 1 means
that it is used, and >= 2 means that it is used and any write access must
perform a COW (copy on write) operation.

The refcounts are managed in a two-level table. The first level is called the
refcount table and has a variable size (which is stored in the header). The
refcount table can cover multiple clusters, however it needs to be contiguous
in the image file.

It contains pointers to the second level structures which are called refcount
blocks and are exactly one cluster in size.

Although a large enough refcount table can reserve clusters past 64 PB
(56 bits) (assuming the underlying protocol can even be sized that
large), note that some qcow2 metadata such as L1/L2 tables must point
to clusters prior to that point.

Note: qemu has an implementation limit of 8 MB as the maximum refcount
table size.  With a 2 MB cluster size and a default refcount_order of
4, it is unable to reference host resources beyond 2 EB (61 bits); in
the worst case, with a 512 byte cluster size and refcount_order of 6, it
is unable to access beyond 32 GB (35 bits).
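
The two figures in the note follow from multiplying the number of refcount
table entries by the number of refcounts per block and by the cluster size.
The Python sketch below reproduces that arithmetic; the function name
max_refcounted_bytes is invented for this example, and it assumes 8-byte
refcount table entries, as implied by the 64-bit entry layout.

```python
def max_refcounted_bytes(table_bytes, cluster_bits, refcount_order):
    """Bytes of host file that a refcount table of table_bytes can cover."""
    cluster_size = 1 << cluster_bits
    table_entries = table_bytes // 8                 # 64-bit table entries
    refcount_bits = 1 << refcount_order
    block_entries = cluster_size * 8 // refcount_bits
    return table_entries * block_entries * cluster_size


MB = 1 << 20
best = max_refcounted_bytes(8 * MB, 21, 4)    # 2 MB clusters, 16-bit refcounts
worst = max_refcounted_bytes(8 * MB, 9, 6)    # 512 byte clusters, 64-bit
```
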

Given an offset into the image file, the refcount of its cluster can be
obtained as follows:

    refcount_block_entries = (cluster_size * 8 / refcount_bits)

    refcount_block_index = (offset / cluster_size) % refcount_block_entries
    refcount_table_index = (offset / cluster_size) / refcount_block_entries

    refcount_block = load_cluster(refcount_table[refcount_table_index]);
    return refcount_block[refcount_block_index];

Refcount table entry:

    Bit  0 -  8:    Reserved (set to 0)

         9 - 63:    Bits 9-63 of the offset into the image file at which the
                    refcount block starts. Must be aligned to a cluster
                    boundary.

                    If this is 0, the corresponding refcount block has not yet
                    been allocated. All refcounts managed by this refcount block
                    are 0.

Refcount block entry (x = refcount_bits - 1):

    Bit  0 -  x:    Reference count of the cluster. If refcount_bits implies a
                    sub-byte width, note that bit 0 means the least significant
                    bit in this context.
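
Putting the lookup and the two entry layouts together, a refcount can be read
from an in-memory copy of the image as sketched below in Python. This is an
illustration under assumptions, not a reference implementation: the function
name get_refcount and its parameters are hypothetical, bounds checking against
the refcount table size is left out, and the handling of sub-byte widths
(first entry in the least significant bits of each byte) is one reading of the
"bit 0 means the least significant bit" remark.

```python
import struct


def get_refcount(image, offset, cluster_bits, refcount_order,
                 refcount_table_offset):
    """Return the refcount of the host cluster containing offset."""
    cluster_size = 1 << cluster_bits
    refcount_bits = 1 << refcount_order
    block_entries = cluster_size * 8 // refcount_bits

    cluster_index = offset >> cluster_bits
    block_index = cluster_index % block_entries
    table_index = cluster_index // block_entries

    (entry,) = struct.unpack_from(">Q", image,
                                  refcount_table_offset + 8 * table_index)
    block_offset = entry & ~0x1FF        # bits 9-63; bits 0-8 are reserved
    if block_offset == 0:
        return 0                         # refcount block not allocated

    if refcount_bits >= 8:
        width = refcount_bits // 8
        pos = block_offset + block_index * width
        return int.from_bytes(image[pos:pos + width], "big")

    # Sub-byte widths (1, 2 or 4 bits). Assumption: entry 0 of each byte
    # sits in the least significant bits.
    per_byte = 8 // refcount_bits
    byte = image[block_offset + block_index // per_byte]
    shift = (block_index % per_byte) * refcount_bits
    return (byte >> shift) & ((1 << refcount_bits) - 1)
```
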


== Cluster mapping ==

Just as for refcounts, qcow2 uses a two-level structure for the mapping of
guest clusters to host clusters. These are called the L1 and L2 tables.

The L1 table has a variable size (stored in the header) and may use multiple
clusters, however it must be contiguous in the image file. L2 tables are
exactly one cluster in size.

The L1 and L2 tables have implications on the maximum virtual file
size; for a given L1 table size, a larger cluster size is required for
the guest to have access to more space.  Furthermore, a virtual
cluster must currently map to a host offset below 64 PB (56 bits)
(although this limit could be relaxed by putting reserved bits into
use).  Additionally, as cluster size increases, the maximum host
offset for a compressed cluster is reduced (a 2M cluster size requires
compressed clusters to reside below 512 TB (49 bits), and this limit
cannot be relaxed without an incompatible layout change).
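
These limits can be checked with a few lines of arithmetic. The Python sketch
below is illustrative only; the function names are invented, and the 32 MB
figure is the qemu L1 table limit quoted in the description of the size header
field. The number of offset bits for compressed clusters is the value x from
the Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)).

```python
def max_virtual_bytes(l1_table_bytes, cluster_bits):
    """Guest bytes addressable through an L1 table of the given size."""
    cluster_size = 1 << cluster_bits
    l1_entries = l1_table_bytes // 8      # 64-bit L1 entries
    l2_entries = cluster_size // 8        # 64-bit L2 entries per cluster
    return l1_entries * l2_entries * cluster_size


def compressed_offset_bits(cluster_bits):
    """Width of the host offset field in a compressed cluster descriptor."""
    return 62 - (cluster_bits - 8)


MB = 1 << 20
large = max_virtual_bytes(32 * MB, 21)    # 2 MB clusters
small = max_virtual_bytes(32 * MB, 9)     # 512 byte clusters
```

With 2 MB clusters, compressed_offset_bits(21) evaluates to 49, which is
where the 512 TB bound for compressed clusters comes from.
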

Given an offset into the virtual disk, the offset into the image file can be
obtained as follows:

    l2_entries = (cluster_size / sizeof(uint64_t))

    l2_index = (offset / cluster_size) % l2_entries
    l1_index = (offset / cluster_size) / l2_entries

    l2_table = load_cluster(l1_table[l1_index]);
    cluster_offset = l2_table[l2_index];

    return cluster_offset + (offset % cluster_size)

L1 table entry:

    Bit  0 -  8:    Reserved (set to 0)

         9 - 55:    Bits 9-55 of the offset into the image file at which the L2
                    table starts. Must be aligned to a cluster boundary. If the
                    offset is 0, the L2 table and all clusters described by this
                    L2 table are unallocated.

        56 - 62:    Reserved (set to 0)

             63:    0 for an L2 table that is unused or requires COW, 1 if its
                    refcount is exactly one. This information is only accurate
                    in the active L1 table.

L2 table entry:

    Bit  0 -  61:   Cluster descriptor

              62:   0 for standard clusters
                    1 for compressed clusters

              63:   0 for clusters that are unused, compressed or require COW.
                    1 for standard clusters whose refcount is exactly one.
                    This information is only accurate in L2 tables
                    that are reachable from the active L1 table.

                    With external data files, all guest clusters have an
                    implicit refcount of 1 (because of the fixed host = guest
                    mapping for guest cluster offsets), so this bit should be 1
                    for all allocated clusters.

Standard Cluster Descriptor:

    Bit       0:    If set to 1, the cluster reads as all zeros. The host
                    cluster offset can be used to describe a preallocation,
                    but it won't be used for reading data from this cluster,
                    nor is data read from the backing file if the cluster is
                    unallocated.

                    With version 2, this is always 0.

         1 -  8:    Reserved (set to 0)

         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
                    cluster boundary. If the offset is 0 and bit 63 is clear,
                    the cluster is unallocated. The offset may only be 0 with
                    bit 63 set (indicating a host cluster offset of 0) when an
                    external data file is used.

        56 - 61:    Reserved (set to 0)


Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)):

    Bit  0 - x-1:   Host cluster offset. This is usually _not_ aligned to a
                    cluster or sector boundary!  If cluster_bits is
                    small enough that this field includes bits beyond
                    55, those upper bits must be set to 0.

         x - 61:    Number of additional 512-byte sectors used for the
                    compressed data, beyond the sector containing the offset
                    in the previous field. Some of these sectors may reside
                    in the next contiguous host cluster.

                    Note that the compressed data does not necessarily occupy
                    all of the bytes in the final sector; rather, decompression
                    stops when it has produced a cluster of data.

                    Another compressed cluster may map to the tail of the final
                    sector used by this compressed cluster.

If a cluster is unallocated, read requests shall read the data from the backing
file (except if bit 0 in the Standard Cluster Descriptor is set). If there is
no backing file or the backing file is smaller than the image, they shall read
zeros for all parts that are not covered by the backing file.
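
The two descriptor layouts can be told apart by bit 62 of the L2 entry. The
Python sketch below decodes a raw 64-bit L2 entry into its parts. It is an
illustration only: the function name decode_l2_entry, the constant
OFFSET_MASK and the dictionary keys are invented for this example, and
reserved bits are not validated.

```python
OFFSET_MASK = 0x00FFFFFFFFFFFE00   # bits 9-55 of a standard descriptor


def decode_l2_entry(entry, cluster_bits):
    """Split a 64-bit L2 table entry into its documented fields."""
    info = {
        "compressed": bool((entry >> 62) & 1),
        "refcount_is_one": bool((entry >> 63) & 1),
    }
    if info["compressed"]:
        x = 62 - (cluster_bits - 8)
        info["host_offset"] = entry & ((1 << x) - 1)          # bits 0 - x-1
        info["additional_sectors"] = (entry >> x) & ((1 << (62 - x)) - 1)
    else:
        info["reads_as_zero"] = bool(entry & 1)
        info["host_offset"] = entry & OFFSET_MASK
        # Offset 0 with bit 63 clear means the cluster is unallocated.
        info["allocated"] = (info["host_offset"] != 0
                             or info["refcount_is_one"])
    return info
```

For an allocated standard cluster that does not read as zeros, the host
offset plus (offset % cluster_size) gives the position of the guest data, as
in the lookup procedure earlier in this section.
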


== Snapshots ==

qcow2 supports internal snapshots. Their basic principle of operation is to
switch the active L1 table, so that a different set of host clusters is
exposed to the guest.

When creating a snapshot, the L1 table should be copied and the refcount of all
L2 tables and clusters reachable from this L1 table must be increased, so that
a write causes a COW and isn't visible in other snapshots.

When loading a snapshot, bit 63 of all entries in the new active L1 table and
all L2 tables referenced by it must be reconstructed from the refcount table
as it doesn't need to be accurate in inactive L1 tables.

A directory of all snapshots is stored in the snapshot table, a contiguous area
in the image file, whose starting offset and length are given by the header
fields snapshots_offset and nb_snapshots. The entries of the snapshot table
have variable length, depending on the length of ID, name and extra data.

Snapshot table entry:

    Byte 0 -  7:    Offset into the image file at which the L1 table for the
                    snapshot starts. Must be aligned to a cluster boundary.

         8 - 11:    Number of entries in the L1 table of the snapshot

        12 - 13:    Length of the unique ID string describing the snapshot

        14 - 15:    Length of the name of the snapshot

        16 - 19:    Time at which the snapshot was taken in seconds since the
                    Epoch

        20 - 23:    Subsecond part of the time at which the snapshot was taken
                    in nanoseconds

        24 - 31:    Time that the guest was running until the snapshot was
                    taken in nanoseconds

        32 - 35:    Size of the VM state in bytes. 0 if no VM state is saved.
                    If there is VM state, it starts at the first cluster
                    described by the first L1 table entry that doesn't describe
                    a regular guest cluster (i.e. VM state is stored like guest
                    disk content, except that it is stored at offsets that are
                    larger than the virtual disk presented to the guest)

        36 - 39:    Size of extra data in the table entry (used for future
                    extensions of the format)

        variable:   Extra data for future extensions. Unknown fields must be
                    ignored. Currently defined are (offset relative to snapshot
                    table entry):

                    Byte 40 - 47:   Size of the VM state in bytes. 0 if no VM
                                    state is saved. If this field is present,
                                    the 32-bit value in bytes 32-35 is ignored.

                    Byte 48 - 55:   Virtual disk size of the snapshot in bytes

                    Version 3 images must include extra data at least up to
                    byte 55.

        variable:   Unique ID string for the snapshot (not null terminated)

        variable:   Name of the snapshot (not null terminated)

        variable:   Padding to round up the snapshot table entry size to the
                    next multiple of 8.
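
Because the entries have variable length, the snapshot table has to be read
sequentially. The Python sketch below decodes one entry and returns the
position of the next one. It is an illustration under assumptions: the names
parse_snapshot_entry and SNAPSHOT_FIXED and the dictionary keys are invented
here, and the ID and name are returned as raw bytes because no character
encoding is specified above.

```python
import struct

# Fixed part of a snapshot table entry, bytes 0 - 39 (hypothetical name).
SNAPSHOT_FIXED = struct.Struct(">QIHHIIQII")


def parse_snapshot_entry(buf, pos=0):
    """Decode one snapshot table entry; return (entry, next_pos)."""
    (l1_offset, l1_size, id_len, name_len, date_sec, date_nsec,
     vm_clock_nsec, vm_state_size, extra_size) = \
        SNAPSHOT_FIXED.unpack_from(buf, pos)
    extra_start = pos + SNAPSHOT_FIXED.size
    disk_size = None
    if extra_size >= 8:
        # The 64-bit VM state size replaces the 32-bit field when present.
        (vm_state_size,) = struct.unpack_from(">Q", buf, extra_start)
    if extra_size >= 16:
        (disk_size,) = struct.unpack_from(">Q", buf, extra_start + 8)
    id_start = extra_start + extra_size       # unknown extra data is skipped
    name_start = id_start + id_len
    end = name_start + name_len
    entry = {
        "l1_table_offset": l1_offset,
        "l1_size": l1_size,
        "id": bytes(buf[id_start:name_start]),
        "name": bytes(buf[name_start:end]),
        "date_sec": date_sec,
        "date_nsec": date_nsec,
        "vm_clock_nsec": vm_clock_nsec,
        "vm_state_size": vm_state_size,
        "disk_size": disk_size,
    }
    # The entry size is rounded up to the next multiple of 8.
    next_pos = pos + ((end - pos + 7) & ~7)
    return entry, next_pos
```

Calling the function nb_snapshots times, feeding each returned next_pos back
in as pos, walks the whole table.
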


== Bitmaps ==

As mentioned above, the bitmaps extension provides the ability to store bitmaps
related to a virtual disk. This section describes how these bitmaps are stored.

All stored bitmaps are related to the virtual disk stored in the same image, so
each bitmap size is equal to the virtual disk size.

Each bit of the bitmap is responsible for a strictly defined range of the
virtual disk. For bit number bit_nr the corresponding range (in bytes) will be:

    [bit_nr * bitmap_granularity .. (bit_nr + 1) * bitmap_granularity - 1]

Granularity is a property of the concrete bitmap, see below.


=== Bitmap directory ===

Each bitmap saved in the image is described in a bitmap directory entry. The
bitmap directory is a contiguous area in the image file, whose starting offset
and length are given by the header extension fields bitmap_directory_offset and
bitmap_directory_size. The entries of the bitmap directory have variable
length, depending on the lengths of the bitmap name and extra data.

Structure of a bitmap directory entry:

    Byte 0 -  7:    bitmap_table_offset
                    Offset into the image file at which the bitmap table
                    (described below) for the bitmap starts. Must be aligned to
                    a cluster boundary.

         8 - 11:    bitmap_table_size
                    Number of entries in the bitmap table of the bitmap.

        12 - 15:    flags
                    Bit
                      0: in_use
                         The bitmap was not saved correctly and may be
                         inconsistent. Although the bitmap metadata is still
                         well-formed from a qcow2 perspective, the metadata
                         (such as the auto flag or bitmap size) or data
                         contents may be outdated.

                      1: auto
                         The bitmap must reflect all changes of the virtual
                         disk by any application that would write to this qcow2
                         file (including writes, snapshot switching, etc.). The
                         type of this bitmap must be 'dirty tracking bitmap'.

                      2: extra_data_compatible
                         This flag is meaningful when the extra data is
                         unknown to the software (currently any extra data is
                         unknown to Qemu).
                         If it is set, the bitmap may be used as expected, extra
                         data must be left as is.
                         If it is not set, the bitmap must not be used, but
                         both it and its extra data must be left as is.

                    Bits 3 - 31 are reserved and must be 0.

             16:    type
                    This field describes the sort of the bitmap.
                    Values:
                      1: Dirty tracking bitmap

                    Values 0, 2 - 255 are reserved.

             17:    granularity_bits
                    Granularity bits. Valid values: 0 - 63.

                    Note: Qemu currently supports only values 9 - 31.

                    Granularity is calculated as
                        granularity = 1 << granularity_bits

                    A bitmap's granularity is how many bytes of the image
                    account for one bit of the bitmap.

        18 - 19:    name_size
                    Size of the bitmap name. Must be non-zero.

                    Note: Qemu currently doesn't support values greater than
                    1023.

        20 - 23:    extra_data_size
                    Size of type-specific extra data.

                    For now, as no extra data is defined, extra_data_size is
                    reserved and should be zero. If it is non-zero the
                    behavior is defined by extra_data_compatible flag.

        variable:   extra_data
                    Extra data for the bitmap, occupying extra_data_size bytes.
                    Extra data must never contain references to clusters or in
                    some other way allocate additional clusters.

        variable:   name
                    The name of the bitmap (not null terminated), occupying
                    name_size bytes. Must be unique among all bitmap names
                    within the bitmaps extension.

        variable:   Padding to round up the bitmap directory entry size to the
                    next multiple of 8. All bytes of the padding must be zero.
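
A bitmap directory entry can be decoded in the same way as the other
variable-length records in this format. The Python sketch below is
illustrative only: parse_bitmap_dir_entry, BITMAP_DIR_FIXED and the dictionary
keys are invented names, the name is returned as raw bytes because no encoding
is specified, and neither reserved flag bits nor the zero padding are checked.

```python
import struct

# Fixed part of a bitmap directory entry, bytes 0 - 23 (hypothetical name).
BITMAP_DIR_FIXED = struct.Struct(">QIIBBHI")


def parse_bitmap_dir_entry(buf, pos=0):
    """Decode one bitmap directory entry; return (entry, next_pos)."""
    (table_offset, table_size, flags, bitmap_type, granularity_bits,
     name_size, extra_data_size) = BITMAP_DIR_FIXED.unpack_from(buf, pos)
    extra_start = pos + BITMAP_DIR_FIXED.size
    name_start = extra_start + extra_data_size   # extra data precedes name
    end = name_start + name_size
    entry = {
        "bitmap_table_offset": table_offset,
        "bitmap_table_size": table_size,
        "in_use": bool(flags & 1),
        "auto": bool(flags & 2),
        "extra_data_compatible": bool(flags & 4),
        "type": bitmap_type,
        "granularity": 1 << granularity_bits,
        "extra_data": bytes(buf[extra_start:name_start]),
        "name": bytes(buf[name_start:end]),
    }
    # The entry size is rounded up to the next multiple of 8.
    next_pos = pos + ((end - pos + 7) & ~7)
    return entry, next_pos
```
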


=== Bitmap table ===

Each bitmap is stored using a one-level structure (as opposed to two-level
structures like for refcounts and guest cluster mapping) for the mapping of
bitmap data to host clusters. This structure is called the bitmap table.

Each bitmap table has a variable size (stored in the bitmap directory entry)
and may use multiple clusters, however, it must be contiguous in the image
file.

Structure of a bitmap table entry:

    Bit       0:    Reserved and must be zero if bits 9 - 55 are non-zero.
                    If bits 9 - 55 are zero:
                      0: Cluster should be read as all zeros.
                      1: Cluster should be read as all ones.

         1 -  8:    Reserved and must be zero.

         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be aligned to
                    a cluster boundary. If the offset is 0, the cluster is
                    unallocated; in that case, bit 0 determines how this
                    cluster should be treated during reads.

        56 - 63:    Reserved and must be zero.


=== Bitmap data ===

As noted above, bitmap data is stored in separate clusters, described by the
bitmap table. Given an offset (in bytes) into the bitmap data, the offset into
the image file can be obtained as follows:

    image_offset(bitmap_data_offset) =
        bitmap_table[bitmap_data_offset / cluster_size] +
            (bitmap_data_offset % cluster_size)

This offset is not defined if bits 9 - 55 of the bitmap table entry are zero
(see above).

Given an offset byte_nr into the virtual disk and the bitmap's granularity, the
bit offset into the image file to the corresponding bit of the bitmap can be
calculated like this:

    bit_offset(byte_nr) =
        image_offset(byte_nr / granularity / 8) * 8 +
            (byte_nr / granularity) % 8

If the size of the bitmap data is not a multiple of the cluster size then the
last cluster of the bitmap data contains some unused tail bits. These bits must
be zero.
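
The two formulas above, together with the bitmap table entry layout, can be
combined into a single lookup. The Python sketch below is illustrative only:
the function name bitmap_bit_location and its return shape are invented, and
bitmap_table is assumed to be a list of already decoded 64-bit entries. It
returns where the bit lives rather than reading it, and the implicit value
for unallocated clusters is taken from bit 0 of the table entry.

```python
HOST_OFFSET_MASK = 0x00FFFFFFFFFFFE00   # bits 9 - 55 of a bitmap table entry


def bitmap_bit_location(byte_nr, granularity_bits, cluster_bits,
                        bitmap_table):
    """Locate the bitmap bit that covers guest offset byte_nr.

    Returns (image_byte_offset, bit_in_byte, implicit_value). For an
    unallocated bitmap cluster the first two items are None and
    implicit_value is True (all ones) or False (all zeros).
    """
    cluster_size = 1 << cluster_bits
    bit_nr = byte_nr >> granularity_bits       # byte_nr / granularity
    data_offset = bit_nr // 8                  # byte offset into bitmap data
    entry = bitmap_table[data_offset // cluster_size]
    host_offset = entry & HOST_OFFSET_MASK
    if host_offset == 0:
        return None, None, bool(entry & 1)
    return host_offset + (data_offset % cluster_size), bit_nr % 8, None
```
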


=== Dirty tracking bitmaps ===

Bitmaps with 'type' field equal to one are dirty tracking bitmaps.

When the virtual disk is in use, a dirty tracking bitmap may be 'enabled' or
'disabled'. While the bitmap is 'enabled', all writes to the virtual disk
should be reflected in the bitmap. A set bit in the bitmap means that the
corresponding range of the virtual disk (see above) was written to while the
bitmap was 'enabled'. An unset bit means that this range was not written to.

The software doesn't have to sync the bitmap in the image file with its
representation in RAM after each write or metadata change. Flag 'in_use'
should be set while the bitmap is not synced.

In the image file the 'enabled' state is reflected by the 'auto' flag. If this
flag is set, the software must consider the bitmap as 'enabled' and start
tracking virtual disk changes to this bitmap from the first write to the
virtual disk. If this flag is not set then the bitmap is disabled.