Block driver correctness testing with ``blkverify``
===================================================

Introduction
------------

This document describes how to use the ``blkverify`` protocol to test that a
block driver is operating correctly.

It is difficult to test and debug block drivers against real guests. Often
processes inside the guest will crash because corrupt sectors were read as
part of the executable. Other times obscure errors are raised by a program
inside the guest. These issues are extremely hard to trace back to bugs in
the block driver.

``blkverify`` solves this problem by catching data corruption inside QEMU the
first time bad data is read and reporting the disk sector that is corrupted.

How it works
------------

The ``blkverify`` protocol has two child block devices, the "test" device and
the "raw" device. Read/write operations are mirrored to both devices so their
state should always be in sync.

The "raw" device is a raw image, a flat file, that has identical starting
contents to the "test" image. The idea is that the "raw" device will handle
read/write operations correctly and not corrupt data. It can be used as a
reference for comparison against the "test" device.

After a mirrored read operation completes, ``blkverify`` will compare the
data and raise an error if it is not identical. This makes it possible to
catch the first instance where corrupt data is read.

Example
-------

Imagine raw.img has 0xcd repeated throughout its first sector::

    $ ./qemu-io -c 'read -v 0 512' raw.img
    00000000: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................
    00000010: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................
    [...]
    000001e0: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................
    000001f0: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................
    read 512/512 bytes at offset 0
    512.000000 bytes, 1 ops; 0.0000 sec (97.656 MiB/sec and 200000.0000 ops/sec)

And test.img is corrupt: its first sector is zeroed when it shouldn't be::

    $ ./qemu-io -c 'read -v 0 512' test.img
    00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    [...]
    000001e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    000001f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    read 512/512 bytes at offset 0
    512.000000 bytes, 1 ops; 0.0000 sec (81.380 MiB/sec and 166666.6667 ops/sec)

This error is caught by ``blkverify``::

    $ ./qemu-io -c 'read 0 512' blkverify:raw.img:test.img
    blkverify: read sector_num=0 nb_sectors=4 contents mismatch in sector 0

A more realistic scenario is verifying the installation of a guest OS::

    $ ./qemu-img create raw.img 16G
    $ ./qemu-img create -f qcow2 test.qcow2 16G
    $ ./qemu-system-x86_64 -cdrom debian.iso \
          -drive file=blkverify:raw.img:test.qcow2

If the installation is aborted when ``blkverify`` detects corruption, use
``qemu-io`` to explore the contents of the disk image at the sector in
question.
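
The mirror-and-compare idea described under "How it works" can be illustrated
with a small standalone sketch. This is not QEMU code and does not reflect how
``blkverify`` is implemented internally. The ``MirrorVerify`` class, the
``first_mismatched_sector`` helper and the 512-byte ``SECTOR_SIZE`` are all
assumptions made for illustration, and the two "devices" are plain in-memory
file objects rather than block drivers:

.. code-block:: python

    import io

    # Assumption: 512-byte sectors, matching the 512-byte reads in the example.
    SECTOR_SIZE = 512


    def first_mismatched_sector(test_data, raw_data, sector_num=0):
        """Return the absolute number of the first differing sector, or None."""
        if len(test_data) != len(raw_data):
            raise ValueError("buffers must have the same length")
        for offset in range(0, len(raw_data), SECTOR_SIZE):
            end = offset + SECTOR_SIZE
            if test_data[offset:end] != raw_data[offset:end]:
                return sector_num + offset // SECTOR_SIZE
        return None


    class MirrorVerify:
        """Hypothetical stand-in for the mirroring scheme, using file objects."""

        def __init__(self, raw, test):
            self.raw = raw    # reference device, assumed to be correct
            self.test = test  # device under test

        def write(self, sector_num, data):
            # Mirror every write to both devices so they stay in sync.
            for dev in (self.raw, self.test):
                dev.seek(sector_num * SECTOR_SIZE)
                dev.write(data)

        def read(self, sector_num, nb_sectors):
            # Read the same range from both devices and compare the results.
            length = nb_sectors * SECTOR_SIZE
            self.raw.seek(sector_num * SECTOR_SIZE)
            raw_data = self.raw.read(length)
            self.test.seek(sector_num * SECTOR_SIZE)
            test_data = self.test.read(length)
            bad = first_mismatched_sector(test_data, raw_data, sector_num)
            if bad is not None:
                raise OSError(f"contents mismatch in sector {bad}")
            return test_data


    if __name__ == "__main__":
        # Reference image holds 0xcd; the test image has a zeroed first sector.
        raw = io.BytesIO(b"\xcd" * 1024)
        test = io.BytesIO(b"\x00" * 512 + b"\xcd" * 512)
        try:
            MirrorVerify(raw, test).read(0, 2)
        except OSError as err:
            print(err)

Because the comparison runs on every read, the error surfaces at the first
read that touches a bad sector, which is the property the example above relies
on.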