Block driver correctness testing with ``blkverify``
===================================================

Introduction
------------

This document describes how to use the ``blkverify`` protocol to test that a block
driver is operating correctly.

It is difficult to test and debug block drivers against real guests. Often
processes inside the guest will crash because corrupt sectors were read as part
of the executable. Other times obscure errors are raised by a program inside
the guest. These issues are extremely hard to trace back to bugs in the block
driver.

``blkverify`` solves this problem by catching data corruption inside QEMU the first
time bad data is read and reporting the disk sector that is corrupted.

How it works
------------

The ``blkverify`` protocol has two child block devices, the "test" device and the
"raw" device. Read/write operations are mirrored to both devices so their
state should always be in sync.

The "raw" device is a raw image, a flat file, that has identical starting
contents to the "test" image. The idea is that the "raw" device will handle
read/write operations correctly and not corrupt data. It can be used as a
reference for comparison against the "test" device.
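
The mirroring and reference comparison described above can be modeled with a
small in-memory sketch. This is a toy illustration only, not QEMU's
implementation: the names ``ToyVerify`` and ``MirrorMismatch`` are hypothetical,
and the two "devices" are plain byte buffers assumed to start out identical.

```python
SECTOR_SIZE = 512  # bytes per sector, matching the 512-byte reads in this document


class MirrorMismatch(Exception):
    """Raised when the two children return different data for the same read."""


class ToyVerify:
    """Toy in-memory model of a verifying mirror (hypothetical, not QEMU code)."""

    def __init__(self, raw: bytearray, test: bytearray):
        if len(raw) != len(test):
            raise ValueError("both children must have the same size")
        self.raw = raw    # trusted reference device
        self.test = test  # device under test

    def _check_range(self, offset: int, length: int) -> None:
        if offset < 0 or length < 0 or offset + length > len(self.raw):
            raise ValueError("access outside the device")

    def write(self, offset: int, data: bytes) -> None:
        # Mirror every write to both children so they stay in sync.
        self._check_range(offset, len(data))
        end = offset + len(data)
        self.raw[offset:end] = data
        self.test[offset:end] = data

    def read(self, offset: int, length: int) -> bytes:
        # Mirror the read, then compare the two results byte for byte.
        self._check_range(offset, length)
        expected = bytes(self.raw[offset:offset + length])
        actual = bytes(self.test[offset:offset + length])
        if expected != actual:
            first_bad = next(
                i for i, (a, b) in enumerate(zip(expected, actual)) if a != b
            )
            sector = (offset + first_bad) // SECTOR_SIZE
            raise MirrorMismatch(f"contents mismatch in sector {sector}")
        return expected
```

In this model, a bug in the "test" buffer's handling of a write would surface
on the next read that covers the affected bytes, which is the property that
makes the reference copy useful.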

After a mirrored read operation completes, ``blkverify`` will compare the data and
raise an error if it is not identical. This makes it possible to catch the
first instance where corrupt data is read.

Example
-------

Imagine raw.img has 0xcd repeated throughout its first sector::

    $ ./qemu-io -c 'read -v 0 512' raw.img
    00000000:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    00000010:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    [...]
    000001e0:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    000001f0:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    read 512/512 bytes at offset 0
    512.000000 bytes, 1 ops; 0.0000 sec (97.656 MiB/sec and 200000.0000 ops/sec)

And test.img is corrupt: its first sector is zeroed when it shouldn't be::

    $ ./qemu-io -c 'read -v 0 512' test.img
    00000000:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000010:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [...]
    000001e0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    000001f0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    read 512/512 bytes at offset 0
    512.000000 bytes, 1 ops; 0.0000 sec (81.380 MiB/sec and 166666.6667 ops/sec)

This error is caught by ``blkverify``::

    $ ./qemu-io -c 'read 0 512' blkverify:raw.img:test.img
    blkverify: read sector_num=0 nb_sectors=4 contents mismatch in sector 0

A more realistic scenario is verifying the installation of a guest OS::

    $ ./qemu-img create raw.img 16G
    $ ./qemu-img create -f qcow2 test.qcow2 16G
    $ ./qemu-system-x86_64 -cdrom debian.iso \
          -drive file=blkverify:raw.img:test.qcow2

If the installation is aborted when ``blkverify`` detects corruption, use ``qemu-io``
to explore the contents of the disk image at the sector in question.
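
As a complement to inspecting a single sector with ``qemu-io``, two flat image
files can also be compared offline to locate the first sector that differs. The
sketch below is a hypothetical helper, not part of QEMU: the function name
``first_mismatching_sector`` is an assumption, and it only makes sense when both
files are raw images (a qcow2 test image would first need to be exported, for
example with ``qemu-img convert -O raw``).

```python
from typing import Optional

SECTOR_SIZE = 512  # assumed sector size, matching the reads shown in this document


def first_mismatching_sector(raw_path: str, test_path: str,
                             sector_size: int = SECTOR_SIZE) -> Optional[int]:
    """Return the index of the first differing sector, or None if identical.

    Hypothetical helper: both paths must point at flat (raw) image files.
    A size difference counts as a mismatch in the sector where one file ends.
    """
    with open(raw_path, "rb") as raw, open(test_path, "rb") as test:
        sector = 0
        while True:
            expected = raw.read(sector_size)
            actual = test.read(sector_size)
            if expected != actual:
                return sector
            if not expected:  # both files ended at the same point
                return None
            sector += 1
```

Multiplying the returned sector index by the sector size gives the byte offset
to pass to a ``qemu-io`` ``read -v`` command like the ones shown earlier. Reading
one sector at a time keeps the sketch simple but is slow for large images; a
larger read size would be a reasonable refinement.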