Block I/O error injection using ``blkdebug``
============================================

..
   Copyright (C) 2014-2015 Red Hat Inc

   This work is licensed under the terms of the GNU GPL, version 2 or later. See
   the COPYING file in the top-level directory.

The ``blkdebug`` block driver is a rule-based error injection engine. It can be
used to exercise error code paths in block drivers including ``ENOSPC`` (out of
space) and ``EIO``.

This document gives an overview of the features available in ``blkdebug``.

Background
----------
Block drivers have many error code paths that handle I/O errors. Image formats
are especially complex since metadata I/O errors during cluster allocation or
while updating tables happen halfway through request processing and require
discipline to keep image files consistent.

Error injection allows test cases to trigger I/O errors at specific points.
This way, all error paths can be tested to make sure they are correct.

Rules
-----
The ``blkdebug`` block driver takes a list of "rules" that tell the error injection
engine when to fail an I/O request.

Each I/O request is evaluated against the rules. If a rule matches the request
then its "action" is executed.

Rules can be placed in a configuration file; the configuration file
follows the same .ini-like format used by QEMU's ``-readconfig`` option, and
each section of the file represents a rule.

The following configuration file defines a single rule::

  $ cat blkdebug.conf
  [inject-error]
  event = "read_aio"
  errno = "28"

This rule fails all aio read requests with ``ENOSPC`` (28). Note that the errno
value depends on the host. On Linux, see
``/usr/include/asm-generic/errno-base.h`` for errno values.

Invoke QEMU as follows::

  $ qemu-system-x86_64 \
        -drive if=none,cache=none,file=blkdebug:blkdebug.conf:test.img,id=drive0 \
        -device virtio-blk-pci,drive=drive0,id=virtio-blk-pci0

Rules support the following attributes:

``event``
  which type of operation to match (e.g. ``read_aio``, ``write_aio``,
  ``flush_to_os``, ``flush_to_disk``). See `Events`_ for
  information on events.

``state``
  (optional) the engine must be in this state number in order for this
  rule to match. See `State transitions`_ for information
  on states.

``errno``
  the numeric errno value to return when a request matches this rule.
  The errno values depend on the host since the numeric values are not
  standardized in the POSIX specification.

``sector``
  (optional) a sector number that the request must overlap in order to
  match this rule

``once``
  (optional, default ``off``) only execute this action on the first
  matching request

``immediately``
  (optional, default ``off``) return a NULL ``BlockAIOCB``
  pointer and fail without an errno instead. This
  exercises the code path where ``BlockAIOCB`` fails and the
  caller's ``BlockCompletionFunc`` is not invoked.

Events
------
Block drivers provide information about the type of I/O request they are about
to make so rules can match specific types of requests. For example, the ``qcow2``
block driver tells ``blkdebug`` when it accesses the L1 table so rules can match
only L1 table accesses and not other metadata or guest data requests.
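
As an illustration, a rule along the following lines would target only ``qcow2``
L1 table updates. The ``l1_update`` event name is taken from the
``BlkdebugEvent`` enumeration; the choice of ``EIO`` (5) and of ``once`` is
arbitrary and only serves this sketch::

  [inject-error]
  event = "l1_update"
  errno = "5"
  once = "on"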

The core events are:

``read_aio``
  guest data read

``write_aio``
  guest data write

``flush_to_os``
  write out unwritten block driver state (e.g. cached metadata)

``flush_to_disk``
  flush the host block device's disk cache

See ``qapi/block-core.json:BlkdebugEvent`` for the full list of events.
You may need to grep block driver source code to understand the
meaning of specific events.

State transitions
-----------------
There are cases where more power is needed to match a particular I/O request in
a longer sequence of requests. For example::

  write_aio
  flush_to_disk
  write_aio

How do we match the 2nd ``write_aio`` but not the first? This is where state
transitions come in.

The error injection engine has an integer called the "state" that always starts
initialized to 1. The state integer is internal to ``blkdebug`` and cannot be
observed from outside but rules can interact with it for powerful matching
behavior.

Rules can be conditional on the current state and they can transition to a new
state.

When a rule's "state" attribute is non-zero then the current state must equal
the attribute in order for the rule to match.

For example, to match the 2nd ``write_aio``::

  [set-state]
  event = "write_aio"
  state = "1"
  new_state = "2"

  [inject-error]
  event = "write_aio"
  state = "2"
  errno = "5"

The first ``write_aio`` request matches the ``set-state`` rule and transitions from
state 1 to state 2. Once state 2 has been entered, the ``set-state`` rule no
longer matches since it requires state 1. But the ``inject-error`` rule now
matches the next ``write_aio`` request and injects ``EIO`` (5).

State transition rules support the following attributes:

``event``
  which type of operation to match (e.g. ``read_aio``, ``write_aio``,
  ``flush_to_os``, ``flush_to_disk``). See `Events`_ for
  information on events.

``state``
  (optional) the engine must be in this state number in order for this
  rule to match

``new_state``
  transition to this state number

Suspend and resume
------------------
Exercising code paths in block drivers may require specific ordering amongst
concurrent requests. The "breakpoint" feature allows requests to be halted on
a ``blkdebug`` event and resumed later. This makes it possible to achieve
deterministic ordering when multiple requests are in flight.

Breakpoints on ``blkdebug`` events are associated with a user-defined ``tag`` string.
This tag serves as an identifier by which the request can be resumed at a later
point.

See the ``qemu-io(1)`` ``break``, ``resume``, ``remove_break``, and ``wait_break``
commands for details.
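
A breakpoint session might look roughly like the following sketch. It assumes
that ``test.img`` is a ``qcow2`` image opened with
``qemu-io blkdebug::test.img`` (that is, with an empty ``blkdebug``
configuration); the tag ``A`` and the offsets are arbitrary. The first write is
suspended at its ``write_aio`` event, a second write is issued while the first
is halted, and the first is then resumed::

  break write_aio A
  aio_write 0 64k
  wait_break A
  aio_write 64k 64k
  resume A
  aio_flush

Here ``wait_break`` is used to make sure the first request has actually reached
the breakpoint before the second one is submitted, and ``aio_flush`` waits for
both requests to complete.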