# Specification for the fuzz testing tool
#
# Copyright (C) 2014 Maria Kustova <maria.k@catit.be>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


Image fuzzer
============

Description
-----------

The goal of the image fuzzer is to catch crashes of qemu-io/qemu-img
by feeding them randomly corrupted images.
Test images are generated from scratch and have a valid inner structure in
which some elements, e.g. L1/L2 tables, hold random invalid values.


Test runner
-----------

The test runner generates test images, executes tests using the generated
images, reports their results and collects all test-related artifacts (logs,
core dumps, test images, backing files).
A test is the execution of all available commands under test with the same
generated test image.
By default, the test runner generates new tests and executes them until
interrupted from the keyboard. If a test seed is specified via the '--seed'
runner parameter, only the one test with this seed is executed, and the runner
exits when it finishes.

The runner uses an external image fuzzer to generate test images. An image
generator must be specified as a mandatory parameter of the test runner.
For details about interactions between the runner and fuzzers, see "Module
interfaces".
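
The seed handling described above can be sketched as follows. This is a
minimal illustration and not the runner's actual code: 'run_test', 'run' and
the 'max_tests' limit are hypothetical names introduced here, and the test body
is reduced to a single random draw so that repeatability is visible.

```python
import random


def run_test(seed):
    """Stand-in for one test (hypothetical): seed the RNG, then 'fuzz'.

    A real test would generate an image and run every command on it;
    here the first random draw is returned to show repeatability.
    """
    random.seed(seed)
    return random.getrandbits(32)


def run(seed=None, max_tests=None):
    """Run one test for a given seed, or generate tests until interrupted.

    max_tests is an assumption added so that the sketch can stop by itself.
    Returns the list of seeds that were executed.
    """
    executed = []
    try:
        while max_tests is None or len(executed) < max_tests:
            if seed is not None:
                test_seed = seed
            else:
                # Use an independent source, because run_test() reseeds
                # the global generator on every test.
                test_seed = random.SystemRandom().getrandbits(64)
            print("seed: %d" % test_seed)  # logged so the test can be replayed
            run_test(test_seed)
            executed.append(test_seed)
            if seed is not None:
                break  # a seed was given: run a single test, then exit
    except KeyboardInterrupt:
        pass  # keyboard interruption ends the run
    return executed
```

Calling 'run(seed=42)' executes exactly one test, while 'run()' keeps going
until it is interrupted from the keyboard.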

The runner enables generation of core dumps during test execution, but it
assumes that core dumps are written to the current working directory.
For comprehensive test results, please set up your test environment
accordingly.

Paths to the binaries under test (SUTs), qemu-img and qemu-io, are retrieved
from environment variables. If the environment check fails, the runner uses
the SUTs installed in system paths.
qemu-img is required for creation of backing files, so it's mandatory to set
the related environment variable if qemu-img is not installed in the system
path.
For details about the environment variables, see qemu-iotests/check.

The runner accepts a JSON array of fields expected to be fuzzed via the
'--config' argument, e.g.

    '[["feature_name_table"], ["header", "l1_table_offset"]]'

Each sublist can have one or two strings defining image structure elements.
In the latter case the parent element comes first and the field name second.

The runner accepts a list of commands under test as a JSON array via
the '--command' argument. Each command is a list containing a SUT and all its
arguments, e.g.

    runner.py -c '[["qemu-io", "$test_img", "-c", "write $off $len"]]'
      /tmp/test ../qcow2

For variable arguments the following aliases can be used:
    - $test_img for a fuzzed image
    - $off for an offset in the fuzzed image
    - $len for a data size

Values for the last two aliases are generated based on the size of the virtual
disk of the generated image.
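
As an illustration of the aliases above, the substitution could look like the
sketch below. The function name 'substitute_aliases' is hypothetical, and the
way '$off' and '$len' are drawn (any offset inside the virtual disk, any length
that still fits behind it) is an assumption; the specification only states that
both depend on the virtual disk size.

```python
import random
import string


def substitute_aliases(command, test_img_path, virtual_disk_size):
    """Return a copy of command with $test_img, $off and $len filled in."""
    # Assumption: the offset is any byte inside the virtual disk and the
    # length is anything that still fits between the offset and the end.
    off = random.randint(0, virtual_disk_size - 1)
    length = random.randint(1, virtual_disk_size - off)
    values = {"test_img": test_img_path, "off": str(off), "len": str(length)}
    # safe_substitute() leaves unknown '$name' placeholders untouched.
    return [string.Template(arg).safe_substitute(values) for arg in command]
```

For example, 'substitute_aliases(["qemu-io", "$test_img", "-c", "write $off
$len"], "/tmp/test/test.img", 1024)' returns the same list with the image path
and two numbers in place of the aliases.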
If no commands are specified, the runner executes commands from
the default list:
    - qemu-img check
    - qemu-img info
    - qemu-img convert
    - qemu-io -c read
    - qemu-io -c write
    - qemu-io -c aio_read
    - qemu-io -c aio_write
    - qemu-io -c flush
    - qemu-io -c discard
    - qemu-io -c truncate


Qcow2 image generator
---------------------

The 'qcow2' generator is a Python package providing the 'create_image' method
as its single public API. See details in the 'Test runner/image fuzzer' chapter
of 'Module interfaces'.

Qcow2 contains two submodules: fuzz.py and layout.py.

'fuzz.py' contains all fuzzing functions, one per image field. It's assumed
that after code analysis every field will have its own constraints for its
value. For now only universal, potentially dangerous values are used, e.g. type
limits for integers or unsafe symbols such as '%s' for strings. For bitmasks a
random number of bits are set to ones. All fuzzed values are checked for
non-equality to the current valid value of the field. In case of equality the
value is regenerated.

'layout.py' creates a random valid image, fuzzes a random subset of the image
fields using the 'fuzz.py' module and writes the fuzzed image to the specified
file. If a fuzzer configuration is specified, it is interpreted as follows:

    1. If a list contains a parent image element only, then some random portion
       of fields of this element will be fuzzed in every test.
       The same behavior applies to the entire image if no configuration is
       used. This case is useful for test specialization.

    2. If a list contains a parent element and a field name, then the field
       will always be fuzzed in every test. This case is useful for regression
       testing.

For now only header fields, header extensions and L1/L2 tables are generated.
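
The two rules above, together with the non-equality check in 'fuzz.py', can be
sketched as follows. This is an illustration only and not the code of
'layout.py' or 'fuzz.py': the 'IMAGE_FIELDS' table, the names 'select_fields'
and 'fuzz_uint32', and the choice of "at least one field" for a random portion
are all assumptions made for the example.

```python
import random

# Hypothetical image model: parent element -> field names.
IMAGE_FIELDS = {
    "header": ["l1_table_offset", "nb_snapshots", "cluster_bits"],
    "feature_name_table": ["feature_type", "bit_number", "feature_name"],
}

# Type limits for an unsigned 32-bit integer field.
UINT32_LIMITS = [0, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]


def fuzz_uint32(current):
    """Pick a type-limit value that differs from the current valid value."""
    value = random.choice(UINT32_LIMITS)
    while value == current:
        value = random.choice(UINT32_LIMITS)  # regenerate on equality
    return value


def select_fields(fuzz_config=None):
    """Return a sorted list of (element, field) pairs to fuzz in one test."""
    selected = set()
    if not fuzz_config:
        # No configuration: a random portion of the fields of the whole image.
        pool = [(element, field)
                for element, fields in sorted(IMAGE_FIELDS.items())
                for field in fields]
        selected.update(random.sample(pool, random.randint(1, len(pool))))
        return sorted(selected)
    for entry in fuzz_config:
        if len(entry) == 1:
            # Parent element only: a random portion of its fields.
            fields = IMAGE_FIELDS[entry[0]]
            count = random.randint(1, len(fields))
            for field in random.sample(fields, count):
                selected.add((entry[0], field))
        else:
            # Parent element and field name: the field is always fuzzed.
            selected.add((entry[0], entry[1]))
    return sorted(selected)
```

With the configuration '[["header", "l1_table_offset"], ["feature_name_table"]]'
the result always contains the 'l1_table_offset' header field plus a varying
part of the feature name table.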

Module interfaces
-----------------

* Test runner/image fuzzer

The runner calls an image generator, specifying the path to a test image file,
the path to a backing file and its format, and a fuzzer configuration.
An image generator is expected to provide a

    'create_image(test_img_path, backing_file_path=None,
                  backing_file_format=None, fuzz_config=None)'

method that creates a test image, writes it to the specified file and returns
the size of the virtual disk.
The file should be created if it doesn't exist or overwritten otherwise.
fuzz_config has the form of a list of lists. Every sublist can have one
or two elements: the first element is the name of a parent image element, the
second one, if present, is the name of a field in this element.
Example:
    [['header', 'l1_table_offset'],
     ['header', 'nb_snapshots'],
     ['feature_name_table']]

The random seed is set by the runner at every test execution for regression
purposes, so it is not recommended that an image generator modify it
internally.


Overall fuzzer requirements
===========================

Input data:
----------

 - image template (generator)
 - work directory
 - action vector (optional)
 - seed (optional)
 - SUT and its arguments (optional)


Fuzzer requirements:
-------------------

1.  Should be able to inject random data
2.  Should be able to select a random value from the manually pregenerated
    vector (boundary values, e.g. max/min cluster size)
3.  Image template should describe a general structure invariant for all
    test images (image format description)
4.  Image template should be autonomous and other fuzzer parts should not
    rely on it
5.  Image template should contain reference rules (not only block+size
    description)
6.  Should generate the test image with the correct structure based on an image
    template
7.  Should accept a seed as an argument (for regression purposes)
8.  Should generate a seed if it is not specified as an input parameter.
9.  The same seed should generate the same image for the same action vector,
    specified or generated.
10. Should accept a vector of actions as an argument (for test reproducing and
    for test case specification, e.g. group of tests for header structure,
    group of tests for snapshots, etc)
11. Action vector should be randomly generated from the pool of available
    actions, if it is not specified as an input parameter
12. Pool of actions should be defined automatically based on an image template
13. Should accept a SUT and its call parameters as an argument or select them
    randomly otherwise. As the list of all possible test commands is expected
    to change rarely, it can be kept in the test runner internally.
14. Should support an external cancellation of a test run
15. Seed should be logged (for regression purposes)
16. All files related to a test result should be collected: a test image,
    SUT logs, fuzzer logs and crash dumps
17. Should be compatible with Python versions 2.4-2.7
18. Usage of external libraries should be limited as much as possible.


Image formats:
-------------

The main target image format is qcow2, but support of image templates should
make it possible to add any other image format.


Effectiveness:
-------------

The fuzzer can be controlled via a template, a seed and an action vector;
this makes the fuzzer itself invariant to the image format and test logic.
It should be able to perform rather complex and precise tests that can be
specified via an action vector. Otherwise, knowledge of the image structure
allows the fuzzer to generate the pool of all areas that can be fuzzed,
randomly select some of them and so compose its own action vector.
Also, the complexity of a template defines the complexity of the fuzzer, so its
functionality can vary from simple model-independent fuzzing to smart
model-based fuzzing.


Glossary:
--------

Action vector is a sequence of structure elements retrieved from an image
format, each of which will be fuzzed for the test image. It's a subset of
elements of the action pool. Example: header, refcount table, etc.
Action pool is all available elements of an image structure that are generated
automatically from an image template.
Image template is a formal description of an image structure and relations
between image blocks.
Test image is an output image of the fuzzer defined by the current seed and
action vector.
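
How the glossary terms relate to each other can be sketched as follows. This is
an illustration only: 'ACTION_POOL' is a hand-written stand-in for a pool that
the specification expects to be derived from an image template, the name
'generate_test_image' is hypothetical, and a dictionary of random numbers
stands in for a real test image. Two separate generators are used, which is a
design assumption, so that replaying a seed with the action vector it produced
yields the same result as generating that vector again.

```python
import random

# Hypothetical action pool; per the specification it would be generated
# automatically from an image template.
ACTION_POOL = ["header", "header extensions", "l1 table", "l2 tables",
               "refcount table"]


def generate_test_image(seed, action_vector=None):
    """Return (action_vector, image) defined by the seed and action vector."""
    if action_vector is None:
        # Compose the action vector from the pool when none is specified.
        vector_rng = random.Random(seed)
        count = vector_rng.randint(1, len(ACTION_POOL))
        action_vector = vector_rng.sample(ACTION_POOL, count)
    # A separate stream for fuzzed values keeps them identical whether the
    # vector was specified or generated.
    value_rng = random.Random("%d:values" % seed)
    image = dict((action, value_rng.getrandbits(32))
                 for action in action_vector)
    return list(action_vector), image
```

Calling the function twice with the same seed, with or without passing the
previously returned action vector, gives the same stand-in image.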