========
Fuzzing
========

This document describes the virtual-device fuzzing infrastructure in QEMU and
how to use it to implement additional fuzzers.

Basics
------

Fuzzing operates by passing inputs to an entry point/target function. The
fuzzer tracks the code coverage triggered by the input. Based on these
findings, the fuzzer mutates the input and repeats the fuzzing.

To fuzz QEMU, we rely on libFuzzer. Unlike other fuzzers such as AFL, libFuzzer
is an *in-process* fuzzer. For the developer, this means that it is their
responsibility to ensure that state is reset between fuzzing runs.

Building the fuzzers
--------------------

To build the fuzzers, install a recent version of clang.
Then configure the build (substitute the clang binaries with the version you installed).
Here, enable-asan and enable-ubsan are optional but they allow us to reliably
detect bugs such as out-of-bounds accesses, uses-after-free, double-frees
etc.::

    CC=clang-8 CXX=clang++-8 /path/to/configure \
        --enable-fuzzing --enable-asan --enable-ubsan

Fuzz targets are built similarly to system targets::

    make qemu-fuzz-i386

This builds ``./qemu-fuzz-i386``

The first option to this command is: ``--fuzz-target=FUZZ_NAME``
To list all of the available fuzzers run ``qemu-fuzz-i386`` with no arguments.

For example::

    ./qemu-fuzz-i386 --fuzz-target=virtio-scsi-fuzz

Internally, libfuzzer parses all arguments that do not begin with ``"--"``.
Information about these is available by passing ``-help=1``

Now the only thing left to do is wait for the fuzzer to trigger potential
crashes.

Useful libFuzzer flags
----------------------

As mentioned above, libFuzzer accepts some arguments. Passing ``-help=1`` will
list the available arguments. In particular, these arguments might be helpful:

* ``CORPUS_DIR/`` : Specify a directory as the last argument to libFuzzer.
  libFuzzer stores each "interesting" input in this corpus directory.
  The next time you run libFuzzer, it will read all of the inputs from the
  corpus, and continue fuzzing from there. You can also specify multiple
  directories. libFuzzer loads existing inputs from all specified directories,
  but will only write new ones to the first one specified.

* ``-max_len=4096`` : Specify the maximum byte-length of the inputs libFuzzer
  will generate.

* ``-close_fd_mask={1,2,3}`` : Close stdout, stderr, or both. Useful for targets
  that trigger many debug/error messages, or create output on the serial console.

* ``-jobs=4 -workers=4`` : These arguments configure libFuzzer to run 4 fuzzers in
  parallel (4 fuzzing jobs in 4 worker processes). Alternatively, with only
  ``-jobs=N``, libFuzzer automatically spawns a number of workers less than or equal
  to half the available CPU cores. Replace 4 with a number appropriate for your
  machine. Make sure to specify a ``CORPUS_DIR``, which will allow the parallel
  fuzzers to share information about the interesting inputs they find.

* ``-use_value_profile=1`` : For each comparison operation, libFuzzer computes
  ``(caller_pc&4095) | (popcnt(Arg1 ^ Arg2) << 12)`` and places this in the
  coverage table. Useful for targets with "magic" constants. If Arg1 came from
  the fuzzer's input and Arg2 is a magic constant, then each time the Hamming
  distance between Arg1 and Arg2 decreases, libFuzzer adds the input to the
  corpus.

* ``-shrink=1`` : Tries to make elements of the corpus "smaller". Might lead to
  better coverage performance, depending on the target.

Note that libFuzzer's exact behavior will depend on the version of
clang and libFuzzer used to build the device fuzzers.

Generating Coverage Reports
---------------------------

Code coverage is a crucial metric for evaluating a fuzzer's performance.
libFuzzer's output provides a "cov: " column that provides a total number of
unique blocks/edges covered. To examine coverage on a line-by-line basis we
can use Clang coverage:

1. Configure libFuzzer to store a corpus of all interesting inputs (see
   CORPUS_DIR above)
2. ``./configure`` the QEMU build with ::

       --enable-fuzzing \
       --extra-cflags="-fprofile-instr-generate -fcoverage-mapping"

3. Re-run the fuzzer. Specify $CORPUS_DIR/* as an argument, telling libfuzzer
   to execute all of the inputs in $CORPUS_DIR and exit. Once the process
   exits, you should find a file, "default.profraw" in the working directory.
4.
   Execute these commands to generate a detailed HTML coverage-report::

       llvm-profdata merge -output=default.profdata default.profraw
       llvm-cov show ./path/to/qemu-fuzz-i386 -instr-profile=default.profdata \
           --format html -output-dir=/path/to/output/report

Adding a new fuzzer
-------------------

Coverage over virtual devices can be improved by adding additional fuzzers.
Fuzzers are kept in ``tests/qtest/fuzz/`` and should be added to
``tests/qtest/fuzz/meson.build``

Fuzzers can rely on both qtest and libqos to communicate with virtual devices.

1. Create a new source file. For example ``tests/qtest/fuzz/foo-device-fuzz.c``.

2. Write the fuzzing code using the libqtest/libqos API. See existing fuzzers
   for reference.

3. Add the fuzzer to ``tests/qtest/fuzz/meson.build``.

Fuzzers can be more-or-less thought of as special qtest programs which can
modify the qtest commands and/or qtest command arguments based on inputs
provided by libfuzzer. Libfuzzer passes a byte array and length. Commonly the
fuzzer loops over the byte-array interpreting it as a list of qtest commands,
addresses, or values.
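
As a rough illustration of the loop described above, the following
self-contained sketch interprets the input as a sequence of fixed-size
commands. It is not taken from an existing QEMU fuzzer: ``stub_write8``,
``stub_read8``, ``fuzz_one_input``, the four-byte command layout and
``BASE_ADDR`` are all hypothetical, and the stubs only record what a real
target would send through the qtest API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for qtest accessors: they only record the calls. */
static unsigned n_writes, n_reads;
static uint32_t last_addr;
static uint8_t last_val;

static void stub_write8(uint32_t addr, uint8_t val)
{
    n_writes++;
    last_addr = addr;
    last_val = val;
}

static void stub_read8(uint32_t addr)
{
    n_reads++;
    last_addr = addr;
}

/* Assumed layout: opcode, 16-bit little-endian offset, value (4 bytes). */
enum { CMD_SIZE = 4, BASE_ADDR = 0x1000 };

/* Walk the byte array and turn each complete command into one access. */
int fuzz_one_input(const uint8_t *data, size_t size)
{
    assert(data != NULL || size == 0);

    while (size >= CMD_SIZE) {
        uint32_t addr = BASE_ADDR + (data[1] | (uint32_t)data[2] << 8);

        if (data[0] & 1) {
            stub_write8(addr, data[3]);   /* odd opcode: write */
        } else {
            stub_read8(addr);             /* even opcode: read */
        }
        data += CMD_SIZE;
        size -= CMD_SIZE;
    }
    /* Trailing bytes that do not form a full command are ignored. */
    return 0;
}
```

In a real fuzzer the stubs would be replaced by calls into the libqtest/libqos
API, and the function would be driven by libFuzzer rather than called directly.
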

The Generic Fuzzer
------------------

Writing a fuzz target can be a lot of effort (especially if a device driver has
not been built out within libqos). Many devices can be fuzzed to some degree,
without any device-specific code, using the generic-fuzz target.

The generic-fuzz target is capable of fuzzing devices over their PIO, MMIO,
and DMA input-spaces. To apply the generic-fuzz target to a device, we need to
define two environment variables, at minimum:

* ``QEMU_FUZZ_ARGS=`` is the set of QEMU arguments used to configure a machine, with
  the device attached. For example, if we want to fuzz the virtio-net device
  attached to a pc-i440fx machine, we can specify::

      QEMU_FUZZ_ARGS="-M pc -nodefaults -netdev user,id=user0 \
      -device virtio-net,netdev=user0"

* ``QEMU_FUZZ_OBJECTS=`` is a set of space-delimited strings used to identify
  the MemoryRegions that will be fuzzed. These strings are compared against
  MemoryRegion names and MemoryRegion owner names, to decide whether each
  MemoryRegion should be fuzzed. These strings support globbing.
  For the virtio-net example, we could use one of::

      QEMU_FUZZ_OBJECTS='virtio-net'
      QEMU_FUZZ_OBJECTS='virtio*'
      QEMU_FUZZ_OBJECTS='virtio* pcspk' # Fuzz the virtio devices and the speaker
      QEMU_FUZZ_OBJECTS='*' # Fuzz the whole machine

The ``"info mtree"`` and ``"info qom-tree"`` monitor commands can be especially
useful for identifying the ``MemoryRegion`` and ``Object`` names used for
matching.

As a generic rule-of-thumb, the more ``MemoryRegions``/Devices we match, the
greater the input-space, and the smaller the probability of finding crashing
inputs for individual devices. As such, it is usually a good idea to limit the
fuzzer to only a few ``MemoryRegions``.

To ensure that these env variables have been configured correctly, we can use::

    ./qemu-fuzz-i386 --fuzz-target=generic-fuzz -runs=0

The output should contain a complete list of matched MemoryRegions.

OSS-Fuzz
--------

QEMU is continuously fuzzed on `OSS-Fuzz
<https://github.com/google/oss-fuzz>`_. By default, the OSS-Fuzz build
will try to fuzz every fuzz-target. Since the generic-fuzz target
requires additional information provided in environment variables, we
pre-define some generic-fuzz configs in
``tests/qtest/fuzz/generic_fuzz_configs.h``.
Each config must specify:

- ``.name``: To identify the fuzzer config

- ``.args`` OR ``.argfunc``: A string or pointer to a function returning a
  string. These strings are used to specify the ``QEMU_FUZZ_ARGS``
  environment variable. ``argfunc`` is useful when the config relies on e.g.
  a dynamically created temp directory, or a free tcp/udp port.

- ``.objects``: A string that specifies the ``QEMU_FUZZ_OBJECTS`` environment
  variable.

To fuzz additional devices/device configurations on OSS-Fuzz, send patches for
either a new device-specific fuzzer or a new generic-fuzz config.

Build details:

- The Dockerfile that sets up the environment for building QEMU's
  fuzzers on OSS-Fuzz can be found in the `OSS-Fuzz repository
  <https://github.com/google/oss-fuzz/blob/master/projects/qemu/Dockerfile>`__

- The script responsible for building the fuzzers can be found in the
  QEMU source tree at ``scripts/oss-fuzz/build.sh``

Building Crash Reproducers
--------------------------

When we find a crash, we should try to create an independent reproducer that
can be used on a non-fuzzer build of QEMU. This filters out any potential
false-positives, and improves the debugging experience for developers.
Here are the steps for building a reproducer for a crash found by the
generic-fuzz target.

- Ensure the crash reproduces::

      qemu-fuzz-i386 --fuzz-target... ./crash-...

- Gather the QTest output for the crash::

      QEMU_FUZZ_TIMEOUT=0 QTEST_LOG=1 FUZZ_SERIALIZE_QTEST=1 \
      qemu-fuzz-i386 --fuzz-target... ./crash-... &> /tmp/trace

- Reorder and clean-up the resulting trace::

      scripts/oss-fuzz/reorder_fuzzer_qtest_trace.py /tmp/trace > /tmp/reproducer

- Get the arguments needed to start qemu, and provide a path to qemu::

      less /tmp/trace # The args should be logged at the top of this file
      export QEMU_ARGS="-machine ..."
      export QEMU_PATH="path/to/qemu-system"

- Ensure the crash reproduces in qemu-system::

      $QEMU_PATH $QEMU_ARGS -qtest stdio < /tmp/reproducer

- From the crash output, obtain some string that identifies the crash.
  This can be a line in the stack-trace, for example::

      export CRASH_TOKEN="hw/usb/hcd-xhci.c:1865"

- Minimize the reproducer::

      scripts/oss-fuzz/minimize_qtest_trace.py -M1 -M2 \
        /tmp/reproducer /tmp/reproducer-minimized

- Confirm that the minimized reproducer still crashes::

      $QEMU_PATH $QEMU_ARGS -qtest stdio < /tmp/reproducer-minimized

- Create a one-liner reproducer that can be sent over email::

      ./scripts/oss-fuzz/output_reproducer.py -bash /tmp/reproducer-minimized

- Output the C source code for a test case that will reproduce the bug::

      ./scripts/oss-fuzz/output_reproducer.py -owner "John Smith <john@smith.com>" \
        -name "test_function_name" /tmp/reproducer-minimized

- Report the bug and send a patch with the C reproducer upstream

Implementation Details / Fuzzer Lifecycle
-----------------------------------------

The fuzzer has two entrypoints that libfuzzer calls. libfuzzer provides its
own ``main()``, which performs some setup, and calls the entrypoints:

``LLVMFuzzerInitialize``: called prior to fuzzing. Used to initialize all of the
necessary state

``LLVMFuzzerTestOneInput``: called for each fuzzing run.
Processes the input and
resets the state at the end of each run.

In more detail:

``LLVMFuzzerInitialize`` parses the arguments to the fuzzer (must start with two
dashes, so they are ignored by libfuzzer ``main()``). Currently, the arguments
select the fuzz target. Then, the qtest client is initialized. If the target
requires qos, qgraph is set up and the QOM/LIBQOS modules are initialized.
Then the QGraph is walked and the QEMU cmd_line is determined and saved.

After this, the ``vl.c:main`` is called to set up the guest. There are
target-specific hooks that can be called before and after main, for
additional setup (e.g. PCI setup, or VM snapshotting).

``LLVMFuzzerTestOneInput``: Uses qtest/qos functions to act based on the fuzz
input. It is also responsible for manually calling ``main_loop_wait`` to ensure
that bottom halves are executed and that any cleanup required before the next
input is performed.

Since the same process is reused for many fuzzing runs, QEMU state needs to
be reset at the end of each run. For example, this can be done by rebooting the
VM after each run.

- *Pros*: Straightforward and fast for simple fuzz targets.

- *Cons*: Depending on the device, does not reset all device state.
  If the device requires some initialization prior to being ready for fuzzing
  (common for QOS-based targets), this initialization needs to be done after
  each reboot.

- *Example target*: ``i440fx-qtest-reboot-fuzz``
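
The reset-after-every-run lifecycle can be pictured with the toy model below.
It is only a sketch under stated assumptions: ``toy_device``, ``toy_init``,
``toy_reset`` and ``toy_run`` are hypothetical names that do not exist in QEMU,
and clearing a small register array stands in for rebooting the VM.
``toy_init`` plays the role of ``LLVMFuzzerInitialize`` and ``toy_run`` the
role of ``LLVMFuzzerTestOneInput``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device model: a few registers that inputs can modify. */
struct toy_device {
    uint8_t regs[8];
    unsigned last_sum;   /* register sum observed by the most recent run */
    unsigned resets;     /* number of resets performed so far */
};

static struct toy_device dev;

/* One-time setup, performed before any input is processed. */
void toy_init(void)
{
    memset(&dev, 0, sizeof(dev));
}

/* Return the registers to their power-on state (stands in for a reboot). */
static void toy_reset(void)
{
    memset(dev.regs, 0, sizeof(dev.regs));
    dev.resets++;
}

/* Per-input run: apply (index, value) pairs, observe, then reset. */
int toy_run(const uint8_t *data, size_t size)
{
    assert(data != NULL || size == 0);

    for (size_t i = 0; i + 1 < size; i += 2) {
        dev.regs[data[i] % sizeof(dev.regs)] = data[i + 1];
    }

    dev.last_sum = 0;
    for (size_t i = 0; i < sizeof(dev.regs); i++) {
        dev.last_sum += dev.regs[i];
    }

    toy_reset();   /* state must not leak into the next run */
    return 0;
}
```

Because ``toy_run`` resets the registers before returning, the result for a
given input does not depend on the inputs that preceded it, which is the
property a reset strategy is meant to provide.
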