..
   Copyright (C) 2017, Emilio G. Cota <cota@braap.org>
   Copyright (c) 2019, Linaro Limited
   Written by Emilio Cota and Alex Bennée

.. _TCG Plugins:

QEMU TCG Plugins
================

QEMU TCG plugins provide a way for users to run experiments taking
advantage of the total system control emulation can have over a guest.
The plugin system provides a mechanism for plugins to subscribe to
events during translation and execution and optionally call back into
the plugin during these events. TCG plugins are unable to change the
system state; they can only monitor it passively. However they can do
this down to an individual instruction granularity, including
potentially subscribing to all load and store operations.

Usage
-----

Any QEMU binary with TCG support has plugins enabled by default.
In earlier releases plugin support had to be explicitly enabled with::

  configure --enable-plugins

Once built, a program can be run with multiple plugins loaded, each with
its own arguments::

  $QEMU $OTHER_QEMU_ARGS \
      -plugin contrib/plugins/libhowvec.so,inline=on,count=hint \
      -plugin contrib/plugins/libhotblocks.so

Arguments are plugin specific and can be used to modify their
behaviour. In this case the howvec plugin is being asked to use inline
ops to count and break down the hint instructions by type.

Linux user-mode emulation also evaluates the environment variable
``QEMU_PLUGIN``::

  QEMU_PLUGIN="file=contrib/plugins/libhowvec.so,inline=on,count=hint" $QEMU

Writing plugins
---------------

API versioning
~~~~~~~~~~~~~~

This is a new feature for QEMU and it does allow people to develop
out-of-tree plugins that can be dynamically linked into a running QEMU
process. However the project reserves the right to change or break the
API should it need to do so. The best way to avoid this is to submit
your plugin upstream so it can be updated if/when the API changes.

All plugins need to declare a symbol which exports the plugin API
version they were built against. This can be done simply by::

  QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;

The core code will refuse to load a plugin that doesn't export a
``qemu_plugin_version`` symbol or whose version is outside of QEMU's
supported range of API versions.
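
The load-time rule can be pictured with a small standalone sketch. This
is not QEMU's loader code: ``ApiRange`` and ``plugin_version_supported``
are hypothetical names invented for illustration, and a ``NULL`` pointer
stands in for a plugin that does not export the version symbol at all.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the range of API versions a host accepts. */
typedef struct {
    int min; /* oldest API version still supported */
    int cur; /* newest API version supported */
} ApiRange;

/*
 * Return true only if the plugin exported a version (non-NULL pointer)
 * and that version lies inside the host's inclusive [min, cur] range.
 */
static bool plugin_version_supported(const int *exported_version,
                                     ApiRange range)
{
    if (exported_version == NULL) {
        return false; /* no version symbol: refuse to load */
    }
    return *exported_version >= range.min &&
           *exported_version <= range.cur;
}
```

With an assumed range of 1 to 3, a plugin built against version 2 would
be accepted, while versions 0 and 4, or a missing symbol, would be
rejected.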

Additionally the ``qemu_info_t`` structure which is passed to the
``qemu_plugin_install`` method of a plugin will detail the minimum and
current API versions supported by QEMU. The API version will be
incremented if new APIs are added. The minimum API version will be
incremented if existing APIs are changed or removed.

Lifetime of the query handle
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Each callback provides an opaque anonymous information handle which
can usually be further queried to find out information about a
translation, instruction or operation. The handles themselves are only
valid during the lifetime of the callback so it is important that any
information that is needed is extracted during the callback and saved
by the plugin.

Plugin life cycle
~~~~~~~~~~~~~~~~~

First the plugin is loaded and the public ``qemu_plugin_install``
function is called. The plugin will then register callbacks for various
plugin events. Generally plugins will register a handler for the
*atexit* event if they want to dump a summary of collected information
once the program/system has finished running.

When a registered event occurs the plugin callback is invoked. The
callbacks may provide additional information. In the case of a
translation event the plugin has an option to enumerate the
instructions in a block of instructions and optionally register
callbacks to some or all instructions when they are executed.

There is also a facility to add an inline event where code to
increment a counter can be directly inlined with the translation.
Currently only a simple increment is supported. This is not atomic so
it can miss counts. If you want absolute precision you should use a
callback which can then ensure atomicity itself.

Finally when QEMU exits all the registered *atexit* callbacks are
invoked.

Exposure of QEMU internals
~~~~~~~~~~~~~~~~~~~~~~~~~~

The plugin architecture actively avoids leaking implementation details
about how QEMU's translation works to the plugins. While there are
concepts such as translation time and translation blocks, the
details are opaque to plugins. The plugin is able to query select
details of instructions and system configuration only through the
exported *qemu_plugin* functions.

However the following assumptions can be made:

Translation Blocks
++++++++++++++++++

All code will go through a translation phase although not all
translations will necessarily be executed. You need to instrument
actual executions to track what is happening.

It is quite normal to see the same address translated multiple times.
If you want to track the code in system emulation you should examine
the underlying physical address (``qemu_plugin_insn_haddr``) to take
into account the effects of virtual memory, although if the system does
paging this will change too.

Not all instructions in a block will always execute so if it's
important to track individual instruction execution you need to
instrument them directly. However asynchronous interrupts will not
change control flow mid-block.

Instructions
++++++++++++

Instruction instrumentation runs before the instruction executes. You
can be sure the instruction will be dispatched, but you can't
be sure it will complete. Generally this will be because of a
synchronous exception (e.g. SIGILL) triggered by the instruction
attempting to execute. If you want to be sure you will need to
instrument the next instruction as well. See the ``execlog.c`` plugin
for examples of how to track this and finalise details after execution.

Memory Accesses
+++++++++++++++

Memory callbacks are called after a successful load or store.
Unsuccessful operations (i.e. faults) will not be visible to memory
instrumentation although the execution side effects can be observed
(e.g. entering an exception handler).

System Idle and Resume States
+++++++++++++++++++++++++++++

The ``qemu_plugin_register_vcpu_idle_cb`` and
``qemu_plugin_register_vcpu_resume_cb`` functions can be used to track
when CPUs go into and return from sleep states when waiting for
external I/O. Be aware though that these may occur less frequently
than in real HW due to the inefficiencies of emulation giving less
chance for the CPU to idle.

Internals
---------

Locking
~~~~~~~

We have to ensure we cannot deadlock, particularly under MTTCG. For
this we acquire a lock when called from plugin code. We also keep the
list of callbacks under RCU so that we do not have to hold the lock
when calling the callbacks. This is also for performance, since some
callbacks (e.g. memory access callbacks) might be called very
frequently.

  * A consequence of this is that we keep our own list of CPUs, so that
    we do not have to worry about locking order wrt cpu_list_lock.
  * Use a recursive lock, since we can get registration calls from
    callbacks.
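
Why the lock needs to be recursive can be shown with a standalone POSIX
threads sketch. It does not reproduce QEMU's own locking primitives or
its RCU-protected callback lists; every ``demo_*`` name is hypothetical.
The loader holds the lock while it runs the plugin's install hook, and
the hook registers a callback, which takes the same lock again on the
same thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_CBS 8

typedef void (*demo_cb_t)(void);
typedef int (*demo_install_t)(void);

static pthread_mutex_t demo_lock;
static demo_cb_t demo_cbs[DEMO_MAX_CBS];
static size_t demo_cb_count;

/* Create the lock as recursive so its owner may acquire it again. */
static int demo_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) {
        return rc;
    }
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0) {
        rc = pthread_mutex_init(&demo_lock, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Registration takes the lock, even if the caller already holds it. */
static int demo_register_cb(demo_cb_t cb)
{
    int rc = -1;
    pthread_mutex_lock(&demo_lock);
    if (demo_cb_count < DEMO_MAX_CBS) {
        demo_cbs[demo_cb_count++] = cb;
        rc = 0;
    }
    pthread_mutex_unlock(&demo_lock);
    return rc;
}

/* Loading holds the lock while the plugin's install hook runs. */
static int demo_load_plugin(demo_install_t install)
{
    pthread_mutex_lock(&demo_lock);
    int rc = install();
    pthread_mutex_unlock(&demo_lock);
    return rc;
}

/* A toy plugin: its install hook registers one (empty) callback. */
static void demo_on_exit(void)
{
}

static int demo_plugin_install(void)
{
    return demo_register_cb(demo_on_exit);
}
```

Calling ``demo_lock_init()`` and then
``demo_load_plugin(demo_plugin_install)`` completes normally here. With
a default, non-recursive mutex the nested ``pthread_mutex_lock`` in
``demo_register_cb`` would typically deadlock.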

As a result registering/unregistering callbacks is "slow", since it
takes a lock. But this is very infrequent; we want performance when
calling (or not calling) callbacks, not when registering them. Using
RCU is great for this.

We support the uninstallation of a plugin at any time (e.g. from
plugin callbacks). This allows plugins to remove themselves if they no
longer want to instrument the code. This operation is asynchronous
which means callbacks may still occur after the uninstall operation is
requested. The plugin isn't completely uninstalled until the safe work
has executed while all vCPUs are quiescent.

Example Plugins
===============

There are a number of plugins included with QEMU and you are
encouraged to contribute your own plugins upstream. There is a
``contrib/plugins`` directory where they can go. There are also some
basic plugins that are used to test and exercise the API during the
``make check-tcg`` target in ``tests/plugins``.

- tests/plugins/empty.c

Purely a test plugin for measuring the overhead of the plugins system
itself. Does no instrumentation.

- tests/plugins/bb.c

A very basic plugin which will measure execution in coarse terms as
each basic block is executed. By default the results are shown once
execution finishes::

  $ qemu-aarch64 -plugin tests/plugin/libbb.so \
      -d plugin ./tests/tcg/aarch64-linux-user/sha1
  SHA1=15dd99a1991e0b3826fede3deffc1feba42278e6
  bb's: 2277338, insns: 158483046

Behaviour can be tweaked with the following arguments:

  * inline=true|false

  Use faster inline addition of a single counter. Not per-cpu and not
  thread safe.

  * idle=true|false

  Dump the current execution stats whenever the guest vCPU idles.

- tests/plugins/insn.c

This is a basic instruction level instrumentation which can count the
number of instructions executed on each core/thread::

  $ qemu-aarch64 -plugin tests/plugin/libinsn.so \
      -d plugin ./tests/tcg/aarch64-linux-user/threadcount
  Created 10 threads
  Done
  cpu 0 insns: 46765
  cpu 1 insns: 3694
  cpu 2 insns: 3694
  cpu 3 insns: 2994
  cpu 4 insns: 1497
  cpu 5 insns: 1497
  cpu 6 insns: 1497
  cpu 7 insns: 1497
  total insns: 63135

Behaviour can be tweaked with the following arguments:

  * inline=true|false

  Use faster inline addition of a single counter. Not per-cpu and not
  thread safe.

  * sizes=true|false

  Give a summary of the instruction sizes for the execution.

  * match=<string>

  Only instrument instructions matching the string prefix. Will show
  some basic stats including how many instructions have executed since
  the last execution. For example::

    $ qemu-aarch64 -plugin tests/plugin/libinsn.so,match=bl \
        -d plugin ./tests/tcg/aarch64-linux-user/sha512-vector
    ...
    0x40069c, 'bl #0x4002b0', 10 hits, 1093 match hits, Δ+1257 since last match, 98 avg insns/match
    0x4006ac, 'bl #0x403690', 10 hits, 1094 match hits, Δ+47 since last match, 98 avg insns/match
    0x4037fc, 'bl #0x4002b0', 18 hits, 1095 match hits, Δ+22 since last match, 98 avg insns/match
    0x400720, 'bl #0x403690', 10 hits, 1096 match hits, Δ+58 since last match, 98 avg insns/match
    0x4037fc, 'bl #0x4002b0', 19 hits, 1097 match hits, Δ+22 since last match, 98 avg insns/match
    0x400730, 'bl #0x403690', 10 hits, 1098 match hits, Δ+33 since last match, 98 avg insns/match
    0x4037ac, 'bl #0x4002b0', 12 hits, 1099 match hits, Δ+20 since last match, 98 avg insns/match
    ...

For more detailed execution tracing see the ``execlog`` plugin for
other options.
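
The bookkeeping behind a ``match`` style report (match hits, the delta
since the last match and an average) can be sketched in standalone C.
This is only an illustration of prefix matching with running
statistics, not the code of the ``insn`` plugin; ``MatchStats``,
``match_observe`` and ``match_average`` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Running statistics for one prefix; zero-initialise before use. */
typedef struct {
    const char *prefix;         /* disassembly prefix to look for */
    uint64_t match_hits;        /* number of matching instructions */
    uint64_t insns_since_match; /* instructions seen since last match */
    uint64_t last_delta;        /* gap recorded at the latest match */
    uint64_t total_delta;       /* sum of all recorded gaps */
} MatchStats;

/*
 * Account for one executed instruction given its disassembly text.
 * Returns true when the text starts with the configured prefix.
 */
static bool match_observe(MatchStats *s, const char *disas)
{
    s->insns_since_match++;
    if (strncmp(disas, s->prefix, strlen(s->prefix)) != 0) {
        return false;
    }
    s->match_hits++;
    s->last_delta = s->insns_since_match;
    s->total_delta += s->insns_since_match;
    s->insns_since_match = 0;
    return true;
}

/* Average number of instructions per match (integer division). */
static uint64_t match_average(const MatchStats *s)
{
    return s->match_hits ? s->total_delta / s->match_hits : 0;
}
```

Note that a plain prefix comparison like this treats ``blr x30`` as a
match for ``bl``, so a short prefix selects every mnemonic that begins
with it.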

- tests/plugins/mem.c

Basic instruction level memory instrumentation::

  $ qemu-aarch64 -plugin tests/plugin/libmem.so,inline=true \
      -d plugin ./tests/tcg/aarch64-linux-user/sha1
  SHA1=15dd99a1991e0b3826fede3deffc1feba42278e6
  inline mem accesses: 79525013

Behaviour can be tweaked with the following arguments:

  * inline=true|false

  Use faster inline addition of a single counter. Not per-cpu and not
  thread safe.

  * callback=true|false

  Use callbacks on each memory instrumentation.

  * hwaddr=true|false

  Count IO accesses (only for system emulation).

- tests/plugins/syscall.c

A basic syscall tracing plugin. This only works for user-mode. By
default it will give a summary of syscall stats at the end of the
run::

  $ qemu-aarch64 -plugin tests/plugin/libsyscall.so \
      -d plugin ./tests/tcg/aarch64-linux-user/threadcount
  Created 10 threads
  Done
  syscall no.  calls  errors
  226          12     0
  99           11     11
  115          11     0
  222          11     0
  93           10     0
  220          10     0
  233          10     0
  215          8      0
  214          4      0
  134          2      0
  64           2      0
  96           1      0
  94           1      0
  80           1      0
  261          1      0
  78           1      0
  160          1      0
  135          1      0

- contrib/plugins/hotblocks.c

The hotblocks plugin allows you to examine where the hot paths of
execution are in your program. Once the program has finished you will
get a sorted list of blocks reporting the starting PC, translation
count, number of instructions and execution count. This will work best
with linux-user execution as system emulation tends to generate
re-translations as blocks from different programs get swapped in and
out of system memory.

If your program is single-threaded you can use the ``inline`` option for
slightly faster (but not thread safe) counters.
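
The trade-off between a plain counter and a thread safe one can be
sketched with C11 atomics. This is an analogy only, not what QEMU emits
for inline operations: ``count_plain`` performs an ordinary
read-modify-write that can lose updates when several threads race,
while ``count_atomic`` cannot. All names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Plain counter: cheap, but concurrent increments may be lost. */
static uint64_t plain_count;

/* Atomic counter: every increment is applied exactly once. */
static _Atomic uint64_t atomic_count;

/* Ordinary load, add and store; only safe with a single thread. */
static void count_plain(void)
{
    plain_count++;
}

/* Indivisible increment; relaxed ordering is enough for a tally. */
static void count_atomic(void)
{
    atomic_fetch_add_explicit(&atomic_count, 1, memory_order_relaxed);
}

/* Read the atomic tally, e.g. when printing a final summary. */
static uint64_t read_atomic(void)
{
    return atomic_load_explicit(&atomic_count, memory_order_relaxed);
}
```

In a single-threaded run both counters end up with the same value; they
can only diverge once more than one thread is counting.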

Example::

  $ qemu-aarch64 \
    -plugin contrib/plugins/libhotblocks.so -d plugin \
     ./tests/tcg/aarch64-linux-user/sha1
  SHA1=15dd99a1991e0b3826fede3deffc1feba42278e6
  collected 903 entries in the hash table
  pc, tcount, icount, ecount
  0x0000000041ed10, 1, 5, 66087
  0x000000004002b0, 1, 4, 66087
  ...

- contrib/plugins/hotpages.c

Similar to hotblocks but this time tracks memory accesses::

  $ qemu-aarch64 \
    -plugin contrib/plugins/libhotpages.so -d plugin \
     ./tests/tcg/aarch64-linux-user/sha1
  SHA1=15dd99a1991e0b3826fede3deffc1feba42278e6
  Addr, RCPUs, Reads, WCPUs, Writes
  0x000055007fe000, 0x0001, 31747952, 0x0001, 8835161
  0x000055007ff000, 0x0001, 29001054, 0x0001, 8780625
  0x00005500800000, 0x0001, 687465, 0x0001, 335857
  0x0000000048b000, 0x0001, 130594, 0x0001, 355
  0x0000000048a000, 0x0001, 1826, 0x0001, 11

The hotpages plugin can be configured using the following arguments:

  * sortby=reads|writes|address

  Log the data sorted by either the number of reads, the number of writes, or
  memory address. (Default: entries are sorted by the sum of reads and writes)

  * io=on

  Track IO addresses. Only relevant to full system emulation. (Default: off)

  * pagesize=N

  The page size used. (Default: N = 4096)

- contrib/plugins/howvec.c

This is an instruction classifier so can be used to count different
types of instructions. It has a number of options to refine which get
counted. You can give a value to the ``count`` argument for a class of
instructions to break it down fully, so for example to see all the system
register accesses::

  $ qemu-system-aarch64 $(QEMU_ARGS) \
    -append "root=/dev/sda2 systemd.unit=benchmark.service" \
    -smp 4 -plugin ./contrib/plugins/libhowvec.so,count=sreg -d plugin

which will lead to a sorted list after the class breakdown::

  Instruction Classes:
  Class:   UDEF                   not counted
  Class:   SVE                    (68 hits)
  Class:   PCrel addr             (47789483 hits)
  Class:   Add/Sub (imm)          (192817388 hits)
  Class:   Logical (imm)          (93852565 hits)
  Class:   Move Wide (imm)        (76398116 hits)
  Class:   Bitfield               (44706084 hits)
  Class:   Extract                (5499257 hits)
  Class:   Cond Branch (imm)      (147202932 hits)
  Class:   Exception Gen          (193581 hits)
  Class:   NOP                    not counted
  Class:   Hints                  (6652291 hits)
  Class:   Barriers               (8001661 hits)
  Class:   PSTATE                 (1801695 hits)
  Class:   System Insn            (6385349 hits)
  Class:   System Reg             counted individually
  Class:   Branch (reg)           (69497127 hits)
  Class:   Branch (imm)           (84393665 hits)
  Class:   Cmp & Branch           (110929659 hits)
  Class:   Tst & Branch           (44681442 hits)
  Class:   AdvSimd ldstmult       (736 hits)
  Class:   ldst excl              (9098783 hits)
  Class:   Load Reg (lit)         (87189424 hits)
  Class:   ldst noalloc pair      (3264433 hits)
  Class:   ldst pair              (412526434 hits)
  Class:   ldst reg (imm)         (314734576 hits)
  Class:   Loads & Stores         (2117774 hits)
  Class:   Data Proc Reg          (223519077 hits)
  Class:   Scalar FP              (31657954 hits)
  Individual Instructions:
  Instr: mrs x0, sp_el0           (2682661 hits)  (op=0xd5384100/  System Reg)
  Instr: mrs x1, tpidr_el2        (1789339 hits)  (op=0xd53cd041/  System Reg)
  Instr: mrs x2, tpidr_el2        (1513494 hits)  (op=0xd53cd042/  System Reg)
  Instr: mrs x0, tpidr_el2        (1490823 hits)  (op=0xd53cd040/  System Reg)
  Instr: mrs x1, sp_el0           (933793 hits)   (op=0xd5384101/  System Reg)
  Instr: mrs x2, sp_el0           (699516 hits)   (op=0xd5384102/  System Reg)
  Instr: mrs x4, tpidr_el2        (528437 hits)   (op=0xd53cd044/  System Reg)
  Instr: mrs x30, ttbr1_el1       (480776 hits)   (op=0xd538203e/  System Reg)
  Instr: msr ttbr1_el1, x30       (480713 hits)   (op=0xd518203e/  System Reg)
  Instr: msr vbar_el1, x30        (480671 hits)   (op=0xd518c01e/  System Reg)
  ...

To find the argument shorthand for the class you need to examine the
source code of the plugin at the moment, specifically the ``*opt``
argument in the InsnClassExecCount tables.

- contrib/plugins/lockstep.c

This is a debugging tool for developers who want to find out when and
where execution diverges after a subtle change to TCG code generation.
It is not an exact science and results are likely to be mixed once
asynchronous events are introduced. While the use of ``-icount`` can
introduce determinism to the execution flow, it doesn't always follow
that the translation sequence will be exactly the same. Typically this
is caused by a timer firing to service the GUI causing a block to end
early. However in some cases it has proved to be useful in pointing
people at roughly where execution diverges. The only argument you need
for the plugin is a path for the socket the two instances will
communicate over::

  $ qemu-system-sparc -monitor none -parallel none \
    -net none -M SS-20 -m 256 -kernel day11/zImage.elf \
    -plugin ./contrib/plugins/liblockstep.so,sockpath=lockstep-sparc.sock \
    -d plugin,nochain

which will eventually report::

  qemu-system-sparc: warning: nic lance.0 has no peer
  @ 0x000000ffd06678 vs 0x000000ffd001e0 (2/1 since last)
  @ 0x000000ffd07d9c vs 0x000000ffd06678 (3/1 since last)
  Δ insn_count @ 0x000000ffd07d9c (809900609) vs 0x000000ffd06678 (809900612)
    previously @ 0x000000ffd06678/10 (809900609 insns)
    previously @ 0x000000ffd001e0/4 (809900599 insns)
    previously @ 0x000000ffd080ac/2 (809900595 insns)
    previously @ 0x000000ffd08098/5 (809900593 insns)
    previously @ 0x000000ffd080c0/1 (809900588 insns)

- contrib/plugins/hwprofile.c

The hwprofile tool can only be used with system emulation and allows
the user to see what hardware is accessed and how often. It has a
number of options:

 * track=read or track=write

 By default the plugin tracks both reads and writes. You can use one
 of these options to limit the tracking to just one class of accesses.

 * source

 Will include a detailed breakdown of which guest PC made the
 access. Not compatible with the pattern option. Example output::

   cirrus-low-memory @ 0xfffffd00000a0000
    pc:fffffc0000005cdc, 1, 256
    pc:fffffc0000005ce8, 1, 256
    pc:fffffc0000005cec, 1, 256

 * pattern

 Instead break down the accesses based on the offset into the HW
 region. This can be useful for seeing the most used registers of a
 device. Example output::

   pci0-conf @ 0xfffffd01fe000000
     off:00000004, 1, 1
     off:00000010, 1, 3
     off:00000014, 1, 3
     off:00000018, 1, 2
     off:0000001c, 1, 2
     off:00000020, 1, 2
     ...

- contrib/plugins/execlog.c

The execlog tool traces executed instructions with memory access. It can be used
for debugging and security analysis purposes.
Please be aware that this will generate a lot of output.
523307ce0aaSAlexandre Iooss 524b7855bf6SAlex BennéeThe plugin needs default argument:: 525307ce0aaSAlexandre Iooss 5261d0603a9SAlex Bennée $ qemu-system-arm $(QEMU_ARGS) \ 527307ce0aaSAlexandre Iooss -plugin ./contrib/plugins/libexeclog.so -d plugin 528307ce0aaSAlexandre Iooss 529307ce0aaSAlexandre Ioosswhich will output an execution trace following this structure:: 530307ce0aaSAlexandre Iooss 531307ce0aaSAlexandre Iooss # vCPU, vAddr, opcode, disassembly[, load/store, memory addr, device]... 532307ce0aaSAlexandre Iooss 0, 0xa12, 0xf8012400, "movs r4, #0" 533307ce0aaSAlexandre Iooss 0, 0xa14, 0xf87f42b4, "cmp r4, r6" 534307ce0aaSAlexandre Iooss 0, 0xa16, 0xd206, "bhs #0xa26" 535307ce0aaSAlexandre Iooss 0, 0xa18, 0xfff94803, "ldr r0, [pc, #0xc]", load, 0x00010a28, RAM 536307ce0aaSAlexandre Iooss 0, 0xa1a, 0xf989f000, "bl #0xd30" 537307ce0aaSAlexandre Iooss 0, 0xd30, 0xfff9b510, "push {r4, lr}", store, 0x20003ee0, RAM, store, 0x20003ee4, RAM 538307ce0aaSAlexandre Iooss 0, 0xd32, 0xf9893014, "adds r0, #0x14" 539307ce0aaSAlexandre Iooss 0, 0xd34, 0xf9c8f000, "bl #0x10c8" 540307ce0aaSAlexandre Iooss 0, 0x10c8, 0xfff96c43, "ldr r3, [r0, #0x44]", load, 0x200000e4, RAM 5414c125f3bSMahmoud Mandour 542b7855bf6SAlex Bennéethe output can be filtered to only track certain instructions or 5431d0603a9SAlex Bennéeaddresses using the ``ifilter`` or ``afilter`` options. You can stack the 544b7855bf6SAlex Bennéearguments if required:: 545b7855bf6SAlex Bennée 5461d0603a9SAlex Bennée $ qemu-system-arm $(QEMU_ARGS) \ 547b7855bf6SAlex Bennée -plugin ./contrib/plugins/libexeclog.so,ifilter=st1w,afilter=0x40001808 -d plugin 548b7855bf6SAlex Bennée 549af6e4e0aSAlex BennéeThis plugin can also dump registers when they change value. Specify the name of the 550af6e4e0aSAlex Bennéeregisters with multiple ``reg`` options. 
You can also use glob style matching if you wish::

  $ qemu-system-arm $(QEMU_ARGS) \
    -plugin ./contrib/plugins/libexeclog.so,reg=\*_el2,reg=sp -d plugin

Be aware that each additional register to check will slow down
execution quite considerably. You can reduce the number of register
checks done by using the ``rdisas`` option. This will only instrument
instructions that mention the registers in question in disassembly.
This is not foolproof as some instructions implicitly change
registers. You can use the ``ifilter`` option to catch these cases::

  $ qemu-system-arm $(QEMU_ARGS) \
    -plugin ./contrib/plugins/libexeclog.so,ifilter=msr,ifilter=blr,reg=x30,reg=\*_el1,rdisas=on

- contrib/plugins/cache.c

Cache modelling plugin that measures the performance of a given L1 cache
configuration, and optionally a unified L2 per-core cache when a given working
set is run::

  $ qemu-x86_64 -plugin ./contrib/plugins/libcache.so \
      -d plugin -D cache.log ./tests/tcg/x86_64-linux-user/float_convs

will report the following::

  core #, data accesses, data misses, dmiss rate, insn accesses, insn misses, imiss rate
  0       996695         508          0.0510%     2642799        18617        0.7044%

  address, data misses, instruction
  0x424f1e (_int_malloc), 109, movq %rax, 8(%rcx)
  0x41f395 (_IO_default_xsputn), 49, movb %dl, (%rdi, %rax)
  0x42584d (ptmalloc_init.part.0), 33, movaps %xmm0, (%rax)
  0x454d48 (__tunables_init), 20, cmpb $0, (%r8)
  ...

  address, fetch misses, instruction
  0x4160a0 (__vfprintf_internal), 744, movl $1, %ebx
  0x41f0a0 (_IO_setb), 744, endbr64
  0x415882 (__vfprintf_internal), 744, movq %r12, %rdi
  0x4268a0 (__malloc), 696, andq $0xfffffffffffffff0, %rax
  ...

The plugin has a number of arguments, all of which are optional:

  * limit=N

  Print the top N icache and dcache thrashing instructions along with their
  address, number of misses, and disassembly. (default: 32)

  * icachesize=N
  * iblksize=B
  * iassoc=A

  Instruction cache configuration arguments. They specify the cache size, block
  size, and associativity of the instruction cache, respectively.
  (default: N = 16384, B = 64, A = 8)

  * dcachesize=N
  * dblksize=B
  * dassoc=A

  Data cache configuration arguments. They specify the cache size, block size,
  and associativity of the data cache, respectively.
  (default: N = 16384, B = 64, A = 8)

  * evict=POLICY

  Sets the eviction policy to POLICY. Available policies are: :code:`lru`,
  :code:`fifo`, and :code:`rand`. The plugin will use the specified policy for
  both instruction and data caches. (default: POLICY = :code:`lru`)

  * cores=N

  Sets the number of cores for which we maintain separate icache and dcache.
  (default: for linux-user, N = 1, for full system emulation: N = cores
  available to guest)

  * l2=on

  Simulates a unified L2 cache (stores blocks for both instructions and data)
  using the default L2 configuration (cache size = 2MB, associativity = 16-way,
  block size = 64B).

  * l2cachesize=N
  * l2blksize=B
  * l2assoc=A

  L2 cache configuration arguments. They specify the cache size, block size, and
  associativity of the L2 cache, respectively. Setting any of the L2
  configuration arguments implies ``l2=on``.
  (default: N = 2097152 (2MB), B = 64, A = 16)

Plugin API
==========

The following API is generated from the inline documentation in
``include/qemu/qemu-plugin.h``.
Please ensure any updates to the API
include the full kernel-doc annotations.

.. kernel-doc:: include/qemu/qemu-plugin.h