..
  Copyright (c) 2020, Linaro Limited
  Written by Alex Bennée


========================
TCG Instruction Counting
========================

TCG has long supported a feature known as icount which allows for
instruction counting during execution. This should not be confused
with cycle-accurate emulation: QEMU does not attempt to emulate how
long an instruction would take on real hardware. That is a job for
other more detailed (and slower) tools that simulate the rest of a
micro-architecture.

This feature is only available for system emulation and is
incompatible with multi-threaded TCG. It can be used to better align
execution time with wall-clock time so a "slow" device doesn't run too
fast on modern hardware. It can also provide a degree of
deterministic execution and is an essential part of the record/replay
support in QEMU.

Core Concepts
=============

At its heart icount is simply a count of executed instructions which
is stored in the TimersState of QEMU's timer sub-system. The number of
executed instructions can then be used to calculate QEMU_CLOCK_VIRTUAL
which represents the amount of elapsed time in the system since
execution started.
Depending on the icount mode this may either be a
fixed number of ns per instruction or adjusted as execution continues
to keep wall-clock time and virtual time in sync.

To be able to calculate the number of executed instructions the
translator starts by allocating a budget of instructions to be
executed. The budget of instructions is limited by how long it will be
until the next timer expires. We store this budget as part of a
vCPU icount_decr field which is shared with the machinery for handling
cpu_exit(). The whole field is checked at the start of every
translated block and will cause a return to the outer loop to deal
with whatever caused the exit.

In the case of icount, before the flag is checked we subtract the
number of instructions the translation block would execute. If this
would cause the instruction budget to go negative we exit the main
loop and generate a new translation block with exactly the right
number of instructions to take the budget to 0, meaning whatever timer
was due to expire will expire exactly when we exit the main run loop.

Dealing with MMIO
-----------------

While we can adjust the instruction budget for known events like timer
expiry we cannot do the same for MMIO. Every load/store we execute
might potentially trigger an I/O event, at which point we will need an
up-to-date and accurate reading of the icount number.
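
To see why a mid-block access is a problem, the budget accounting can
be sketched as a toy model in plain C. This is only an illustration
under simplifying assumptions: ToyCPU and every toy_* function are
hypothetical names invented here, not QEMU code, and the real budget
lives in the vCPU icount_decr field. Because the whole block is charged
on entry, a reading taken part-way through the block already includes
instructions that have not run yet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy state, not a QEMU structure. */
typedef struct {
    int64_t slice;   /* budget handed out when this slice started */
    int64_t budget;  /* instructions left until the next timer expires */
} ToyCPU;

/*
 * Length of the block that may run: the full block if it fits, otherwise
 * a shortened block that takes the budget to exactly 0.
 */
static int64_t toy_clamp_tb(const ToyCPU *cpu, int64_t tb_insns)
{
    return tb_insns <= cpu->budget ? tb_insns : cpu->budget;
}

/* Block entry: the whole block is charged before any of it executes. */
static void toy_enter_tb(ToyCPU *cpu, int64_t tb_insns)
{
    assert(tb_insns <= cpu->budget);
    cpu->budget -= tb_insns;
}

/* Instruction count for this slice as implied by the budget alone. */
static int64_t toy_charged(const ToyCPU *cpu)
{
    return cpu->slice - cpu->budget;
}

/*
 * An I/O access was hit after `done` instructions of a tb_insns-long
 * block: hand the un-executed instructions back to the budget so the
 * count matches what has really run.
 */
static void toy_restore_unexecuted(ToyCPU *cpu, int64_t tb_insns,
                                   int64_t done)
{
    assert(done >= 0 && done <= tb_insns);
    cpu->budget += tb_insns - done;
}
```

In this model, entering a 6-instruction block with a slice of 10 makes
toy_charged() report 6 even if only 2 instructions have run. Calling
toy_restore_unexecuted(&cpu, 6, 2) brings the reading back to 2, which
is the correction that the first step below performs.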

To deal with this case, when an I/O access is made we:

  - restore un-executed instructions to the icount budget
  - re-compile a single [1]_ instruction block for the current PC
  - exit the cpu loop and execute the re-compiled block

.. [1] sometimes two instructions if dealing with delay slots

Other I/O operations
--------------------

MMIO isn't the only type of operation for which we might need a
correct and accurate clock. I/O port instructions and accesses to
system registers are the common examples here. These instructions have
to be handled by the individual translators, which know
which operations are I/O operations.

When the translator is handling an instruction of this kind:

* it must call gen_io_start() if icount is enabled, at some
  point before the generation of the code which actually does
  the I/O, using a code fragment similar to:

.. code:: c

    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
        gen_io_start();
    }

* it must end the TB immediately after this instruction