..
   Copyright (c) 2020, Linaro Limited
   Written by Alex Bennée


========================
TCG Instruction Counting
========================

TCG has long supported a feature known as icount which allows for
instruction counting during execution. This should not be confused
with cycle-accurate emulation: QEMU does not attempt to emulate how
long an instruction would take on real hardware. That is a job for
other more detailed (and slower) tools that simulate the rest of a
micro-architecture.

This feature is only available for system emulation and is
incompatible with multi-threaded TCG. It can be used to better align
execution time with wall-clock time so a "slow" device doesn't run too
fast on modern hardware. It can also provide a degree of
deterministic execution and is an essential part of the record/replay
support in QEMU.

Core Concepts
=============

At its heart icount is simply a count of executed instructions which
is stored in the TimersState of QEMU's timer sub-system. The number of
executed instructions can then be used to calculate QEMU_CLOCK_VIRTUAL
which represents the amount of elapsed time in the system since
execution started. Depending on the icount mode this may either be a
fixed number of ns per instruction or adjusted as execution continues
to keep wall-clock time and virtual time in sync.

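As a rough illustration of the fixed-rate case, the conversion from an
instruction count to virtual nanoseconds can be sketched as below. This
is a toy model, not QEMU's code: the ToyTimersState type, its fields and
toy_icount_to_ns() are hypothetical names, and it assumes the fixed rate
is a power of two (2^time_shift ns per instruction).

.. code:: c

    #include <stdint.h>

    /* Hypothetical stand-in for the relevant part of the timer state. */
    typedef struct {
        int64_t icount;      /* instructions executed so far */
        int     time_shift;  /* assumed: one instruction = 2^time_shift ns */
    } ToyTimersState;

    /* Virtual time in ns for the fixed-rate case. */
    static int64_t toy_icount_to_ns(const ToyTimersState *ts)
    {
        return ts->icount * ((int64_t)1 << ts->time_shift);
    }

With time_shift set to 3, for example, 1000 executed instructions would
correspond to 8000 ns of virtual time in this model.
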
To be able to calculate the number of executed instructions the
translator starts by allocating a budget of instructions to be
executed. The budget of instructions is limited by how long it will be
until the next timer will expire. We store this budget as part of a
vCPU icount_decr field which is shared with the machinery for handling
cpu_exit(). The whole field is checked at the start of every
translated block and, if it signals that an exit is needed, will cause
a return to the outer loop to deal with whatever caused the exit.

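One way to picture a field that is shared between the budget and the
exit machinery is sketched below. The layout is an assumption made for
illustration (a 16-bit budget in the low half, an exit flag in the high
half), and ToyIcountDecr and the toy_* helpers are hypothetical names
rather than QEMU's definitions. The point is only that a single check of
the combined value can cover both cases.

.. code:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical model of the shared per-vCPU field. */
    typedef struct {
        uint16_t low;   /* remaining instruction budget */
        uint16_t high;  /* set to 0xffff to request an exit */
    } ToyIcountDecr;

    /* Model of an exit request: only the high half is touched. */
    static void toy_request_exit(ToyIcountDecr *d)
    {
        d->high = 0xffff;
    }

    /* Check done at the start of a block: is the whole field "negative"? */
    static bool toy_must_exit(const ToyIcountDecr *d)
    {
        uint32_t whole = ((uint32_t)d->high << 16) | d->low;
        return (whole & UINT32_C(0x80000000)) != 0;
    }

In this model an exit request leaves the remaining budget in the low
half untouched, so the count is still available once the exit has been
handled.
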
In the case of icount, before the flag is checked we subtract the
number of instructions the translation block would execute. If this
would cause the instruction budget to go negative we exit the main
loop and generate a new translation block with exactly the right
number of instructions to take the budget to 0, meaning whatever timer
was due to expire will expire exactly when we exit the main run loop.

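The budget arithmetic described above can be sketched as follows. This
is a simplified model under stated assumptions, not the generated code:
toy_insns_to_run() and toy_charge() are hypothetical helpers that work
on plain integers.

.. code:: c

    #include <stdint.h>

    /*
     * Given the remaining budget and the size of the next translated
     * block, return how many instructions may run. A result smaller
     * than tb_insns means a shorter block of exactly that length has
     * to be generated.
     */
    static int32_t toy_insns_to_run(int32_t budget, int32_t tb_insns)
    {
        if (budget - tb_insns < 0) {
            return budget;   /* shorter block takes the budget to 0 */
        }
        return tb_insns;     /* the whole block fits in the budget */
    }

    /* Charge the instructions against the budget; return what is left. */
    static int32_t toy_charge(int32_t budget, int32_t insns)
    {
        return budget - insns;
    }

For instance, with 5 instructions left and a 20-instruction block, the
model asks for a 5-instruction block, after which the budget is exactly
0.
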
Dealing with MMIO
-----------------

While we can adjust the instruction budget for known events like timer
expiry we cannot do the same for MMIO. Every load/store we execute
might potentially trigger an I/O event, at which point we will need an
up-to-date and accurate reading of the icount number.

To deal with this case, when an I/O access is made we:

  - restore un-executed instructions to the icount budget
  - re-compile a single [1]_ instruction block for the current PC
  - exit the cpu loop and execute the re-compiled block

.. [1] sometimes two instructions if dealing with delay slots

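The first two steps can be modelled roughly as below. This is a sketch
under assumptions, not the actual implementation: ToyCpu and
toy_io_recompile() are hypothetical, the budget is assumed to have been
charged for the whole block up front, and the instruction doing the I/O
is counted as not yet executed.

.. code:: c

    #include <stdint.h>

    /* Hypothetical vCPU state: only the remaining budget is modelled. */
    typedef struct {
        int32_t budget;  /* already charged for the whole current block */
    } ToyCpu;

    /*
     * tb_insns:       instructions in the block charged up front
     * io_index:       0-based position of the instruction doing the I/O
     * has_delay_slot: non-zero if a delay slot instruction must follow
     * Returns the instruction count for the re-compiled block.
     */
    static int32_t toy_io_recompile(ToyCpu *cpu, int32_t tb_insns,
                                    int32_t io_index, int has_delay_slot)
    {
        /* Instructions from io_index onwards have not executed yet. */
        cpu->budget += tb_insns - io_index;
        return has_delay_slot ? 2 : 1;
    }

If a 10-instruction block was charged and the fourth instruction
(io_index 3) performs the I/O, 7 instructions are handed back to the
budget before the short block is executed.
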
Other I/O operations
--------------------

MMIO isn't the only type of operation for which we might need a
correct and accurate clock. I/O port instructions and accesses to
system registers are the common examples here. These instructions have
to be handled by the individual translators which have the knowledge
of which operations are I/O operations.

When the translator is handling an instruction of this kind:

* it must call gen_io_start() if icount is enabled, at some
  point before the generation of the code which actually does
  the I/O, using a code fragment similar to:

.. code:: c

    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
        gen_io_start();
    }

* it must end the TB immediately after this instruction