.. _tcg-ops-ref:

*******************************
TCG Intermediate Representation
*******************************

Introduction
============

TCG (Tiny Code Generator) began as a generic backend for a C compiler.
It was simplified to be used in QEMU.  It also has its roots in the
QOP code generator written by Paul Brook.

Definitions
===========

The TCG *target* is the architecture for which we generate the code.
It is of course not the same as the "target" of QEMU, which is the
emulated architecture.  As TCG started as a generic C backend used
for cross compiling, the assumption was that the TCG target might be
different from the host, although this is never the case for QEMU.

In this document, we use *guest* to specify what architecture we are
emulating; *target* always means the TCG target, the machine on which
we are running QEMU.

An operation with *undefined behavior* may result in a crash.

An operation with *unspecified behavior* shall not crash.  However,
the result may be one of several possibilities, and so may be considered
an *undefined result*.

Basic Blocks
============

A TCG *basic block* is a single entry, multiple exit region which
corresponds to a list of instructions terminated by a label, or
any branch instruction.

A TCG *extended basic block* is a single entry, multiple exit region
which corresponds to a list of instructions terminated by a label or
an unconditional branch.  Specifically, an extended basic block is
a sequence of basic blocks connected by the fall-through paths of
zero or more conditional branch instructions.

Operations
==========

TCG instructions or *ops* operate on TCG *variables*, both of which
are strongly typed.  Each instruction has a fixed number of output
variable operands, input variable operands and constant operands.
Vector instructions have a field specifying the element size within
the vector.  The notable exception is the call instruction which has
a variable number of outputs and inputs.

In the textual form, output operands usually come first, followed by
input operands, followed by constant operands. The output type is
included in the instruction name. Constants are prefixed with a '$'.

.. code-block:: none

   add_i32 t0, t1, t2    /* (t0 <- t1 + t2) */

Variables
=========

* ``TEMP_FIXED``

  There is one TCG *fixed global* variable, ``cpu_env``, which is
  live in all translation blocks, and holds a pointer to ``CPUArchState``.
  This variable is held in a host cpu register at all times in all
  translation blocks.

* ``TEMP_GLOBAL``

  A TCG *global* is a variable which is live in all translation blocks,
  and corresponds to a memory location that is within ``CPUArchState``.
  These may be specified as an offset from ``cpu_env``, in which case
  they are called *direct globals*, or may be specified as an offset
  from a direct global, in which case they are called *indirect globals*.
  Even indirect globals should still reference memory within
  ``CPUArchState``.  All TCG globals are defined during
  ``TCGCPUOps.initialize``, before any translation blocks are generated.

* ``TEMP_CONST``

  A TCG *constant* is a variable which is live throughout the entire
  translation block, and contains a constant value.  These variables
  are allocated on demand during translation and are hashed so that
  there is exactly one variable holding a given value.

* ``TEMP_TB``

  A TCG *translation block temporary* is a variable which is live
  throughout the entire translation block, but dies on any exit.
  These temporaries are allocated explicitly during translation.

* ``TEMP_EBB``

  A TCG *extended basic block temporary* is a variable which is live
  throughout an extended basic block, but dies on any exit.
  These temporaries are allocated explicitly during translation.

Types
=====

* ``TCG_TYPE_I32``

  A 32-bit integer.

* ``TCG_TYPE_I64``

  A 64-bit integer.  For 32-bit hosts, such variables are split into a pair
  of variables with ``type=TCG_TYPE_I32`` and ``base_type=TCG_TYPE_I64``.
  The ``temp_subindex`` for each indicates where it falls within the
  host-endian representation.

* ``TCG_TYPE_PTR``

  An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
  of a pointer for the host.

* ``TCG_TYPE_REG``

  An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
  of the integer registers for the host.  This may be larger
  than ``TCG_TYPE_PTR`` depending on the host ABI.

* ``TCG_TYPE_I128``

  A 128-bit integer.  For all hosts, such variables are split into a number
  of variables with ``type=TCG_TYPE_REG`` and ``base_type=TCG_TYPE_I128``.
  The ``temp_subindex`` for each indicates where it falls within the
  host-endian representation.

* ``TCG_TYPE_V64``

  A 64-bit vector.  This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v64``.

* ``TCG_TYPE_V128``

  A 128-bit vector.  This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v128``.

* ``TCG_TYPE_V256``

  A 256-bit vector.  This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v256``.

Helpers
=======

Helpers are registered in a guest-specific ``helper.h``,
which is processed to generate ``tcg_gen_helper_*`` functions.
With these functions it is possible to call a function taking
i32, i64, i128 or pointer types.

By default, before calling a helper, all globals are stored at their
canonical location.  By default, the helper is allowed to modify the
CPU state (including the state represented by tcg globals)
or may raise an exception.  This default can be overridden using the
following function modifiers:

* ``TCG_CALL_NO_WRITE_GLOBALS``

  The helper does not modify any globals, but may read them.
  Globals will be saved to their canonical location before calling helpers,
  but need not be reloaded afterwards.

* ``TCG_CALL_NO_READ_GLOBALS``

  The helper does not read globals, either directly or via an exception.
  They will not be saved to their canonical locations before calling
  the helper.  This implies ``TCG_CALL_NO_WRITE_GLOBALS``.

* ``TCG_CALL_NO_SIDE_EFFECTS``

  The call to the helper function may be removed if the return value is
  not used.  This means that it may not modify any CPU state nor may it
  raise an exception.

Code Optimizations
==================

When generating instructions, you can count on at least the following
optimizations:

- Single instructions are simplified, e.g.

  .. code-block:: none

     and_i32 t0, t0, $0xffffffff

  is suppressed.

- A liveness analysis is done at the basic block level. The
  information is used to suppress moves from a dead variable to
  another one. It is also used to remove instructions which compute
  dead results. The latter is especially useful for condition code
  optimization in QEMU.

  In the following example:

  .. code-block:: none

     add_i32 t0, t1, t2
     add_i32 t0, t0, $1
     mov_i32 t0, $1

  only the last instruction is kept.
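
The removal of dead results can be modelled with a single backward pass
over a basic block.  The following is a minimal sketch in plain C, not
QEMU's implementation: the ``ToyOp`` structure, the ``toy_dce`` function and
the encoding of temporaries as small integers (below 32) are hypothetical
and exist only for this illustration.  It also assumes that no op has side
effects.

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>

   /* Hypothetical toy op: one output, up to two inputs; -1 means "constant". */
   typedef struct {
       int dst;
       int src1;
       int src2;
   } ToyOp;

   /*
    * Walk the block backwards.  live_out is a bitmask of the temporaries
    * still needed after the block.  keep[i] is set if op i must be emitted.
    * Returns the number of ops kept.
    */
   static size_t toy_dce(const ToyOp *ops, size_t n, unsigned live_out,
                         bool *keep)
   {
       unsigned live = live_out;
       size_t kept = 0;

       for (size_t i = n; i-- > 0; ) {
           const ToyOp *op = &ops[i];

           if (!(live & (1u << op->dst))) {
               keep[i] = false;          /* result is dead: drop the op */
               continue;
           }
           keep[i] = true;
           kept++;
           live &= ~(1u << op->dst);     /* the output is redefined here */
           if (op->src1 >= 0) {
               live |= 1u << op->src1;   /* inputs are live above this op */
           }
           if (op->src2 >= 0) {
               live |= 1u << op->src2;
           }
       }
       return kept;
   }

Encoding the three-instruction example above as ``{0, 1, 2}``, ``{0, 0, -1}``
and ``{0, -1, -1}`` with ``t0`` live on exit, this sketch keeps only the
final ``mov``.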


Instruction Reference
=====================

Function call
-------------

.. list-table::

   * - call *<ret>* *<params>* ptr

     - |  call function 'ptr' (pointer type)
       |
       |  *<ret>* optional 32 bit or 64 bit return value
       |  *<params>* optional 32 bit or 64 bit parameters

Jumps/Labels
------------

.. list-table::

   * - set_label $label

     - | Define label 'label' at the current program point.

   * - br $label

     - | Jump to label.

   * - brcond_i32/i64 *t0*, *t1*, *cond*, *label*

     - | Conditional jump if *t0* *cond* *t1* is true. *cond* can be:
       |
       |   ``TCG_COND_EQ``
       |   ``TCG_COND_NE``
       |   ``TCG_COND_LT /* signed */``
       |   ``TCG_COND_GE /* signed */``
       |   ``TCG_COND_LE /* signed */``
       |   ``TCG_COND_GT /* signed */``
       |   ``TCG_COND_LTU /* unsigned */``
       |   ``TCG_COND_GEU /* unsigned */``
       |   ``TCG_COND_LEU /* unsigned */``
       |   ``TCG_COND_GTU /* unsigned */``
       |   ``TCG_COND_TSTEQ /* (t0 & t1) == 0 */``
       |   ``TCG_COND_TSTNE /* (t0 & t1) != 0 */``
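
The meaning of each condition can be written down as a small evaluator.
This is only a sketch for 32-bit operands: the ``ToyCond`` enumeration and
``toy_cond_i32`` are hypothetical stand-ins for the ``TCG_COND_*`` values
listed above and do not reproduce their numeric encoding.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Hypothetical stand-ins for the TCG_COND_* values. */
   typedef enum {
       COND_EQ, COND_NE,
       COND_LT, COND_GE, COND_LE, COND_GT,       /* signed */
       COND_LTU, COND_GEU, COND_LEU, COND_GTU,   /* unsigned */
       COND_TSTEQ, COND_TSTNE                    /* test under mask */
   } ToyCond;

   /* Evaluate "a cond b" for 32-bit operands. */
   static bool toy_cond_i32(uint32_t a, uint32_t b, ToyCond cond)
   {
       /* Reinterpret the same bits as signed (two's complement assumed). */
       int32_t sa = (int32_t)a;
       int32_t sb = (int32_t)b;

       switch (cond) {
       case COND_EQ:    return a == b;
       case COND_NE:    return a != b;
       case COND_LT:    return sa < sb;
       case COND_GE:    return sa >= sb;
       case COND_LE:    return sa <= sb;
       case COND_GT:    return sa > sb;
       case COND_LTU:   return a < b;
       case COND_GEU:   return a >= b;
       case COND_LEU:   return a <= b;
       case COND_GTU:   return a > b;
       case COND_TSTEQ: return (a & b) == 0;
       case COND_TSTNE: return (a & b) != 0;
       }
       return false;
   }

For example, ``0xffffffff`` compared with ``1`` satisfies the signed
``COND_LT`` (it reads as -1) but not the unsigned ``COND_LTU``.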

Arithmetic
----------

.. list-table::

   * - add_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* + *t2*

   * - sub_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* - *t2*

   * - neg_i32/i64 *t0*, *t1*

     - | *t0* = -*t1* (two's complement)

   * - mul_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* * *t2*

   * - div_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* / *t2* (signed)
       | Undefined behavior if division by zero or overflow.

   * - divu_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* / *t2* (unsigned)
       | Undefined behavior if division by zero.

   * - rem_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* % *t2* (signed)
       | Undefined behavior if division by zero or overflow.

   * - remu_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* % *t2* (unsigned)
       | Undefined behavior if division by zero.
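
Because the division ops have undefined behavior for some inputs, a front
end has to rule those inputs out before emitting them.  The predicates
below are a hedged sketch of that check for the 32-bit case; the function
names are hypothetical.  For signed division the only quotient that does
not fit in 32 bits is ``INT32_MIN / -1``.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* True if div_i32/rem_i32 on (a, b) avoids the undefined cases. */
   static bool toy_div_i32_is_defined(int32_t a, int32_t b)
   {
       if (b == 0) {
           return false;               /* division by zero */
       }
       if (a == INT32_MIN && b == -1) {
           return false;               /* quotient overflows int32_t */
       }
       return true;
   }

   /* True if divu_i32/remu_i32 on (a, b) avoids the undefined case. */
   static bool toy_divu_i32_is_defined(uint32_t a, uint32_t b)
   {
       (void)a;                        /* any dividend is acceptable */
       return b != 0;
   }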


Logical
-------

.. list-table::

   * - and_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* & *t2*

   * - or_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* | *t2*

   * - xor_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* ^ *t2*

   * - not_i32/i64 *t0*, *t1*

     - | *t0* = ~\ *t1*

   * - andc_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* & ~\ *t2*

   * - eqv_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* ^ *t2*), or equivalently, *t0* = *t1* ^ ~\ *t2*

   * - nand_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* & *t2*)

   * - nor_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* | *t2*)

   * - orc_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* | ~\ *t2*

   * - clz_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* ? clz(*t1*) : *t2*

   * - ctz_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* ? ctz(*t1*) : *t2*

   * - ctpop_i32/i64 *t0*, *t1*

     - | *t0* = number of bits set in *t1*
       |
       | With *ctpop* short for "count population", matching
       | the function name used in ``include/qemu/host-utils.h``.
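
The bit-counting ops differ from the usual C builtins in that ``clz`` and
``ctz`` take an explicit result for a zero input.  A portable sketch of the
32-bit semantics, using hypothetical ``toy_*`` names and simple loops rather
than any QEMU helper:

.. code-block:: c

   #include <stdint.h>

   /* clz_i32: count leading zeros of t1, or return t2 if t1 is zero. */
   static uint32_t toy_clz_i32(uint32_t t1, uint32_t t2)
   {
       uint32_t n = 0;

       if (t1 == 0) {
           return t2;
       }
       while (!(t1 & 0x80000000u)) {
           t1 <<= 1;
           n++;
       }
       return n;
   }

   /* ctz_i32: count trailing zeros of t1, or return t2 if t1 is zero. */
   static uint32_t toy_ctz_i32(uint32_t t1, uint32_t t2)
   {
       uint32_t n = 0;

       if (t1 == 0) {
           return t2;
       }
       while (!(t1 & 1u)) {
           t1 >>= 1;
           n++;
       }
       return n;
   }

   /* ctpop_i32: number of bits set in t1. */
   static uint32_t toy_ctpop_i32(uint32_t t1)
   {
       uint32_t n = 0;

       while (t1) {
           t1 &= t1 - 1;               /* clear the lowest set bit */
           n++;
       }
       return n;
   }

Passing 32 as *t2* gives the conventional "width of the type" answer for a
zero input, while other guests may want a different value.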


Shifts/Rotates
--------------

.. list-table::

   * - shl_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* << *t2*
       | Unspecified behavior if *t2* < 0 or *t2* >= 32 (resp 64)

   * - shr_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* >> *t2* (unsigned)
       | Unspecified behavior if *t2* < 0 or *t2* >= 32 (resp 64)

   * - sar_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* >> *t2* (signed)
       | Unspecified behavior if *t2* < 0 or *t2* >= 32 (resp 64)

   * - rotl_i32/i64 *t0*, *t1*, *t2*

     - | Rotation of *t2* bits to the left.
       | Unspecified behavior if *t2* < 0 or *t2* >= 32 (resp 64)

   * - rotr_i32/i64 *t0*, *t1*, *t2*

     - | Rotation of *t2* bits to the right.
       | Unspecified behavior if *t2* < 0 or *t2* >= 32 (resp 64)
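
For in-range counts, the arithmetic shift and the rotates can be modelled
in portable C as follows.  This is a sketch with hypothetical ``toy_*``
names; the caller is assumed to have reduced the count to 0..31 already,
since the ops leave out-of-range counts unspecified and the C shifts used
here would be undefined for them.

.. code-block:: c

   #include <stdint.h>

   /* sar_i32: arithmetic right shift, count in 0..31. */
   static int32_t toy_sar_i32(int32_t t1, unsigned t2)
   {
       uint32_t u = (uint32_t)t1 >> t2;

       if (t1 < 0 && t2 > 0) {
           u |= ~(UINT32_MAX >> t2);   /* refill the top t2 bits with 1s */
       }
       return (int32_t)u;
   }

   /* rotl_i32: rotate left, count in 0..31. */
   static uint32_t toy_rotl_i32(uint32_t t1, unsigned t2)
   {
       return (t1 << t2) | (t1 >> ((32 - t2) & 31));
   }

   /* rotr_i32: rotate right, count in 0..31. */
   static uint32_t toy_rotr_i32(uint32_t t1, unsigned t2)
   {
       return (t1 >> t2) | (t1 << ((32 - t2) & 31));
   }

The ``& 31`` keeps the complementary shift valid when the count is zero.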


Misc
----

.. list-table::

   * - mov_i32/i64 *t0*, *t1*

     - | *t0* = *t1*
       | Move *t1* to *t0* (both operands must have the same type).

   * - ext8s_i32/i64 *t0*, *t1*

       ext8u_i32/i64 *t0*, *t1*

       ext16s_i32/i64 *t0*, *t1*

       ext16u_i32/i64 *t0*, *t1*

       ext32s_i64 *t0*, *t1*

       ext32u_i64 *t0*, *t1*

     - | 8, 16 or 32 bit sign/zero extension (both operands must have the same type)

   * - bswap16_i32/i64 *t0*, *t1*, *flags*

     - | 16 bit byte swap on the low bits of a 32/64 bit input.
       |
       | If *flags* & ``TCG_BSWAP_IZ``, then *t1* is known to be zero-extended from bit 15.
       | If *flags* & ``TCG_BSWAP_OZ``, then *t0* will be zero-extended from bit 15.
       | If *flags* & ``TCG_BSWAP_OS``, then *t0* will be sign-extended from bit 15.
       |
       | If neither ``TCG_BSWAP_OZ`` nor ``TCG_BSWAP_OS`` are set, then the bits of *t0* above bit 15 may contain any value.

   * - bswap32_i64 *t0*, *t1*, *flags*

     - | 32 bit byte swap on a 64-bit value.  The flags are the same as for bswap16,
         except they apply from bit 31 instead of bit 15.

   * - bswap32_i32 *t0*, *t1*, *flags*

       bswap64_i64 *t0*, *t1*, *flags*

     - | 32/64 bit byte swap. The flags are ignored, but still present
         for consistency with the other bswap opcodes.

   * - discard_i32/i64 *t0*

     - | Indicate that the value of *t0* won't be used later. It is useful to
         force dead code elimination.

   * - deposit_i32/i64 *dest*, *t1*, *t2*, *pos*, *len*

     - | Deposit *t2* as a bitfield into *t1*, placing the result in *dest*.
       |
       | The bitfield is described by *pos*/*len*, which are immediate values:
       |
       |     *len* - the length of the bitfield
       |     *pos* - the position of the first bit, counting from the LSB
       |
       | For example, "deposit_i32 dest, t1, t2, 8, 4" indicates a 4-bit field
         at bit 8. This operation would be equivalent to
       |
       |     *dest* = (*t1* & ~0x0f00) | ((*t2* << 8) & 0x0f00)

   * - extract_i32/i64 *dest*, *t1*, *pos*, *len*

       sextract_i32/i64 *dest*, *t1*, *pos*, *len*

     - | Extract a bitfield from *t1*, placing the result in *dest*.
       |
       | The bitfield is described by *pos*/*len*, which are immediate values,
         as above for deposit.  For extract_*, the result will be extended
         to the left with zeros; for sextract_*, the result will be extended
         to the left with copies of the bitfield sign bit at *pos* + *len* - 1.
       |
       | For example, "sextract_i32 dest, t1, 8, 4" indicates a 4-bit field
         at bit 8. This operation would be equivalent to
       |
       |    *dest* = (*t1* << 20) >> 28
       |
       | (using an arithmetic right shift).

   * - extract2_i32/i64 *dest*, *t1*, *t2*, *pos*

     - | For N = {32,64}, extract an N-bit quantity from the concatenation
         of *t2*:*t1*, beginning at *pos*. The tcg_gen_extract2_{i32,i64} expander
         accepts 0 <= *pos* <= N as inputs. The backend code generator will
         not see either 0 or N as inputs for these opcodes.

   * - extrl_i64_i32 *t0*, *t1*

     - | For 64-bit hosts only, extract the low 32-bits of input *t1* and place it
         into 32-bit output *t0*.  Depending on the host, this may be a simple move,
         or may require additional canonicalization.

   * - extrh_i64_i32 *t0*, *t1*

     - | For 64-bit hosts only, extract the high 32-bits of input *t1* and place it
         into 32-bit output *t0*.  Depending on the host, this may be a simple shift,
         or may require additional canonicalization.
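
The bitfield ops above generalise the fixed-mask formulas given in their
descriptions.  The following sketch models the 32-bit variants in plain C.
The ``toy_*`` names are hypothetical, and the code assumes 1 <= *len*,
*pos* + *len* <= 32, and 1 <= *pos* <= 31 for the ``extract2`` model (the
range a backend would see).

.. code-block:: c

   #include <stdint.h>

   /* Mask of the len low bits, valid for 1 <= len <= 32. */
   static uint32_t toy_mask32(unsigned len)
   {
       return UINT32_MAX >> (32 - len);
   }

   /* deposit_i32: replace bits [pos, pos + len) of t1 with the low bits of t2. */
   static uint32_t toy_deposit_i32(uint32_t t1, uint32_t t2,
                                   unsigned pos, unsigned len)
   {
       uint32_t mask = toy_mask32(len) << pos;

       return (t1 & ~mask) | ((t2 << pos) & mask);
   }

   /* extract_i32: zero-extended field [pos, pos + len) of t1. */
   static uint32_t toy_extract_i32(uint32_t t1, unsigned pos, unsigned len)
   {
       return (t1 >> pos) & toy_mask32(len);
   }

   /* sextract_i32: same field, sign-extended from bit pos + len - 1. */
   static int32_t toy_sextract_i32(uint32_t t1, unsigned pos, unsigned len)
   {
       uint32_t field = toy_extract_i32(t1, pos, len);
       uint32_t sign = 1u << (len - 1);

       /* Flip the sign bit, then subtract it to propagate it upwards. */
       return (int32_t)((field ^ sign) - sign);
   }

   /* extract2_i32: 32 bits of the pair t2:t1 starting at bit pos (1..31). */
   static uint32_t toy_extract2_i32(uint32_t t1, uint32_t t2, unsigned pos)
   {
       return (t1 >> pos) | (t2 << (32 - pos));
   }

With *pos* = 8 and *len* = 4, ``toy_deposit_i32`` reduces to the ``0x0f00``
formula shown for deposit, and ``toy_sextract_i32`` gives the same result as
the shift pair shown for sextract.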


Conditional moves
-----------------

.. list-table::

   * - setcond_i32/i64 *dest*, *t1*, *t2*, *cond*

     - | *dest* = (*t1* *cond* *t2*)
       |
       | Set *dest* to 1 if (*t1* *cond* *t2*) is true, otherwise set to 0.

   * - negsetcond_i32/i64 *dest*, *t1*, *t2*, *cond*

     - | *dest* = -(*t1* *cond* *t2*)
       |
       | Set *dest* to -1 if (*t1* *cond* *t2*) is true, otherwise set to 0.

   * - movcond_i32/i64 *dest*, *c1*, *c2*, *v1*, *v2*, *cond*

     - | *dest* = (*c1* *cond* *c2* ? *v1* : *v2*)
       |
       | Set *dest* to *v1* if (*c1* *cond* *c2*) is true, otherwise set to *v2*.
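
The three ops are closely related: the all-ones value produced by
``negsetcond`` is exactly the mask needed for a branch-free select.  The
sketch below illustrates that relationship in plain C.  The ``toy_*`` names
are hypothetical, the comparison outcome is passed in as a ``bool`` to keep
the example short, and nothing here is meant to describe how a backend
actually lowers ``movcond``.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* setcond: 1 if the comparison holds, else 0. */
   static uint32_t toy_setcond_i32(bool cond_holds)
   {
       return cond_holds ? 1 : 0;
   }

   /* negsetcond: -1 (all bits set) if the comparison holds, else 0. */
   static uint32_t toy_negsetcond_i32(bool cond_holds)
   {
       return cond_holds ? UINT32_MAX : 0;
   }

   /* movcond: v1 if the comparison holds, else v2, built from the mask. */
   static uint32_t toy_movcond_i32(bool cond_holds, uint32_t v1, uint32_t v2)
   {
       uint32_t mask = toy_negsetcond_i32(cond_holds);

       return (v1 & mask) | (v2 & ~mask);
   }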


Type conversions
----------------

.. list-table::

   * - ext_i32_i64 *t0*, *t1*

     - | Convert *t1* (32 bit) to *t0* (64 bit) with sign extension

   * - extu_i32_i64 *t0*, *t1*

     - | Convert *t1* (32 bit) to *t0* (64 bit) with zero extension

   * - trunc_i64_i32 *t0*, *t1*

     - | Truncate *t1* (64 bit) to *t0* (32 bit)

   * - concat_i32_i64 *t0*, *t1*, *t2*

     - | Construct *t0* (64-bit) taking the low half from *t1* (32 bit) and the high half
         from *t2* (32 bit).

   * - concat32_i64 *t0*, *t1*, *t2*

     - | Construct *t0* (64-bit) taking the low half from *t1* (64 bit) and the high half
         from *t2* (64 bit).
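
Each conversion corresponds to a one-line C expression.  The sketch below
uses hypothetical ``toy_*`` names.  For ``concat32_i64`` it assumes that
"low half from *t1*" means the low 32 bits of *t1* and that the low 32 bits
of *t2* become the high half of the result.

.. code-block:: c

   #include <stdint.h>

   /* ext_i32_i64: sign-extend 32 bits to 64 bits. */
   static uint64_t toy_ext_i32_i64(uint32_t t1)
   {
       return (uint64_t)(int64_t)(int32_t)t1;
   }

   /* extu_i32_i64: zero-extend 32 bits to 64 bits. */
   static uint64_t toy_extu_i32_i64(uint32_t t1)
   {
       return t1;
   }

   /* trunc_i64_i32: keep the low 32 bits. */
   static uint32_t toy_trunc_i64_i32(uint64_t t1)
   {
       return (uint32_t)t1;
   }

   /* concat_i32_i64: t1 is the low half, t2 the high half. */
   static uint64_t toy_concat_i32_i64(uint32_t t1, uint32_t t2)
   {
       return ((uint64_t)t2 << 32) | t1;
   }

   /* concat32_i64: same, but both inputs are 64-bit values. */
   static uint64_t toy_concat32_i64(uint64_t t1, uint64_t t2)
   {
       return (t2 << 32) | (t1 & 0xffffffffu);
   }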


Load/Store
----------

.. list-table::

   * - ld_i32/i64 *t0*, *t1*, *offset*

       ld8s_i32/i64 *t0*, *t1*, *offset*

       ld8u_i32/i64 *t0*, *t1*, *offset*

       ld16s_i32/i64 *t0*, *t1*, *offset*

       ld16u_i32/i64 *t0*, *t1*, *offset*

       ld32s_i64 *t0*, *t1*, *offset*

       ld32u_i64 *t0*, *t1*, *offset*

     - | *t0* = read(*t1* + *offset*)
       |
       | Load 8, 16, 32 or 64 bits with or without sign extension from host memory.
         *offset* must be a constant.

   * - st_i32/i64 *t0*, *t1*, *offset*

       st8_i32/i64 *t0*, *t1*, *offset*

       st16_i32/i64 *t0*, *t1*, *offset*

       st32_i64 *t0*, *t1*, *offset*

     - | write(*t0*, *t1* + *offset*)
       |
       | Write 8, 16, 32 or 64 bits to host memory.

All these opcodes assume that the host memory they access does not
correspond to a global. If it does, the behaviour is unpredictable.
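
As an illustration of the sized loads and stores, the following sketch models a few of them on ordinary host memory. The ``model_*`` helpers are hypothetical names used only here; they are not part of TCG, and ``memcpy`` stands in for whatever access the backend actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ld8s_i32: read one byte at base + offset and sign-extend it to 32 bits. */
static inline uint32_t model_ld8s_i32(const void *base, ptrdiff_t offset)
{
    int8_t v;
    memcpy(&v, (const char *)base + offset, sizeof(v));
    return (uint32_t)(int32_t)v;
}

/* ld8u_i32: read one byte and zero-extend it to 32 bits. */
static inline uint32_t model_ld8u_i32(const void *base, ptrdiff_t offset)
{
    uint8_t v;
    memcpy(&v, (const char *)base + offset, sizeof(v));
    return v;
}

/* ld16s_i32: read two bytes in host byte order and sign-extend. */
static inline uint32_t model_ld16s_i32(const void *base, ptrdiff_t offset)
{
    int16_t v;
    memcpy(&v, (const char *)base + offset, sizeof(v));
    return (uint32_t)(int32_t)v;
}

/* ld16u_i32: read two bytes in host byte order and zero-extend. */
static inline uint32_t model_ld16u_i32(const void *base, ptrdiff_t offset)
{
    uint16_t v;
    memcpy(&v, (const char *)base + offset, sizeof(v));
    return v;
}

/* st16_i32: write the low 16 bits of t0 to base + offset. */
static inline void model_st16_i32(uint32_t t0, void *base, ptrdiff_t offset)
{
    uint16_t v = (uint16_t)t0;
    memcpy((char *)base + offset, &v, sizeof(v));
}
```

The signed and unsigned variants read the same bytes; they differ only in how the upper bits of *t0* are filled.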


Multiword arithmetic support
----------------------------

.. list-table::

   * - add2_i32/i64 *t0_low*, *t0_high*, *t1_low*, *t1_high*, *t2_low*, *t2_high*

       sub2_i32/i64 *t0_low*, *t0_high*, *t1_low*, *t1_high*, *t2_low*, *t2_high*

     - | Similar to add/sub, except that the double-word inputs *t1* and *t2* are
         formed from two single-word arguments, and the double-word output *t0*
         is returned in two single-word outputs.

   * - mulu2_i32/i64 *t0_low*, *t0_high*, *t1*, *t2*

     - | Similar to mul, except that the two unsigned inputs *t1* and *t2* yield the
         full double-word product *t0*, which is returned in two single-word outputs.

   * - muls2_i32/i64 *t0_low*, *t0_high*, *t1*, *t2*

     - | Similar to mulu2, except the two inputs *t1* and *t2* are signed.

   * - mulsh_i32/i64 *t0*, *t1*, *t2*

       muluh_i32/i64 *t0*, *t1*, *t2*

     - | Provide the high part of a signed or unsigned multiply, respectively.
       |
       | If mulu2/muls2 are not provided by the backend, the tcg-op generator
         can obtain the same results by emitting a pair of opcodes, mul + muluh/mulsh.
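
The relationship between these opcodes can be checked with a small C model of the 32-bit variants. The helper names are hypothetical and exist only for this sketch; it shows the carry propagation in add2 and why mul + muluh reproduces mulu2.

```c
#include <assert.h>
#include <stdint.h>

/* add2_i32: 64-bit add on (low, high) pairs, carrying from low into high. */
static inline void model_add2_i32(uint32_t *t0_low, uint32_t *t0_high,
                                  uint32_t t1_low, uint32_t t1_high,
                                  uint32_t t2_low, uint32_t t2_high)
{
    uint32_t low = t1_low + t2_low;
    uint32_t carry = low < t1_low;      /* 1 if the low word wrapped */

    *t0_high = t1_high + t2_high + carry;
    *t0_low = low;
}

/* mulu2_i32: full 64-bit unsigned product, returned as two words. */
static inline void model_mulu2_i32(uint32_t *t0_low, uint32_t *t0_high,
                                   uint32_t t1, uint32_t t2)
{
    uint64_t product = (uint64_t)t1 * t2;

    *t0_low = (uint32_t)product;
    *t0_high = (uint32_t)(product >> 32);
}

/* mul_i32: low word of the product only. */
static inline uint32_t model_mul_i32(uint32_t t1, uint32_t t2)
{
    return t1 * t2;
}

/* muluh_i32: high word of the unsigned product. */
static inline uint32_t model_muluh_i32(uint32_t t1, uint32_t t2)
{
    return (uint32_t)(((uint64_t)t1 * t2) >> 32);
}

/* mulsh_i32: high word of the signed product (two's complement host). */
static inline uint32_t model_mulsh_i32(uint32_t t1, uint32_t t2)
{
    int64_t product = (int64_t)(int32_t)t1 * (int32_t)t2;

    return (uint32_t)((uint64_t)product >> 32);
}
```

In this model, ``model_mul_i32`` and ``model_muluh_i32`` together return exactly the two words that ``model_mulu2_i32`` produces, which is the fallback described in the last row of the table.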


Memory Barrier support
----------------------

.. list-table::

   * - mb *<$arg>*

     - | Generate a target memory barrier instruction to enforce the memory
         ordering required by the corresponding guest memory barrier instruction.
       |
       | The ordering enforced by the backend may be stricter than the ordering
         required by the guest. It cannot be weaker. This opcode takes a constant
         argument which is required to generate the appropriate barrier
         instruction. The backend should take care to emit the target barrier
         instruction only when necessary, i.e. for SMP guests and when MTTCG is
         enabled.
       |
       | The guest translators should generate this opcode for all guest instructions
         which have ordering side effects.
       |
       | Please see :ref:`atomics-ref` for more information on memory barriers.


64-bit guest on 32-bit host support
-----------------------------------

The following opcodes are internal to TCG.  Thus they are to be implemented by
32-bit host code generators, but are not to be emitted by guest translators.
They are emitted as needed by inline functions within ``tcg-op.h``.

.. list-table::

   * - brcond2_i32 *t0_low*, *t0_high*, *t1_low*, *t1_high*, *cond*, *label*

     - | Similar to brcond, except that the 64-bit values *t0* and *t1*
         are formed from two 32-bit arguments.

   * - setcond2_i32 *dest*, *t1_low*, *t1_high*, *t2_low*, *t2_high*, *cond*

     - | Similar to setcond, except that the 64-bit values *t1* and *t2* are
         formed from two 32-bit arguments. The result is a 32-bit value.
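
To illustrate how a 64-bit comparison decomposes into 32-bit halves, here is a sketch for two conditions only, unsigned and signed less-than. The function names are hypothetical; a real backend handles every *cond* value and may use a different instruction sequence.

```c
#include <assert.h>
#include <stdint.h>

/* setcond2_i32 with an unsigned "less than" condition.
 * The high words decide unless they are equal, in which case
 * the low words are compared. Returns 1 or 0. */
static inline uint32_t model_setcond2_ltu(uint32_t t1_low, uint32_t t1_high,
                                          uint32_t t2_low, uint32_t t2_high)
{
    if (t1_high != t2_high) {
        return t1_high < t2_high;
    }
    return t1_low < t2_low;
}

/* setcond2_i32 with a signed "less than" condition.
 * Only the high words carry the sign; the low words are always
 * compared as unsigned. Assumes a two's complement host. */
static inline uint32_t model_setcond2_lt(uint32_t t1_low, uint32_t t1_high,
                                         uint32_t t2_low, uint32_t t2_high)
{
    if (t1_high != t2_high) {
        return (int32_t)t1_high < (int32_t)t2_high;
    }
    return t1_low < t2_low;
}
```

brcond2_i32 evaluates the same kind of predicate, but branches to *label* instead of storing the result.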


QEMU specific operations
------------------------

.. list-table::

   * - exit_tb *t0*

     - | Exit the current TB and return the value *t0* (word type).

   * - goto_tb *index*

     - | Exit the current TB and jump to the TB index *index* (constant) if the
         current TB was linked to this TB. Otherwise execute the next
         instructions. Only indices 0 and 1 are valid and tcg_gen_goto_tb may be issued
         at most once with each slot index per TB.

   * - lookup_and_goto_ptr *tb_addr*

     - | Look up a TB address *tb_addr* and jump to it if valid. If not valid,
         jump to the TCG epilogue to go back to the exec loop.
       |
       | This operation is optional. If the TCG backend does not implement the
         goto_ptr opcode, emitting this op is equivalent to emitting exit_tb(0).

   * - qemu_ld_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*

       qemu_st_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*

       qemu_st8_i32 *t0*, *t1*, *flags*, *memidx*

     - | Load data at the guest address *t1* into *t0*, or store data in *t0* at guest
         address *t1*.  The _i32/_i64/_i128 size applies to the size of the input/output
         register *t0* only.  The address *t1* is always sized according to the guest,
         and the width of the memory operation is controlled by *flags*.
       |
       | Both *t0* and *t1* may be split into little-endian ordered pairs of registers
         if dealing with 64-bit quantities on a 32-bit host, or 128-bit quantities on
         a 64-bit host.
       |
       | The *memidx* selects the QEMU TLB index to use (e.g. user or kernel access).
         The flags are the MemOp bits, selecting the sign, width, and endianness
         of the memory access.
       |
       | For a 32-bit host, qemu_ld/st_i64 is guaranteed to only be used with a
         64-bit memory access specified in *flags*.
       |
       | qemu_ld/st_i128 are only supported on a 64-bit host.
       |
       | For i386, qemu_st8_i32 is exactly like qemu_st_i32, except the size of
         the memory operation is known to be 8-bit.  This allows the backend to
         provide a different set of register constraints.


Host vector operations
----------------------

All of the vector ops have two parameters, ``TCGOP_VECL`` & ``TCGOP_VECE``.
The former specifies the length of the vector in log2 64-bit units; the
latter specifies the length of the element (if applicable) in log2 8-bit units.
E.g. VECL = 1 -> 64 << 1 -> v128, and VECE = 2 -> 1 << 2 -> i32.
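
The two log2 encodings can be decoded as in the following sketch. The helper names are hypothetical and used only for illustration; the element count is an interpretation of the ``VECL/VECE`` shorthand that appears in the table below, not a definition taken from the TCG sources.

```c
#include <assert.h>

/* Vector length in bits: VECL counts log2 units of 64 bits. */
static inline unsigned model_vec_bits(unsigned vecl)
{
    return 64u << vecl;
}

/* Element size in bytes: VECE counts log2 units of 8 bits. */
static inline unsigned model_elem_bytes(unsigned vece)
{
    return 1u << vece;
}

/* Element size in bits. */
static inline unsigned model_elem_bits(unsigned vece)
{
    return 8u << vece;
}

/* Number of elements held by one vector. */
static inline unsigned model_elem_count(unsigned vecl, unsigned vece)
{
    return model_vec_bits(vecl) / model_elem_bits(vece);
}
```

With VECL = 1 and VECE = 2 this gives a 128-bit vector of four 32-bit elements, matching the example above.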

.. list-table::

   * - mov_vec *v0*, *v1*

       ld_vec *v0*, *t1*

       st_vec *v0*, *t1*

     - | Move, load and store.

   * - dup_vec *v0*, *r1*

     - | Duplicate the low N bits of *r1* into VECL/VECE copies across *v0*.

   * - dupi_vec *v0*, *c*

     - | Similarly, for a constant.
       | Smaller values will be replicated to host register size by the expanders.

   * - dup2_vec *v0*, *r1*, *r2*

     - | Duplicate *r2*:*r1* into VECL/64 copies across *v0*. This opcode is
         only present for 32-bit hosts.

   * - add_vec *v0*, *v1*, *v2*

     - | *v0* = *v1* + *v2*, in elements across the vector.

   * - sub_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = *v1* - *v2*.

   * - mul_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = *v1* * *v2*.

   * - neg_vec *v0*, *v1*

     - | Similarly, *v0* = -*v1*.

   * - abs_vec *v0*, *v1*

     - | Similarly, *v0* = *v1* < 0 ? -*v1* : *v1*, in elements across the vector.

   * - smin_vec *v0*, *v1*, *v2*

       umin_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = MIN(*v1*, *v2*), for signed and unsigned element types.

   * - smax_vec *v0*, *v1*, *v2*

       umax_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = MAX(*v1*, *v2*), for signed and unsigned element types.

   * - ssadd_vec *v0*, *v1*, *v2*

       sssub_vec *v0*, *v1*, *v2*

       usadd_vec *v0*, *v1*, *v2*

       ussub_vec *v0*, *v1*, *v2*

     - | Signed and unsigned saturating addition and subtraction.
       |
       | If the true result is not representable within the element type, the
         element is set to the minimum or maximum value for the type.

   * - and_vec *v0*, *v1*, *v2*

       or_vec *v0*, *v1*, *v2*

       xor_vec *v0*, *v1*, *v2*

       andc_vec *v0*, *v1*, *v2*

       orc_vec *v0*, *v1*, *v2*

       not_vec *v0*, *v1*

     - | Similarly, logical operations with and without complement.
       |
       | Note that VECE is unused.

   * - shli_vec *v0*, *v1*, *i2*

       shls_vec *v0*, *v1*, *s2*

     - | Shift all elements from *v1* by a scalar *i2*/*s2*. I.e.

       .. code-block:: c

          for (i = 0; i < VECL/VECE; ++i) {
              v0[i] = v1[i] << s2;
          }

   * - shri_vec *v0*, *v1*, *i2*

       sari_vec *v0*, *v1*, *i2*

       rotli_vec *v0*, *v1*, *i2*

       shrs_vec *v0*, *v1*, *s2*

       sars_vec *v0*, *v1*, *s2*

     - | Similarly for logical and arithmetic right shift, and left rotate.

   * - shlv_vec *v0*, *v1*, *v2*

     - | Shift elements from *v1* by elements from *v2*. I.e.

       .. code-block:: c

          for (i = 0; i < VECL/VECE; ++i) {
              v0[i] = v1[i] << v2[i];
          }

   * - shrv_vec *v0*, *v1*, *v2*

       sarv_vec *v0*, *v1*, *v2*

       rotlv_vec *v0*, *v1*, *v2*

       rotrv_vec *v0*, *v1*, *v2*

     - | Similarly for logical and arithmetic right shift, and rotates.

   * - cmp_vec *v0*, *v1*, *v2*, *cond*

     - | Compare vectors by element, storing -1 for true and 0 for false.

   * - bitsel_vec *v0*, *v1*, *v2*, *v3*

     - | Bitwise select, *v0* = (*v2* & *v1*) | (*v3* & ~\ *v1*), across the entire vector.

   * - cmpsel_vec *v0*, *c1*, *c2*, *v3*, *v4*, *cond*

     - | Select elements based on comparison results:

       .. code-block:: c

          for (i = 0; i < n; ++i) {
              v0[i] = (c1[i] cond c2[i]) ? v3[i] : v4[i];
          }
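
A few of the per-element rules above can be written out as scalar C. This is a sketch for fixed element sizes chosen for illustration (8-bit saturation, 32-bit compare, 64 bits of bitwise select); the ``model_*`` names are hypothetical and each function describes one element, not a whole vector.

```c
#include <assert.h>
#include <stdint.h>

/* ssadd_vec, one 8-bit element: clamp the sum to [INT8_MIN, INT8_MAX]. */
static inline int8_t model_ssadd8(int8_t a, int8_t b)
{
    int sum = a + b;

    if (sum > INT8_MAX) {
        return INT8_MAX;
    }
    if (sum < INT8_MIN) {
        return INT8_MIN;
    }
    return (int8_t)sum;
}

/* ussub_vec, one 8-bit element: clamp the difference at zero. */
static inline uint8_t model_ussub8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* cmp_vec with a signed "less than" condition, one 32-bit element:
 * all ones (-1) for true, zero for false. */
static inline int32_t model_cmp_lt32(int32_t a, int32_t b)
{
    return a < b ? -1 : 0;
}

/* bitsel_vec on 64 bits: each bit of v1 picks from v2 (1) or v3 (0). */
static inline uint64_t model_bitsel64(uint64_t v1, uint64_t v2, uint64_t v3)
{
    return (v2 & v1) | (v3 & ~v1);
}
```

Because cmp_vec yields all-ones or all-zeros per element, its result can be used directly as the *v1* mask of bitsel_vec; cmpsel_vec expresses that combination as a single op.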

**Note 1**: Some shortcuts are defined when the last operand is known to be
a constant (e.g. addi for add, movi for mov).

**Note 2**: When using TCG, the opcodes must never be generated directly
as some of them may not be available as "real" opcodes. Always use the
function tcg_gen_xxx(args).


Backend
=======

``tcg-target.h`` contains the target specific definitions. ``tcg-target.c.inc``
contains the target specific code; it is #included by ``tcg/tcg.c``, rather
than being a standalone C file.

Assumptions
-----------

The target word size (``TCG_TARGET_REG_BITS``) is expected to be 32 bit or
64 bit. It is expected that the pointer has the same size as the word.

On a 32 bit target, all 64 bit operations are converted to 32 bits. A
few specific operations must be implemented to allow it (see add2_i32,
sub2_i32, brcond2_i32).

On a 64 bit target, the values are transferred between 32 and 64-bit
registers using the following ops:

- extrl_i64_i32
- extrh_i64_i32
- ext_i32_i64
- extu_i32_i64

They ensure that the values are correctly truncated or extended when
moved from a 32-bit to a 64-bit register or vice versa. Note that the
extrl_i64_i32 and extrh_i64_i32 are optional ops. It is not necessary
to implement them if all the following conditions are met:

- 64-bit registers can hold 32-bit values
- 32-bit values in a 64-bit register do not need to stay zero or
  sign extended
- all 32-bit TCG ops ignore the high part of 64-bit registers

Floating point operations are not supported in this version. A
previous incarnation of the code generator had full support of them,
but it is better to concentrate on integer operations first.

Constraints
-----------

GCC-like constraints are used to define the constraints of every
instruction. Memory constraints are not supported in this
version. Aliases are specified in the input operands as for GCC.

The same register may be used for both an input and an output, even when
they are not explicitly aliased.  If an op expands to multiple target
instructions then care must be taken to avoid clobbering input values.
GCC style "early clobber" outputs are supported, with '``&``'.

A target can define specific register or constant constraints. If an
operation uses a constant input constraint which does not allow all
constants, it must also accept registers in order to have a fallback.
The constraint '``i``' is defined generically to accept any constant.
The constraint '``r``' is not defined generically, but is consistently
used by each backend to indicate all registers.

The movi_i32 and movi_i64 operations must accept any constants.

The mov_i32 and mov_i64 operations must accept any registers of the
same type.

The ld/st/sti instructions must accept signed 32 bit constant offsets.
This can be implemented by reserving a specific register in which to
compute the address if the offset is too big.

The ld/st instructions must accept any destination (ld) or source (st)
register.

The sti instruction may fail if it cannot store the given constant.

Function call assumptions
-------------------------

- The only supported types for parameters and return value are: 32 and
  64 bit integers and pointers.
- The stack grows downwards.
- The first N parameters are passed in registers.
- The next parameters are passed on the stack by storing them as words.
- Some registers are clobbered during the call.
- The function can return zero or one value in registers. On a 32 bit
  target, functions must be able to return 2 values in registers for
  a 64 bit return type.


Recommended coding rules for best performance
=============================================

- Use globals to represent the parts of the QEMU CPU state which are
  often modified, e.g. the integer registers and the condition
  codes. TCG will be able to use host registers to store them.

- Don't hesitate to use helpers for complicated or seldom used guest
  instructions. There is little performance advantage in using TCG to
  implement guest instructions taking more than about twenty TCG
  instructions. Note that this rule of thumb is more applicable to
  helpers doing complex logic or arithmetic, where the C compiler has
  scope to do a good job of optimisation; it is less relevant where
  the instruction is mostly doing loads and stores, and in those cases
  inline TCG may still be faster for longer sequences.

- Use the 'discard' instruction if you know that TCG won't be able to
  prove that a given global is "dead" at a given program point. The
  x86 guest uses it to improve the condition codes optimisation.