.. _tcg-ops-ref:

*******************************
TCG Intermediate Representation
*******************************

Introduction
============

TCG (Tiny Code Generator) began as a generic backend for a C compiler.
It was simplified to be used in QEMU. It also has its roots in the
QOP code generator written by Paul Brook.

Definitions
===========

The TCG *target* is the architecture for which we generate the code.
It is of course not the same as the "target" of QEMU which is the
emulated architecture. As TCG started as a generic C backend used
for cross compiling, the assumption was that TCG target might be
different from the host, although this is never the case for QEMU.

In this document, we use *guest* to specify what architecture we are
emulating; *target* always means the TCG target, the machine on which
we are running QEMU.

An operation with *undefined behavior* may result in a crash.

An operation with *unspecified behavior* shall not crash. However,
the result may be one of several possibilities so may be considered
an *undefined result*.

Basic Blocks
============

A TCG *basic block* is a single entry, multiple exit region which
corresponds to a list of instructions terminated by a label, or
any branch instruction.

A TCG *extended basic block* is a single entry, multiple exit region
which corresponds to a list of instructions terminated by a label or
an unconditional branch. Specifically, an extended basic block is
a sequence of basic blocks connected by the fall-through paths of
zero or more conditional branch instructions.

Operations
==========

TCG instructions or *ops* operate on TCG *variables*, both of which
are strongly typed. Each instruction has a fixed number of output
variable operands, input variable operands and constant operands.
Vector instructions have a field specifying the element size within
the vector. The notable exception is the call instruction which has
a variable number of outputs and inputs.

In the textual form, output operands usually come first, followed by
input operands, followed by constant operands. The output type is
included in the instruction name. Constants are prefixed with a '$'.

.. code-block:: none

   add_i32 t0, t1, t2    /* (t0 <- t1 + t2) */

Variables
=========

* ``TEMP_FIXED``

  There is one TCG *fixed global* variable, ``cpu_env``, which is
  live in all translation blocks, and holds a pointer to ``CPUArchState``.
  This variable is held in a host CPU register at all times in all
  translation blocks.

* ``TEMP_GLOBAL``

  A TCG *global* is a variable which is live in all translation blocks,
  and corresponds to a memory location that is within ``CPUArchState``.
  These may be specified as an offset from ``cpu_env``, in which case
  they are called *direct globals*, or may be specified as an offset
  from a direct global, in which case they are called *indirect globals*.
  Even indirect globals should still reference memory within
  ``CPUArchState``. All TCG globals are defined during
  ``TCGCPUOps.initialize``, before any translation blocks are generated.

* ``TEMP_CONST``

  A TCG *constant* is a variable which is live throughout the entire
  translation block, and contains a constant value. These variables
  are allocated on demand during translation and are hashed so that
  there is exactly one variable holding a given value.

* ``TEMP_TB``

  A TCG *translation block temporary* is a variable which is live
  throughout the entire translation block, but dies on any exit.
  These temporaries are allocated explicitly during translation.

* ``TEMP_EBB``

  A TCG *extended basic block temporary* is a variable which is live
  throughout an extended basic block, but dies on any exit.
  These temporaries are allocated explicitly during translation.

Types
=====

* ``TCG_TYPE_I32``

  A 32-bit integer.

* ``TCG_TYPE_I64``

  A 64-bit integer. For 32-bit hosts, such variables are split into a pair
  of variables with ``type=TCG_TYPE_I32`` and ``base_type=TCG_TYPE_I64``.
  The ``temp_subindex`` for each indicates where it falls within the
  host-endian representation.

* ``TCG_TYPE_PTR``

  An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
  of a pointer for the host.

* ``TCG_TYPE_REG``

  An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
  of the integer registers for the host. This may be larger
  than ``TCG_TYPE_PTR`` depending on the host ABI.

* ``TCG_TYPE_I128``

  A 128-bit integer. For all hosts, such variables are split into a number
  of variables with ``type=TCG_TYPE_REG`` and ``base_type=TCG_TYPE_I128``.
  The ``temp_subindex`` for each indicates where it falls within the
  host-endian representation.

* ``TCG_TYPE_V64``

  A 64-bit vector. This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v64``.

* ``TCG_TYPE_V128``

  A 128-bit vector. This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v128``.

* ``TCG_TYPE_V256``

  A 256-bit vector. This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v256``.

Helpers
=======

Helpers are registered in a guest-specific ``helper.h``,
which is processed to generate ``tcg_gen_helper_*`` functions.
With these functions it is possible to call a function taking
i32, i64, i128 or pointer types.

By default, before calling a helper, all globals are stored at their
canonical location. By default, the helper is allowed to modify the
CPU state (including the state represented by TCG globals)
or may raise an exception. This default can be overridden using the
following function modifiers:

* ``TCG_CALL_NO_WRITE_GLOBALS``

  The helper does not modify any globals, but may read them.
  Globals will be saved to their canonical location before calling helpers,
  but need not be reloaded afterwards.

* ``TCG_CALL_NO_READ_GLOBALS``

  The helper does not read globals, either directly or via an exception.
  They will not be saved to their canonical locations before calling
  the helper. This implies ``TCG_CALL_NO_WRITE_GLOBALS``.

* ``TCG_CALL_NO_SIDE_EFFECTS``

  The call to the helper function may be removed if the return value is
  not used. This means that it may not modify any CPU state nor may it
  raise an exception.

Code Optimizations
==================

When generating instructions, you can count on at least the following
optimizations:

- Single instructions are simplified, e.g.

  .. code-block:: none

     and_i32 t0, t0, $0xffffffff

  is suppressed.

- A liveness analysis is done at the basic block level. The
  information is used to suppress moves from a dead variable to
  another one. It is also used to remove instructions which compute
  dead results. The latter is especially useful for condition code
  optimization in QEMU.

  In the following example:

  .. code-block:: none

     add_i32 t0, t1, t2
     add_i32 t0, t0, $1
     mov_i32 t0, $1

  only the last instruction is kept.


Instruction Reference
=====================

Function call
-------------

.. list-table::

   * - call *<ret>* *<params>* ptr

     - | call function 'ptr' (pointer type)
       |
       | *<ret>* optional 32 bit or 64 bit return value
       | *<params>* optional 32 bit or 64 bit parameters

Jumps/Labels
------------

.. list-table::

   * - set_label $label

     - | Define label 'label' at the current program point.

   * - br $label

     - | Jump to label.

   * - brcond_i32/i64 *t0*, *t1*, *cond*, *label*

     - | Conditional jump if *t0* *cond* *t1* is true. *cond* can be:
       |
       |   ``TCG_COND_EQ``
       |   ``TCG_COND_NE``
       |   ``TCG_COND_LT /* signed */``
       |   ``TCG_COND_GE /* signed */``
       |   ``TCG_COND_LE /* signed */``
       |   ``TCG_COND_GT /* signed */``
       |   ``TCG_COND_LTU /* unsigned */``
       |   ``TCG_COND_GEU /* unsigned */``
       |   ``TCG_COND_LEU /* unsigned */``
       |   ``TCG_COND_GTU /* unsigned */``
       |   ``TCG_COND_TSTEQ /* (t0 & t1) == 0 */``
       |   ``TCG_COND_TSTNE /* (t0 & t1) != 0 */``

Arithmetic
----------

.. list-table::

   * - add_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* + *t2*

   * - sub_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* - *t2*

   * - neg_i32/i64 *t0*, *t1*

     - | *t0* = -*t1* (two's complement)

   * - mul_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* * *t2*

   * - div_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* / *t2* (signed)
       | Undefined behavior if division by zero or overflow.

   * - divu_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* / *t2* (unsigned)
       | Undefined behavior if division by zero.

   * - rem_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* % *t2* (signed)
       | Undefined behavior if division by zero or overflow.

   * - remu_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* % *t2* (unsigned)
       | Undefined behavior if division by zero.


Logical
-------

.. list-table::

   * - and_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* & *t2*

   * - or_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* | *t2*

   * - xor_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* ^ *t2*

   * - not_i32/i64 *t0*, *t1*

     - | *t0* = ~\ *t1*

   * - andc_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* & ~\ *t2*

   * - eqv_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* ^ *t2*), or equivalently, *t0* = *t1* ^ ~\ *t2*

   * - nand_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* & *t2*)

   * - nor_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* | *t2*)

   * - orc_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* | ~\ *t2*

   * - clz_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* ? clz(*t1*) : *t2*

   * - ctz_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* ? ctz(*t1*) : *t2*

   * - ctpop_i32/i64 *t0*, *t1*

     - | *t0* = number of bits set in *t1*
       |
       | With *ctpop* short for "count population", matching
       | the function name used in ``include/qemu/host-utils.h``.


Shifts/Rotates
--------------

.. list-table::

   * - shl_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* << *t2*
       | Unspecified behavior if *t2* < 0 or *t2* >= 32 (resp 64)

   * - shr_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* >> *t2* (unsigned)
       | Unspecified behavior if *t2* < 0 or *t2* >= 32 (resp 64)

   * - sar_i32/i64 *t0*, *t1*, *t2*

     - | *t0* = *t1* >> *t2* (signed)
       | Unspecified behavior if *t2* < 0 or *t2* >= 32 (resp 64)

   * - rotl_i32/i64 *t0*, *t1*, *t2*

     - | Rotation of *t2* bits to the left
       | Unspecified behavior if *t2* < 0 or *t2* >= 32 (resp 64)

   * - rotr_i32/i64 *t0*, *t1*, *t2*

     - | Rotation of *t2* bits to the right.
       | Unspecified behavior if *t2* < 0 or *t2* >= 32 (resp 64)


Misc
----

.. list-table::

   * - mov_i32/i64 *t0*, *t1*

     - | *t0* = *t1*
       | Move *t1* to *t0* (both operands must have the same type).

   * - ext8s_i32/i64 *t0*, *t1*

       ext8u_i32/i64 *t0*, *t1*

       ext16s_i32/i64 *t0*, *t1*

       ext16u_i32/i64 *t0*, *t1*

       ext32s_i64 *t0*, *t1*

       ext32u_i64 *t0*, *t1*

     - | 8, 16 or 32 bit sign/zero extension (both operands must have the same type)

   * - bswap16_i32/i64 *t0*, *t1*, *flags*

     - | 16 bit byte swap on the low bits of a 32/64 bit input.
       |
       | If *flags* & ``TCG_BSWAP_IZ``, then *t1* is known to be zero-extended from bit 15.
       | If *flags* & ``TCG_BSWAP_OZ``, then *t0* will be zero-extended from bit 15.
       | If *flags* & ``TCG_BSWAP_OS``, then *t0* will be sign-extended from bit 15.
       |
       | If neither ``TCG_BSWAP_OZ`` nor ``TCG_BSWAP_OS`` are set, then the bits of *t0* above bit 15 may contain any value.

   * - bswap32_i64 *t0*, *t1*, *flags*

     - | 32 bit byte swap on a 64-bit value. The flags are the same as for bswap16,
         except they apply from bit 31 instead of bit 15.

   * - bswap32_i32 *t0*, *t1*, *flags*

       bswap64_i64 *t0*, *t1*, *flags*

     - | 32/64 bit byte swap.
         The flags are ignored, but still present
         for consistency with the other bswap opcodes.

   * - discard_i32/i64 *t0*

     - | Indicate that the value of *t0* won't be used later. It is useful to
         force dead code elimination.

   * - deposit_i32/i64 *dest*, *t1*, *t2*, *pos*, *len*

     - | Deposit *t2* as a bitfield into *t1*, placing the result in *dest*.
       |
       | The bitfield is described by *pos*/*len*, which are immediate values:
       |
       |     *len* - the length of the bitfield
       |     *pos* - the position of the first bit, counting from the LSB
       |
       | For example, "deposit_i32 dest, t1, t2, 8, 4" indicates a 4-bit field
         at bit 8. This operation would be equivalent to
       |
       |     *dest* = (*t1* & ~0x0f00) | ((*t2* << 8) & 0x0f00)

   * - extract_i32/i64 *dest*, *t1*, *pos*, *len*

       sextract_i32/i64 *dest*, *t1*, *pos*, *len*

     - | Extract a bitfield from *t1*, placing the result in *dest*.
       |
       | The bitfield is described by *pos*/*len*, which are immediate values,
         as above for deposit.
         For extract_*, the result will be extended
         to the left with zeros; for sextract_*, the result will be extended
         to the left with copies of the bitfield sign bit at *pos* + *len* - 1.
       |
       | For example, "sextract_i32 dest, t1, 8, 4" indicates a 4-bit field
         at bit 8. This operation would be equivalent to
       |
       |    *dest* = (*t1* << 20) >> 28
       |
       | (using an arithmetic right shift).

   * - extract2_i32/i64 *dest*, *t1*, *t2*, *pos*

     - | For N = {32,64}, extract an N-bit quantity from the concatenation
         of *t2*:*t1*, beginning at *pos*. The tcg_gen_extract2_{i32,i64} expander
         accepts 0 <= *pos* <= N as inputs. The backend code generator will
         not see either 0 or N as inputs for these opcodes.

   * - extrl_i64_i32 *t0*, *t1*

     - | For 64-bit hosts only, extract the low 32-bits of input *t1* and place it
         into 32-bit output *t0*. Depending on the host, this may be a simple move,
         or may require additional canonicalization.

   * - extrh_i64_i32 *t0*, *t1*

     - | For 64-bit hosts only, extract the high 32-bits of input *t1* and place it
         into 32-bit output *t0*. Depending on the host, this may be a simple shift,
         or may require additional canonicalization.
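
The bitfield semantics of deposit, extract, sextract and extract2 described
in the table above can be checked against a small reference model in plain C.
This is only a sketch for the 32-bit variants: the ``model_*`` names are
hypothetical and are not part of the TCG API, and the model assumes
1 <= *len* and *pos* + *len* <= 32.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /* Hypothetical reference models; not QEMU functions. */

   /* deposit_i32: replace bits [pos, pos+len) of t1 with the low bits of t2. */
   static inline uint32_t model_deposit32(uint32_t t1, uint32_t t2,
                                          unsigned pos, unsigned len)
   {
       assert(len >= 1 && pos + len <= 32);
       /* Build the field mask without ever shifting by 32. */
       uint32_t ones = (len == 32) ? UINT32_MAX : ((UINT32_C(1) << len) - 1);
       uint32_t mask = ones << pos;
       return (t1 & ~mask) | ((t2 << pos) & mask);
   }

   /* extract_i32: zero-extended field, written as shift-left then shift-right. */
   static inline uint32_t model_extract32(uint32_t t1, unsigned pos, unsigned len)
   {
       assert(len >= 1 && pos + len <= 32);
       return (t1 << (32 - pos - len)) >> (32 - len);
   }

   /* sextract_i32: same field, sign-extended from bit pos + len - 1.
    * The result is returned as a 32-bit pattern to stay within
    * well-defined unsigned arithmetic. */
   static inline uint32_t model_sextract32(uint32_t t1, unsigned pos, unsigned len)
   {
       uint32_t field = model_extract32(t1, pos, len);
       uint32_t sign = UINT32_C(1) << (len - 1);
       return (field ^ sign) - sign;
   }

   /* extract2_i32: 32 bits of the concatenation t2:t1, starting at bit pos. */
   static inline uint32_t model_extract2_32(uint32_t t1, uint32_t t2, unsigned pos)
   {
       assert(pos <= 32);
       uint64_t cat = ((uint64_t)t2 << 32) | t1;
       return (uint32_t)(cat >> pos);
   }

With these definitions, ``model_deposit32(t1, t2, 8, 4)`` computes the same
value as the ``(t1 & ~0x0f00) | ((t2 << 8) & 0x0f00)`` expression given for
deposit, and ``model_sextract32(t1, 8, 4)`` yields the same bit pattern as the
arithmetic-shift formulation given for sextract.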
4905e97a28aSMark Cave-Ayland 4915e97a28aSMark Cave-Ayland 4925e97a28aSMark Cave-AylandConditional moves 4935e97a28aSMark Cave-Ayland----------------- 4945e97a28aSMark Cave-Ayland 4955e97a28aSMark Cave-Ayland.. list-table:: 4965e97a28aSMark Cave-Ayland 4975e97a28aSMark Cave-Ayland * - setcond_i32/i64 *dest*, *t1*, *t2*, *cond* 4985e97a28aSMark Cave-Ayland 4995e97a28aSMark Cave-Ayland - | *dest* = (*t1* *cond* *t2*) 5005e97a28aSMark Cave-Ayland | 5015e97a28aSMark Cave-Ayland | Set *dest* to 1 if (*t1* *cond* *t2*) is true, otherwise set to 0. 5025e97a28aSMark Cave-Ayland 5033635502dSRichard Henderson * - negsetcond_i32/i64 *dest*, *t1*, *t2*, *cond* 5043635502dSRichard Henderson 5053635502dSRichard Henderson - | *dest* = -(*t1* *cond* *t2*) 5063635502dSRichard Henderson | 5073635502dSRichard Henderson | Set *dest* to -1 if (*t1* *cond* *t2*) is true, otherwise set to 0. 5083635502dSRichard Henderson 5095e97a28aSMark Cave-Ayland * - movcond_i32/i64 *dest*, *c1*, *c2*, *v1*, *v2*, *cond* 5105e97a28aSMark Cave-Ayland 5115e97a28aSMark Cave-Ayland - | *dest* = (*c1* *cond* *c2* ? *v1* : *v2*) 5125e97a28aSMark Cave-Ayland | 5135e97a28aSMark Cave-Ayland | Set *dest* to *v1* if (*c1* *cond* *c2*) is true, otherwise set to *v2*. 5145e97a28aSMark Cave-Ayland 5155e97a28aSMark Cave-Ayland 5165e97a28aSMark Cave-AylandType conversions 5175e97a28aSMark Cave-Ayland---------------- 5185e97a28aSMark Cave-Ayland 5195e97a28aSMark Cave-Ayland.. 
list-table:: 5205e97a28aSMark Cave-Ayland 5215e97a28aSMark Cave-Ayland * - ext_i32_i64 *t0*, *t1* 5225e97a28aSMark Cave-Ayland 5235e97a28aSMark Cave-Ayland - | Convert *t1* (32 bit) to *t0* (64 bit) and does sign extension 5245e97a28aSMark Cave-Ayland 5255e97a28aSMark Cave-Ayland * - extu_i32_i64 *t0*, *t1* 5265e97a28aSMark Cave-Ayland 5275e97a28aSMark Cave-Ayland - | Convert *t1* (32 bit) to *t0* (64 bit) and does zero extension 5285e97a28aSMark Cave-Ayland 5295e97a28aSMark Cave-Ayland * - trunc_i64_i32 *t0*, *t1* 5305e97a28aSMark Cave-Ayland 5315e97a28aSMark Cave-Ayland - | Truncate *t1* (64 bit) to *t0* (32 bit) 5325e97a28aSMark Cave-Ayland 5335e97a28aSMark Cave-Ayland * - concat_i32_i64 *t0*, *t1*, *t2* 5345e97a28aSMark Cave-Ayland 5355e97a28aSMark Cave-Ayland - | Construct *t0* (64-bit) taking the low half from *t1* (32 bit) and the high half 5365e97a28aSMark Cave-Ayland from *t2* (32 bit). 5375e97a28aSMark Cave-Ayland 5385e97a28aSMark Cave-Ayland * - concat32_i64 *t0*, *t1*, *t2* 5395e97a28aSMark Cave-Ayland 5405e97a28aSMark Cave-Ayland - | Construct *t0* (64-bit) taking the low half from *t1* (64 bit) and the high half 5415e97a28aSMark Cave-Ayland from *t2* (64 bit). 5425e97a28aSMark Cave-Ayland 5435e97a28aSMark Cave-Ayland 5445e97a28aSMark Cave-AylandLoad/Store 5455e97a28aSMark Cave-Ayland---------- 5465e97a28aSMark Cave-Ayland 5475e97a28aSMark Cave-Ayland.. 
list-table:: 5485e97a28aSMark Cave-Ayland 5495e97a28aSMark Cave-Ayland * - ld_i32/i64 *t0*, *t1*, *offset* 5505e97a28aSMark Cave-Ayland 5515e97a28aSMark Cave-Ayland ld8s_i32/i64 *t0*, *t1*, *offset* 5525e97a28aSMark Cave-Ayland 5535e97a28aSMark Cave-Ayland ld8u_i32/i64 *t0*, *t1*, *offset* 5545e97a28aSMark Cave-Ayland 5555e97a28aSMark Cave-Ayland ld16s_i32/i64 *t0*, *t1*, *offset* 5565e97a28aSMark Cave-Ayland 5575e97a28aSMark Cave-Ayland ld16u_i32/i64 *t0*, *t1*, *offset* 5585e97a28aSMark Cave-Ayland 5595e97a28aSMark Cave-Ayland ld32s_i64 t0, *t1*, *offset* 5605e97a28aSMark Cave-Ayland 5615e97a28aSMark Cave-Ayland ld32u_i64 t0, *t1*, *offset* 5625e97a28aSMark Cave-Ayland 5635e97a28aSMark Cave-Ayland - | *t0* = read(*t1* + *offset*) 5645e97a28aSMark Cave-Ayland | 5655e97a28aSMark Cave-Ayland | Load 8, 16, 32 or 64 bits with or without sign extension from host memory. 5665e97a28aSMark Cave-Ayland *offset* must be a constant. 5675e97a28aSMark Cave-Ayland 5685e97a28aSMark Cave-Ayland * - st_i32/i64 *t0*, *t1*, *offset* 5695e97a28aSMark Cave-Ayland 5705e97a28aSMark Cave-Ayland st8_i32/i64 *t0*, *t1*, *offset* 5715e97a28aSMark Cave-Ayland 5725e97a28aSMark Cave-Ayland st16_i32/i64 *t0*, *t1*, *offset* 5735e97a28aSMark Cave-Ayland 5745e97a28aSMark Cave-Ayland st32_i64 *t0*, *t1*, *offset* 5755e97a28aSMark Cave-Ayland 5765e97a28aSMark Cave-Ayland - | write(*t0*, *t1* + *offset*) 5775e97a28aSMark Cave-Ayland | 5785e97a28aSMark Cave-Ayland | Write 8, 16, 32 or 64 bits to host memory. 5795e97a28aSMark Cave-Ayland 5805e97a28aSMark Cave-AylandAll this opcodes assume that the pointed host memory doesn't correspond 5815e97a28aSMark Cave-Aylandto a global. In the latter case the behaviour is unpredictable. 5825e97a28aSMark Cave-Ayland 5835e97a28aSMark Cave-Ayland 5845e97a28aSMark Cave-AylandMultiword arithmetic support 5855e97a28aSMark Cave-Ayland---------------------------- 5865e97a28aSMark Cave-Ayland 5875e97a28aSMark Cave-Ayland.. 
list-table:: 5885e97a28aSMark Cave-Ayland 5895e97a28aSMark Cave-Ayland * - add2_i32/i64 *t0_low*, *t0_high*, *t1_low*, *t1_high*, *t2_low*, *t2_high* 5905e97a28aSMark Cave-Ayland 5915e97a28aSMark Cave-Ayland sub2_i32/i64 *t0_low*, *t0_high*, *t1_low*, *t1_high*, *t2_low*, *t2_high* 5925e97a28aSMark Cave-Ayland 5935e97a28aSMark Cave-Ayland - | Similar to add/sub, except that the double-word inputs *t1* and *t2* are 5945e97a28aSMark Cave-Ayland formed from two single-word arguments, and the double-word output *t0* 5955e97a28aSMark Cave-Ayland is returned in two single-word outputs. 5965e97a28aSMark Cave-Ayland 5975e97a28aSMark Cave-Ayland * - mulu2_i32/i64 *t0_low*, *t0_high*, *t1*, *t2* 5985e97a28aSMark Cave-Ayland 5995e97a28aSMark Cave-Ayland - | Similar to mul, except two unsigned inputs *t1* and *t2* yielding the full 6005e97a28aSMark Cave-Ayland double-word product *t0*. The latter is returned in two single-word outputs. 6015e97a28aSMark Cave-Ayland 6025e97a28aSMark Cave-Ayland * - muls2_i32/i64 *t0_low*, *t0_high*, *t1*, *t2* 6035e97a28aSMark Cave-Ayland 6045e97a28aSMark Cave-Ayland - | Similar to mulu2, except the two inputs *t1* and *t2* are signed. 6055e97a28aSMark Cave-Ayland 6065e97a28aSMark Cave-Ayland * - mulsh_i32/i64 *t0*, *t1*, *t2* 6075e97a28aSMark Cave-Ayland 6085e97a28aSMark Cave-Ayland muluh_i32/i64 *t0*, *t1*, *t2* 6095e97a28aSMark Cave-Ayland 6105e97a28aSMark Cave-Ayland - | Provide the high part of a signed or unsigned multiply, respectively. 6115e97a28aSMark Cave-Ayland | 6125e97a28aSMark Cave-Ayland | If mulu2/muls2 are not provided by the backend, the tcg-op generator 6135e97a28aSMark Cave-Ayland can obtain the same results by emitting a pair of opcodes, mul + muluh/mulsh. 6145e97a28aSMark Cave-Ayland 6155e97a28aSMark Cave-Ayland 6165e97a28aSMark Cave-AylandMemory Barrier support 6175e97a28aSMark Cave-Ayland---------------------- 6185e97a28aSMark Cave-Ayland 6195e97a28aSMark Cave-Ayland.. 
.. list-table::

   * - mb *<$arg>*

     - | Generate a target memory barrier instruction to ensure memory ordering
         as being enforced by a corresponding guest memory barrier instruction.
       |
       | The ordering enforced by the backend may be stricter than the ordering
         required by the guest. It cannot be weaker. This opcode takes a constant
         argument which is required to generate the appropriate barrier
         instruction. The backend should take care to emit the target barrier
         instruction only when necessary i.e., for SMP guests and when MTTCG is
         enabled.
       |
       | The guest translators should generate this opcode for all guest instructions
         which have ordering side effects.
       |
       | Please see :ref:`atomics-ref` for more information on memory barriers.


64-bit guest on 32-bit host support
-----------------------------------

The following opcodes are internal to TCG. Thus they are to be implemented by
32-bit host code generators, but are not to be emitted by guest translators.
They are emitted as needed by inline functions within ``tcg-op.h``.

.. list-table::

   * - brcond2_i32 *t0_low*, *t0_high*, *t1_low*, *t1_high*, *cond*, *label*

     - | Similar to brcond, except that the 64-bit values *t0* and *t1*
         are formed from two 32-bit arguments.

   * - setcond2_i32 *dest*, *t1_low*, *t1_high*, *t2_low*, *t2_high*, *cond*

     - | Similar to setcond, except that the 64-bit values *t1* and *t2* are
         formed from two 32-bit arguments. The result is a 32-bit value.


QEMU specific operations
------------------------

.. list-table::

   * - exit_tb *t0*

     - | Exit the current TB and return the value *t0* (word type).

   * - goto_tb *index*

     - | Exit the current TB and jump to the TB index *index* (constant) if the
         current TB was linked to this TB. Otherwise execute the next
         instructions. Only indices 0 and 1 are valid and tcg_gen_goto_tb may be issued
         at most once with each slot index per TB.

   * - lookup_and_goto_ptr *tb_addr*

     - | Look up a TB address *tb_addr* and jump to it if valid. If not valid,
         jump to the TCG epilogue to go back to the exec loop.
       |
       | This operation is optional. If the TCG backend does not implement the
         goto_ptr opcode, emitting this op is equivalent to emitting exit_tb(0).

   * - qemu_ld_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*

       qemu_st_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*

       qemu_st8_i32 *t0*, *t1*, *flags*, *memidx*

     - | Load data at the guest address *t1* into *t0*, or store data in *t0* at guest
         address *t1*. The _i32/_i64/_i128 size applies to the size of the input/output
         register *t0* only. The address *t1* is always sized according to the guest,
         and the width of the memory operation is controlled by *flags*.
       |
       | Both *t0* and *t1* may be split into little-endian ordered pairs of registers
         if dealing with 64-bit quantities on a 32-bit host, or 128-bit quantities on
         a 64-bit host.
       |
       | The *memidx* selects the qemu tlb index to use (e.g. user or kernel access).
         The flags are the MemOp bits, selecting the sign, width, and endianness
         of the memory access.
       |
       | For a 32-bit host, qemu_ld/st_i64 is guaranteed to only be used with a
         64-bit memory access specified in *flags*.
       |
       | For qemu_ld/st_i128, these are only supported for a 64-bit host.
       |
       | For i386, qemu_st8_i32 is exactly like qemu_st_i32, except the size of
         the memory operation is known to be 8-bit. This allows the backend to
         provide a different set of register constraints.


Host vector operations
----------------------

All of the vector ops have two parameters, ``TCGOP_VECL`` & ``TCGOP_VECE``.
The former specifies the length of the vector in log2 64-bit units; the
latter specifies the length of the element (if applicable) in log2 8-bit units.
E.g. VECL = 1 -> 64 << 1 -> v128, and VECE = 2 -> 1 << 2 -> i32.

.. list-table::

   * - mov_vec *v0*, *v1*

       ld_vec *v0*, *t1*

       st_vec *v0*, *t1*

     - | Move, load and store.

   * - dup_vec *v0*, *r1*

     - | Duplicate the low N bits of *r1* into VECL/VECE copies across *v0*.

   * - dupi_vec *v0*, *c*

     - | Similarly, for a constant.
       | Smaller values will be replicated to host register size by the expanders.

   * - dup2_vec *v0*, *r1*, *r2*

     - | Duplicate *r2*:*r1* into VECL/64 copies across *v0*.
         This opcode is only present for 32-bit hosts.

   * - add_vec *v0*, *v1*, *v2*

     - | *v0* = *v1* + *v2*, in elements across the vector.

   * - sub_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = *v1* - *v2*.

   * - mul_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = *v1* * *v2*.

   * - neg_vec *v0*, *v1*

     - | Similarly, *v0* = -*v1*.

   * - abs_vec *v0*, *v1*

     - | Similarly, *v0* = *v1* < 0 ? -*v1* : *v1*, in elements across the vector.

   * - smin_vec *v0*, *v1*, *v2*

       umin_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = MIN(*v1*, *v2*), for signed and unsigned element types.

   * - smax_vec *v0*, *v1*, *v2*

       umax_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = MAX(*v1*, *v2*), for signed and unsigned element types.


   * - ssadd_vec *v0*, *v1*, *v2*

       sssub_vec *v0*, *v1*, *v2*

       usadd_vec *v0*, *v1*, *v2*

       ussub_vec *v0*, *v1*, *v2*

     - | Signed and unsigned saturating addition and subtraction.
       |
       | If the true result is not representable within the element type, the
         element is set to the minimum or maximum value for the type.

   * - and_vec *v0*, *v1*, *v2*

       or_vec *v0*, *v1*, *v2*

       xor_vec *v0*, *v1*, *v2*

       andc_vec *v0*, *v1*, *v2*

       orc_vec *v0*, *v1*, *v2*

       not_vec *v0*, *v1*

     - | Similarly, logical operations with and without complement.
       |
       | Note that VECE is unused.

   * - shli_vec *v0*, *v1*, *i2*

       shls_vec *v0*, *v1*, *s2*

     - | Shift all elements from *v1* by a scalar *i2*/*s2*. I.e.

       .. code-block:: c

          for (i = 0; i < VECL/VECE; ++i) {
              v0[i] = v1[i] << s2;
          }

   * - shri_vec *v0*, *v1*, *i2*

       sari_vec *v0*, *v1*, *i2*

       rotli_vec *v0*, *v1*, *i2*

       shrs_vec *v0*, *v1*, *s2*

       sars_vec *v0*, *v1*, *s2*

     - | Similarly for logical and arithmetic right shift, and left rotate.

   * - shlv_vec *v0*, *v1*, *v2*

     - | Shift elements from *v1* by elements from *v2*. I.e.

       .. code-block:: c

          for (i = 0; i < VECL/VECE; ++i) {
              v0[i] = v1[i] << v2[i];
          }

   * - shrv_vec *v0*, *v1*, *v2*

       sarv_vec *v0*, *v1*, *v2*

       rotlv_vec *v0*, *v1*, *v2*

       rotrv_vec *v0*, *v1*, *v2*

     - | Similarly for logical and arithmetic right shift, and rotates.

   * - cmp_vec *v0*, *v1*, *v2*, *cond*

     - | Compare vectors by element, storing -1 for true and 0 for false.


   * - bitsel_vec *v0*, *v1*, *v2*, *v3*

     - | Bitwise select, *v0* = (*v2* & *v1*) | (*v3* & ~\ *v1*), across the entire vector.

   * - cmpsel_vec *v0*, *c1*, *c2*, *v3*, *v4*, *cond*

     - | Select elements based on comparison results:

       .. code-block:: c

          for (i = 0; i < n; ++i) {
              v0[i] = (c1[i] cond c2[i]) ? v3[i] : v4[i];
          }

**Note 1**: Some shortcuts are defined when the last operand is known to be
a constant (e.g. addi for add, movi for mov).

**Note 2**: When using TCG, the opcodes must never be generated directly
as some of them may not be available as "real" opcodes. Always use the
function tcg_gen_xxx(args).


Backend
=======

``tcg-target.h`` contains the target specific definitions. ``tcg-target.c.inc``
contains the target specific code; it is #included by ``tcg/tcg.c``, rather
than being a standalone C file.

Assumptions
-----------

The target word size (``TCG_TARGET_REG_BITS``) is expected to be 32 bit or
64 bit. It is expected that the pointer has the same size as the word.


On a 32 bit target, all 64 bit operations are converted to 32 bits. A
few specific operations must be implemented to allow it (see add2_i32,
sub2_i32, brcond2_i32).

On a 64 bit target, the values are transferred between 32 and 64-bit
registers using the following ops:

- extrl_i64_i32
- extrh_i64_i32
- ext_i32_i64
- extu_i32_i64

They ensure that the values are correctly truncated or extended when
moved from a 32-bit to a 64-bit register or vice-versa. Note that the
extrl_i64_i32 and extrh_i64_i32 are optional ops. It is not necessary
to implement them if all the following conditions are met:

- 64-bit registers can hold 32-bit values
- 32-bit values in a 64-bit register do not need to stay zero or
  sign extended
- all 32-bit TCG ops ignore the high part of 64-bit registers

Floating point operations are not supported in this version. A
previous incarnation of the code generator had full support of them,
but it is better to concentrate on integer operations first.

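
As a rough illustration of the transfer ops and of the double-word add that
a 32 bit target relies on, the following plain C sketch models their effect
on ordinary integers. It is not QEMU code: the ``model_*`` names are
hypothetical and exist only for this example, and the functions describe the
results of the ops, not how any backend implements them.

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical models for illustration only; not part of QEMU. */

   /* extrl_i64_i32: keep the low 32 bits of a 64-bit value. */
   static inline uint32_t model_extrl_i64_i32(uint64_t x)
   {
       return (uint32_t)x;
   }

   /* extrh_i64_i32: keep the high 32 bits of a 64-bit value. */
   static inline uint32_t model_extrh_i64_i32(uint64_t x)
   {
       return (uint32_t)(x >> 32);
   }

   /* ext_i32_i64: sign-extend a 32-bit value to 64 bits.
    * The xor/subtract form avoids implementation-defined casts. */
   static inline uint64_t model_ext_i32_i64(uint32_t x)
   {
       return ((uint64_t)x ^ 0x80000000ULL) - 0x80000000ULL;
   }

   /* extu_i32_i64: zero-extend a 32-bit value to 64 bits. */
   static inline uint64_t model_extu_i32_i64(uint32_t x)
   {
       return (uint64_t)x;
   }

   /* add2_i32: 64-bit addition expressed on 32-bit halves.
    * The carry out of the low word is added into the high word. */
   static inline void model_add2_i32(uint32_t *res_low, uint32_t *res_high,
                                     uint32_t a_low, uint32_t a_high,
                                     uint32_t b_low, uint32_t b_high)
   {
       uint32_t low = a_low + b_low;     /* wraps modulo 2^32 */
       uint32_t carry = low < a_low;     /* 1 if the low add overflowed */

       *res_low = low;
       *res_high = a_high + b_high + carry;
   }

For instance, ``model_add2_i32`` applied to the halves of 0x00000000ffffffff
and 1 produces a low word of 0 and a high word of 1, i.e. the carry out of
the low word propagates into the high word.
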

Constraints
-----------

GCC-like constraints are used to define the constraints of every
instruction. Memory constraints are not supported in this
version. Aliases are specified in the input operands as for GCC.

The same register may be used for both an input and an output, even when
they are not explicitly aliased. If an op expands to multiple target
instructions then care must be taken to avoid clobbering input values.
GCC style "early clobber" outputs are supported, with '``&``'.

A target can define specific register or constant constraints. If an
operation uses a constant input constraint which does not allow all
constants, it must also accept registers in order to have a fallback.
The constraint '``i``' is defined generically to accept any constant.
The constraint '``r``' is not defined generically, but is consistently
used by each backend to indicate all registers.

The movi_i32 and movi_i64 operations must accept any constants.

The mov_i32 and mov_i64 operations must accept any registers of the
same type.

The ld/st/sti instructions must accept signed 32 bit constant offsets.
This can be implemented by reserving a specific register in which to
compute the address if the offset is too big.

The ld/st instructions must accept any destination (ld) or source (st)
register.

The sti instruction may fail if it cannot store the given constant.

Function call assumptions
-------------------------

- The only supported types for parameters and return value are: 32 and
  64 bit integers and pointer.
- The stack grows downwards.
- The first N parameters are passed in registers.
- The next parameters are passed on the stack by storing them as words.
- Some registers are clobbered during the call.
- The function can return 0 or 1 value in registers. On a 32 bit
  target, functions must be able to return 2 values in registers for
  64 bit return type.


Recommended coding rules for best performance
=============================================

- Use globals to represent the parts of the QEMU CPU state which are
  often modified, e.g. the integer registers and the condition
  codes. TCG will be able to use host registers to store them.

- Don't hesitate to use helpers for complicated or seldom used guest
  instructions. There is little performance advantage in using TCG to
  implement guest instructions taking more than about twenty TCG
  instructions. Note that this rule of thumb is more applicable to
  helpers doing complex logic or arithmetic, where the C compiler has
  scope to do a good job of optimisation; it is less relevant where
  the instruction is mostly doing loads and stores, and in those cases
  inline TCG may still be faster for longer sequences.

- Use the 'discard' instruction if you know that TCG won't be able to
  prove that a given global is "dead" at a given program point. The
  x86 guest uses it to improve the condition codes optimisation.