Hexagon is Qualcomm's very long instruction word (VLIW) digital signal
processor (DSP).

The following versions of the Hexagon core are supported:
    Scalar core: v67
    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v67-programmer-s-reference-manual

We presented an overview of the project at the 2019 KVM Forum.
    https://kvmforum2019.sched.com/event/Tmwc/qemu-hexagon-automatic-translation-of-the-isa-manual-pseudcode-to-tiny-code-instructions-of-a-vliw-architecture-niccolo-izzo-revng-taylor-simpson-qualcomm-innovation-center

*** Tour of the code ***

The qemu-hexagon implementation is a combination of qemu and the Hexagon
architecture library (aka archlib).  The three primary directories with
Hexagon-specific code are

    qemu/target/hexagon
        This has all the instruction and packet semantics
    qemu/target/hexagon/imported
        These files are imported with very little modification from archlib
        *.idef                  Instruction semantics definition
        macros.def              Mapping of macros to instruction attributes
        encode*.def             Encoding patterns for each instruction
        iclass.def              Instruction class definitions used to determine
                                legal VLIW slots for each instruction
    qemu/linux-user/hexagon
        Helpers for loading the ELF file and making Linux system calls,
        signals, etc

We start with scripts that generate a set of include files.  This
is a two-step process.  The first step is to use the C preprocessor to expand
macros inside the architecture definition files.  This is done in
target/hexagon/gen_semantics.c.  This step produces
    <BUILD_DIR>/target/hexagon/semantics_generated.pyinc
That file is consumed by the following python scripts to produce the indicated
header files in <BUILD_DIR>/target/hexagon
        gen_opcodes_def.py              -> opcodes_def_generated.h.inc
        gen_op_regs.py                  -> op_regs_generated.h.inc
        gen_printinsn.py                -> printinsn_generated.h.inc
        gen_op_attribs.py               -> op_attribs_generated.h.inc
        gen_helper_protos.py            -> helper_protos_generated.h.inc
        gen_shortcode.py                -> shortcode_generated.h.inc
        gen_tcg_funcs.py                -> tcg_funcs_generated.c.inc
        gen_tcg_func_table.py           -> tcg_func_table_generated.c.inc
        gen_helper_funcs.py             -> helper_funcs_generated.c.inc
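
The first step can be pictured with a small stand-alone sketch.  This is not
the real gen_semantics.c: the DEMO_* names and the "tag|syntax|semantics"
output format are made up for illustration, and only the A2_add entry comes
from this document.  The idea is that a definition file is a list of macro
invocations, and the generator defines the macro so that each invocation
prints one record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for an imported definition file: one macro
 * invocation per instruction.  DEMO_sub is invented for this sketch.
 */
#define DEMO_INSN_LIST(X) \
    X(A2_add,   "Rd32=add(Rs32,Rt32)", "{ RdV=RsV+RtV;}") \
    X(DEMO_sub, "Rd32=sub(Rs32,Rt32)", "{ RdV=RsV-RtV;}")

/* Expand one entry into one "tag|syntax|semantics" line (made-up format). */
#define DEMO_EMIT(TAG, SYNTAX, SEM) \
    assert(n < len); \
    n += (size_t)snprintf(buf + n, len - n, "%s|%s|%s\n", #TAG, SYNTAX, SEM);

/* Write one line per instruction into buf and return the number of bytes. */
static size_t demo_gen_semantics(char *buf, size_t len)
{
    size_t n = 0;
    DEMO_INSN_LIST(DEMO_EMIT)
    assert(n < len);    /* the buffer was large enough, nothing truncated */
    return n;
}

/* Usage: expand the list and check that the first record is A2_add. */
static int demo_first_record_is_add(void)
{
    char buf[256];
    demo_gen_semantics(buf, sizeof(buf));
    return strncmp(buf, "A2_add|", 7) == 0;
}
```

Because the list is only ever touched by the preprocessor, the same entries
can be expanded again with a different macro to produce a different table.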

Qemu helper functions have 3 parts
    DEF_HELPER declaration indicates the signature of the helper
    gen_helper_<NAME> will generate a TCG call to the helper function
    The helper implementation

Here's an example of the A2_add instruction.
    Instruction tag        A2_add
    Assembly syntax        "Rd32=add(Rs32,Rt32)"
    Instruction semantics  "{ RdV=RsV+RtV;}"

By convention, the operands are identified by letter
    RdV is the destination register
    RsV, RtV are source registers

The generator uses the operand naming conventions (see large comment in
hex_common.py) to determine the signature of the helper function.  Here are the
results for A2_add

helper_protos_generated.h.inc
    DEF_HELPER_3(A2_add, s32, env, s32, s32)

tcg_funcs_generated.c.inc
    static void generate_A2_add(
                    CPUHexagonState *env,
                    DisasContext *ctx,
                    Insn *insn,
                    Packet *pkt)
    {
        TCGv RdV = tcg_temp_local_new();
        const int RdN = insn->regno[0];
        TCGv RsV = hex_gpr[insn->regno[1]];
        TCGv RtV = hex_gpr[insn->regno[2]];
        gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
        gen_log_reg_write(RdN, RdV);
        ctx_log_reg_write(ctx, RdN);
        tcg_temp_free(RdV);
    }

helper_funcs_generated.c.inc
    int32_t HELPER(A2_add)(CPUHexagonState *env, int32_t RsV, int32_t RtV)
    {
        uint32_t slot __attribute__((unused)) = 4;
        int32_t RdV = 0;
        { RdV=RsV+RtV;}
        return RdV;
    }

Note that generate_A2_add updates the disassembly context to be processed
when the packet commits (see "Packet Semantics" below).

The generator checks for an fGEN_TCG_<tag> macro.  This allows us to generate
TCG code instead of a call to the helper.  If defined, the macro takes 1
argument:
    C semantics (aka short code)

This allows the code generator to override the auto-generated code.  In some
cases this is necessary for correct execution.  We can also override for
faster emulation.  For example, calling a helper for add is more expensive
than generating a TCG add operation.

The gen_tcg.h file holds the overrides.  For example, we could write
    #define fGEN_TCG_A2_add(SHORTCODE) \
        tcg_gen_add_tl(RdV, RsV, RtV)

The instruction semantics C code relies heavily on macros.  In cases where the
C semantics are specified only with macros, we can override the default with
the short semantics option and #define the macros to generate TCG code.  One
example is L2_loadw_locked:
    Instruction tag        L2_loadw_locked
    Assembly syntax        "Rd32=memw_locked(Rs32)"
    Instruction semantics  "{ fEA_REG(RsV); fLOAD_LOCKED(1,4,u,EA,RdV) }"

In gen_tcg.h, we use the shortcode
    #define fGEN_TCG_L2_loadw_locked(SHORTCODE) \
        SHORTCODE

There are also cases where we write the TCG code generation by hand.
Instructions with multiple definitions (that is, more than one result) are
examples.  These require special handling because qemu helpers can only
return a single value.

In addition to instruction semantics, we use a generator to create the decode
tree.  This generation is also a two-step process.  The first step is to run
target/hexagon/gen_dectree_import.c to produce
    <BUILD_DIR>/target/hexagon/iset.py
This file is imported by target/hexagon/dectree.py to produce
    <BUILD_DIR>/target/hexagon/dectree_generated.h.inc
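
To decode an instruction word, a decoder needs to know which bits an encoding
pattern fixes and which bits carry operands.  The sketch below shows that
general idea only.  It is not the algorithm used by dectree.py, the demo_*
names are hypothetical, and the 8-bit pattern "0110 ssdd" is invented (the
real patterns live in imported/encode*.def).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t mask;   /* 1 where the pattern fixes the bit */
    uint32_t match;  /* required value of the fixed bits */
} DemoEnc;

/*
 * Convert a pattern string (most significant bit first) to mask/match.
 * '0' and '1' are fixed bits; any other character (operand letters, '-')
 * is treated as not fixed.  Spaces are ignored.
 */
static DemoEnc demo_parse_pattern(const char *pat)
{
    DemoEnc enc = { 0, 0 };
    for (; *pat; pat++) {
        if (*pat == ' ') {
            continue;
        }
        enc.mask <<= 1;
        enc.match <<= 1;
        if (*pat == '0' || *pat == '1') {
            enc.mask |= 1;
            enc.match |= (uint32_t)(*pat - '0');
        }
    }
    return enc;
}

/* An instruction word matches when all fixed bits have the required value. */
static int demo_matches(DemoEnc enc, uint32_t word)
{
    return (word & enc.mask) == enc.match;
}

/* Collect the bits marked with `letter` into one operand field. */
static uint32_t demo_extract(const char *pat, char letter, uint32_t word)
{
    size_t nbits = 0;
    for (const char *p = pat; *p; p++) {
        nbits += (*p != ' ');
    }
    assert(nbits <= 32);

    uint32_t field = 0;
    size_t bit = nbits;
    for (const char *p = pat; *p; p++) {
        if (*p == ' ') {
            continue;
        }
        bit--;
        if (*p == letter) {
            field = (field << 1) | ((word >> bit) & 1);
        }
    }
    return field;
}
```

With the pattern "0110 ssdd", the word 0x6b matches and yields s = 2 and
d = 3, while 0x7b does not match because its upper four bits differ.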

*** Key Files ***

cpu.h

This file contains the definition of the CPUHexagonState struct.  It is the
runtime information for each thread and contains state such as the GPR and
predicate registers.

macros.h

The Hexagon archlib relies heavily on macros for the instruction semantics.
This is a great advantage for qemu because we can override them for different
purposes.  You will also notice there are sometimes two definitions of a macro.
The QEMU_GENERATE preprocessor symbol determines whether we want the macro to
generate TCG code.  If QEMU_GENERATE is not defined, we want the macro to
generate vanilla C code that will work in the helper implementation.
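
Here is a minimal sketch of how one semantics description can be given two
meanings.  It is an illustration, not code from macros.h: fDEMO_ADD and the
demo_* functions are hypothetical, the "generate" variant just formats a
string instead of emitting TCG, and the sketch switches definitions with
#undef inside one file rather than with a symbol such as QEMU_GENERATE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One shared description of the semantics, written only with a macro. */
#define DEMO_SEMANTICS fDEMO_ADD(RdV, RsV, RtV)

/* Meaning 1: plain C, as a helper implementation would want. */
#define fDEMO_ADD(D, S, T) ((D) = (S) + (T))
static int32_t demo_helper_add(int32_t RsV, int32_t RtV)
{
    int32_t RdV = 0;
    DEMO_SEMANTICS;
    return RdV;
}
#undef fDEMO_ADD

/* Meaning 2: describe the operation instead of performing it. */
#define fDEMO_ADD(D, S, T) \
    snprintf(buf, len, "add %s, %s, %s", #D, #S, #T)
static void demo_generate_add(char *buf, size_t len)
{
    DEMO_SEMANTICS;
}
#undef fDEMO_ADD

/* Usage: the same DEMO_SEMANTICS text produced two different behaviors. */
static void demo_macro_check(void)
{
    char buf[32];
    assert(demo_helper_add(2, 3) == 5);
    demo_generate_add(buf, sizeof(buf));
    assert(strcmp(buf, "add RdV, RsV, RtV") == 0);
}
```

This works because DEMO_SEMANTICS is expanded where it is used, so it picks
up whichever definition of fDEMO_ADD is in effect at that point.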

translate.c

The functions in this file generate TCG code for a translation block.  Some
important functions in this file are

    gen_start_packet - initialize the data structures for packet semantics
    gen_commit_packet - commit the register writes, stores, etc for a packet
    decode_and_translate_packet - disassemble a packet and generate code

genptr.c
gen_tcg.h

These files create a function for each instruction.  They are mostly composed
of fGEN_TCG_<tag> definitions followed by including tcg_funcs_generated.c.inc.

op_helper.c

This file contains the implementations of all the helpers.  There are a few
general purpose helpers, but most of them are generated by including
helper_funcs_generated.c.inc.  There are also several helpers used for debugging.


*** Packet Semantics ***

VLIW packet semantics differ from serial semantics in that all input operands
are read, then the operations are performed, then all the results are written.
For example, this packet performs a swap of registers r0 and r1:
    { r0 = r1; r1 = r0 }
Note that the result is different if the instructions are executed serially.
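
The difference can be checked with a few lines of plain C.  This is only a
model of the ordering described above; the demo_* names are hypothetical and
nothing here is taken from the qemu sources.

```c
#include <assert.h>
#include <stdint.h>

/* Serial execution: the second move sees the result of the first one. */
static void demo_run_serial(uint32_t r[2])
{
    r[0] = r[1];
    r[1] = r[0];
}

/* Packet execution: read every input, compute, then write every result. */
static void demo_run_packet(uint32_t r[2])
{
    uint32_t in0 = r[0];    /* read phase */
    uint32_t in1 = r[1];
    uint32_t out0 = in1;    /* execute phase */
    uint32_t out1 = in0;
    r[0] = out0;            /* write phase */
    r[1] = out1;
}

/* Usage: start both runs from r0 = 1, r1 = 2 and compare the outcomes. */
static void demo_swap_check(void)
{
    uint32_t serial[2] = { 1, 2 };
    uint32_t packet[2] = { 1, 2 };

    demo_run_serial(serial);
    demo_run_packet(packet);

    assert(serial[0] == 2 && serial[1] == 2);   /* r1 was overwritten */
    assert(packet[0] == 2 && packet[1] == 1);   /* a real swap */
}
```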

Packet semantics dictate that we defer any changes of state until the entire
packet is committed.  We record the results of each instruction in a side data
structure, and update the visible processor state when we commit the packet.

The data structures are divided between the runtime state and the translation
context.

During the TCG generation (see translate.[ch]), we use the DisasContext to
track what needs to be done during packet commit.  Here are the relevant
fields

    reg_log            list of registers written
    reg_log_idx        index into reg_log
    pred_log           list of predicates written
    pred_log_idx       index into pred_log
    store_width        width of stores (indexed by slot)

During runtime, the following fields in CPUHexagonState (see cpu.h) are used

    new_value             new value of a given register
    reg_written           boolean indicating if register was written
    new_pred_value        new value of a predicate register
    pred_written          boolean indicating if predicate was written
    mem_log_stores        record of the stores (indexed by slot)
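
The deferred-write idea can be modeled in plain C as follows.  The field names
new_value and reg_written are borrowed from the list above; DemoState, the
demo_* functions and the register count are hypothetical.  The real
CPUHexagonState also tracks predicates and stores, and qemu emits TCG for the
commit rather than running a C loop like this one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { DEMO_REGS = 4 };

typedef struct {
    uint32_t gpr[DEMO_REGS];        /* architecturally visible registers */
    uint32_t new_value[DEMO_REGS];  /* pending result for each register */
    bool reg_written[DEMO_REGS];    /* true if the packet wrote the register */
} DemoState;

/* Record a result without touching the visible register. */
static void demo_log_reg_write(DemoState *env, int rnum, uint32_t val)
{
    assert(rnum >= 0 && rnum < DEMO_REGS);
    env->new_value[rnum] = val;
    env->reg_written[rnum] = true;
}

/* Make all pending results visible at once, then clear the log. */
static void demo_commit_packet(DemoState *env)
{
    for (int i = 0; i < DEMO_REGS; i++) {
        if (env->reg_written[i]) {
            env->gpr[i] = env->new_value[i];
            env->reg_written[i] = false;
        }
    }
}
```

Instructions in a packet read env->gpr and call demo_log_reg_write, so every
read sees the state from before the packet.  Registers that were never
written keep their old value at commit.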

*** Debugging ***

You can turn on a lot of debugging by changing the HEX_DEBUG macro to 1 in
internal.h.  This will stream a lot of information as it generates TCG and
executes the code.

To track down nasty issues with Hexagon->TCG generation, we compare the
execution results with actual hardware running on a Hexagon Linux target.
Run qemu with the "-d cpu" option.  Then, we can diff the results and figure
out where qemu and hardware behave differently.

The stack is located at a different address under qemu than on hardware.  We
handle this by changing env->stack_adjust in translate.c.  First, set this to
zero and run qemu.  Then, change env->stack_adjust to the difference between
the two stack locations.  Then rebuild qemu and run again.  That will produce
a very clean diff.
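
The arithmetic behind that procedure can be sketched as below.  This is an
illustration under stated assumptions, not qemu code: the demo_* functions are
hypothetical, the traces are reduced to arrays of stack pointer values, and
the sketch assumes the adjustment is subtracted from the qemu side.  The
direction actually used by env->stack_adjust should be checked in the source.

```c
#include <assert.h>
#include <stdint.h>

/* Difference between the stack locations seen in the two traces. */
static uint32_t demo_stack_adjust(uint32_t qemu_sp, uint32_t hw_sp)
{
    return qemu_sp - hw_sp;     /* unsigned arithmetic wraps, by design */
}

/*
 * Compare two traces of stack pointer values after rebasing the qemu
 * values by `adjust`.  Return the index of the first entry that still
 * differs, or -1 if the traces agree.
 */
static int demo_first_mismatch(const uint32_t *qemu_sp, const uint32_t *hw_sp,
                               int n, uint32_t adjust)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if ((uint32_t)(qemu_sp[i] - adjust) != hw_sp[i]) {
            return i;
        }
    }
    return -1;
}
```

With an adjustment of zero every stack address differs and the diff is noisy.
With the right adjustment only genuine divergences remain.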

Here are some handy places to set breakpoints

    At the call to gen_start_packet for a given PC (note that the line number
        might change in the future)
        br translate.c:602 if ctx->base.pc_next == 0xdeadbeef
    The helper function for each instruction is named helper_<TAG>, so here's
        an example that will set a breakpoint at the start
        br helper_A2_add
    If you have the HEX_DEBUG macro set, the following will be useful
        At the start of execution of a packet for a given PC
            br helper_debug_start_packet if env->gpr[41] == 0xdeadbeef
        At the end of execution of a packet for a given PC
            br helper_debug_commit_end if env->this_PC == 0xdeadbeef