Hexagon is Qualcomm's very long instruction word (VLIW) digital signal
processor (DSP).  We also support Hexagon Vector eXtensions (HVX).  HVX
is a wide vector coprocessor designed for high performance computer vision,
image processing, machine learning, and other workloads.

The following versions of the Hexagon core are supported
    Scalar core: v67
    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v67-programmer-s-reference-manual
    HVX extension: v66
    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v66-hvx-programmer-s-reference-manual

We presented an overview of the project at the 2019 KVM Forum.
    https://kvmforum2019.sched.com/event/Tmwc/qemu-hexagon-automatic-translation-of-the-isa-manual-pseudcode-to-tiny-code-instructions-of-a-vliw-architecture-niccolo-izzo-revng-taylor-simpson-qualcomm-innovation-center

*** Tour of the code ***

The qemu-hexagon implementation is a combination of qemu and the Hexagon
architecture library (aka archlib).  The four primary directories with
Hexagon-specific code are

    qemu/target/hexagon
        This has all the instruction and packet semantics
    qemu/target/hexagon/imported
        These files are imported with very little modification from archlib
        *.idef                  Instruction semantics definition
        macros.def              Mapping of macros to instruction attributes
        encode*.def             Encoding patterns for each instruction
        iclass.def              Instruction class definitions used to determine
                                legal VLIW slots for each instruction
    qemu/target/hexagon/idef-parser
        Parser that, given the high-level definitions of an instruction,
        produces a C function generating equivalent tiny code instructions.
        See README.rst.
    qemu/linux-user/hexagon
        Helpers for loading the ELF file and making Linux system calls,
        signals, etc

We start with scripts that generate a bunch of include files.  This
is a two-step process.  The first step is to use the C preprocessor to expand
macros inside the architecture definition files.  This is done in
target/hexagon/gen_semantics.c.  This step produces
    <BUILD_DIR>/target/hexagon/semantics_generated.pyinc.
That file is consumed by the following python scripts to produce the indicated
header files in <BUILD_DIR>/target/hexagon
        gen_opcodes_def.py              -> opcodes_def_generated.h.inc
        gen_op_regs.py                  -> op_regs_generated.h.inc
        gen_printinsn.py                -> printinsn_generated.h.inc
        gen_op_attribs.py               -> op_attribs_generated.h.inc
        gen_helper_protos.py            -> helper_protos_generated.h.inc
        gen_shortcode.py                -> shortcode_generated.h.inc
        gen_tcg_funcs.py                -> tcg_funcs_generated.c.inc
        gen_tcg_func_table.py           -> tcg_func_table_generated.c.inc
        gen_helper_funcs.py             -> helper_funcs_generated.c.inc
        gen_idef_parser_funcs.py        -> idef_parser_input.h
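The first step can be pictured with a small standalone sketch.  This is not
gen_semantics.c itself: TOY_IDEF, EMIT, emit_semantics and the '|'-separated
output format are made up for illustration, and only the two instruction
entries are taken from the examples in this README.  It shows the general
idea, which is that the C preprocessor turns each instruction definition into
a line of text that a later script can parse.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for an imported definition file: one entry per
 * instruction (tag, assembly syntax, semantics).
 */
#define TOY_IDEF(INSN) \
    INSN(A2_add, "Rd32=add(Rs32,Rt32)", { RdV=RsV+RtV;}) \
    INSN(L2_loadw_locked, "Rd32=memw_locked(Rs32)", \
         { fEA_REG(RsV); fLOAD_LOCKED(1,4,u,EA,RdV) })

/*
 * Stringify the tag and the semantics so they are written out as text
 * instead of being compiled.  The semantics are never expanded here.
 */
#define EMIT(TAG, SYNTAX, SEM) \
    fprintf(out, "%s|%s|%s\n", #TAG, SYNTAX, #SEM);

/* Write one line per instruction to the given stream. */
void emit_semantics(FILE *out)
{
    TOY_IDEF(EMIT)
}
```

Calling emit_semantics(stdout) prints one line per entry.  A script consuming
that text would play the role of the python generators listed above.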

Qemu helper functions have 3 parts
    DEF_HELPER declaration indicates the signature of the helper
    gen_helper_<NAME> will generate a TCG call to the helper function
    The helper implementation

Here's an example of the A2_add instruction.
    Instruction tag        A2_add
    Assembly syntax        "Rd32=add(Rs32,Rt32)"
    Instruction semantics  "{ RdV=RsV+RtV;}"

By convention, the operands are identified by letter
    RdV is the destination register
    RsV, RtV are source registers

The generator uses the operand naming conventions (see large comment in
hex_common.py) to determine the signature of the helper function.  Here are the
results for A2_add

helper_protos_generated.h.inc
    DEF_HELPER_3(A2_add, s32, env, s32, s32)

tcg_funcs_generated.c.inc
    static void generate_A2_add(
                    CPUHexagonState *env,
                    DisasContext *ctx,
                    Insn *insn,
                    Packet *pkt)
    {
        TCGv RdV = tcg_temp_new();
        const int RdN = insn->regno[0];
        TCGv RsV = hex_gpr[insn->regno[1]];
        TCGv RtV = hex_gpr[insn->regno[2]];
        gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
        gen_log_reg_write(RdN, RdV);
        ctx_log_reg_write(ctx, RdN);
        tcg_temp_free(RdV);
    }

helper_funcs_generated.c.inc
    int32_t HELPER(A2_add)(CPUHexagonState *env, int32_t RsV, int32_t RtV)
    {
        uint32_t slot __attribute__((unused)) = 4;
        int32_t RdV = 0;
        { RdV=RsV+RtV;}
        return RdV;
    }

Note that generate_A2_add updates the disassembly context to be processed
when the packet commits (see "Packet Semantics" below).

The generator checks for an fGEN_TCG_<tag> macro.  This allows us to generate
TCG code instead of a call to the helper.  If defined, the macro takes 1
argument.
    C semantics (aka short code)

This allows the code generator to override the auto-generated code.  In some
cases this is necessary for correct execution.  We can also override for
faster emulation.  For example, calling a helper for add is more expensive
than generating a TCG add operation.

The gen_tcg.h file holds the overrides.  For example, we could write
    #define fGEN_TCG_A2_add(SHORTCODE) \
        tcg_gen_add_tl(RdV, RsV, RtV)

The instruction semantics C code relies heavily on macros.  In cases where the
C semantics are specified only with macros, we can override the default with
the short semantics option and #define the macros to generate TCG code.  One
example is L2_loadw_locked:
    Instruction tag        L2_loadw_locked
    Assembly syntax        "Rd32=memw_locked(Rs32)"
    Instruction semantics  "{ fEA_REG(RsV); fLOAD_LOCKED(1,4,u,EA,RdV) }"

In gen_tcg.h, we use the shortcode
    #define fGEN_TCG_L2_loadw_locked(SHORTCODE) \
        SHORTCODE

There are also cases where we brute force the TCG code generation.
Instructions that write more than one register (multiple definitions) are
examples.  These require special handling because qemu helpers can only return
a single value.

For HVX vectors, the generator behaves slightly differently.  The wide vectors
won't fit in a TCGv or TCGv_i64, so we use TCGv_ptr variables to pass the
addresses to helper functions.  Here's an example for an HVX vector-add-word
instruction.
    static void generate_V6_vaddw(
                    CPUHexagonState *env,
                    DisasContext *ctx,
                    Insn *insn,
                    Packet *pkt)
    {
        const int VdN = insn->regno[0];
        const intptr_t VdV_off =
            ctx_future_vreg_off(ctx, VdN, 1, true);
        TCGv_ptr VdV = tcg_temp_new_ptr();
        tcg_gen_addi_ptr(VdV, cpu_env, VdV_off);
        const int VuN = insn->regno[1];
        const intptr_t VuV_off =
            vreg_src_off(ctx, VuN);
        TCGv_ptr VuV = tcg_temp_new_ptr();
        const int VvN = insn->regno[2];
        const intptr_t VvV_off =
            vreg_src_off(ctx, VvN);
        TCGv_ptr VvV = tcg_temp_new_ptr();
        tcg_gen_addi_ptr(VuV, cpu_env, VuV_off);
        tcg_gen_addi_ptr(VvV, cpu_env, VvV_off);
        TCGv slot = tcg_constant_tl(insn->slot);
        gen_helper_V6_vaddw(cpu_env, VdV, VuV, VvV, slot);
        tcg_temp_free(slot);
        gen_log_vreg_write(ctx, VdV_off, VdN, EXT_DFL, insn->slot, false);
        ctx_log_vreg_write(ctx, VdN, EXT_DFL, false);
        tcg_temp_free_ptr(VdV);
        tcg_temp_free_ptr(VuV);
        tcg_temp_free_ptr(VvV);
    }

Notice that we also generate a variable named <operand>_off for each operand of
the instruction.  This makes it easy to override the instruction semantics with
functions from tcg-op-gvec.h.  Here's the override for this instruction.
    #define fGEN_TCG_V6_vaddw(SHORTCODE) \
        tcg_gen_gvec_add(MO_32, VdV_off, VuV_off, VvV_off, \
                         sizeof(MMVector), sizeof(MMVector))

Finally, we notice that the override doesn't use the TCGv_ptr variables, so
we don't generate them when an override is present.  Here is what we generate
when the override is present.
    static void generate_V6_vaddw(
                    CPUHexagonState *env,
                    DisasContext *ctx,
                    Insn *insn,
                    Packet *pkt)
    {
        const int VdN = insn->regno[0];
        const intptr_t VdV_off =
            ctx_future_vreg_off(ctx, VdN, 1, true);
        const int VuN = insn->regno[1];
        const intptr_t VuV_off =
            vreg_src_off(ctx, VuN);
        const int VvN = insn->regno[2];
        const intptr_t VvV_off =
            vreg_src_off(ctx, VvN);
        fGEN_TCG_V6_vaddw({ fHIDE(int i;) fVFOREACH(32, i) { VdV.w[i] = VuV.w[i] + VvV.w[i] ; } });
        gen_log_vreg_write(ctx, VdV_off, VdN, EXT_DFL, insn->slot, false);
        ctx_log_vreg_write(ctx, VdN, EXT_DFL, false);
    }

In addition to instruction semantics, we use a generator to create the decode
tree.  This generation is also a two-step process.  The first step is to run
target/hexagon/gen_dectree_import.c to produce
    <BUILD_DIR>/target/hexagon/iset.py
This file is imported by target/hexagon/dectree.py to produce
    <BUILD_DIR>/target/hexagon/dectree_generated.h.inc

*** Key Files ***

cpu.h

This file contains the definition of the CPUHexagonState struct.  It holds the
runtime state for each thread, such as the GPR and predicate registers.

macros.h
mmvec/macros.h

The Hexagon archlib relies heavily on macros for the instruction semantics.
This is a great advantage for qemu because we can override them for different
purposes.  You will also notice there are sometimes two definitions of a macro.
The QEMU_GENERATE macro determines whether we want the macro to generate TCG
code.  If QEMU_GENERATE is not defined, we want the macro to generate vanilla
C code that will work in the helper implementation.
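As a rough sketch of the two-definition pattern, consider the snippet below.
fTOY_ADD and toy_helper_add are hypothetical names invented for this README
and do not appear in macros.h.  The TCG branch only compiles inside the qemu
tree.  Built on its own without QEMU_GENERATE, the macro expands to plain C,
which is the form a helper implementation would see.

```c
#include <stdint.h>

#ifdef QEMU_GENERATE
/* Translation time: emit a TCG add operation (needs the qemu TCG headers). */
#define fTOY_ADD(DST, A, B) tcg_gen_add_tl(DST, A, B)
#else
/* Helper implementation: perform the add directly in C. */
#define fTOY_ADD(DST, A, B) ((DST) = (A) + (B))
#endif

/* Hypothetical helper body written once in terms of the macro. */
int32_t toy_helper_add(int32_t RsV, int32_t RtV)
{
    int32_t RdV = 0;
    fTOY_ADD(RdV, RsV, RtV);
    return RdV;
}
```

The point of the pattern is that the same semantics text can be compiled
either way, depending on where it is included.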

translate.c

The functions in this file generate TCG code for a translation block.  Some
important functions in this file are

    gen_start_packet - initialize the data structures for packet semantics
    gen_commit_packet - commit the register writes, stores, etc for a packet
    decode_and_translate_packet - disassemble a packet and generate code

genptr.c
gen_tcg.h

These files create a function for each instruction.  They are mostly composed
of fGEN_TCG_<tag> definitions followed by including tcg_funcs_generated.c.inc.

op_helper.c

This file contains the implementations of all the helpers.  There are a few
general purpose helpers, but most of them are generated by including
helper_funcs_generated.c.inc.  There are also several helpers used for debugging.

*** Packet Semantics ***

VLIW packet semantics differ from serial semantics in that all input operands
are read, then the operations are performed, then all the results are written.
For example, this packet performs a swap of registers r0 and r1
    { r0 = r1; r1 = r0 }
Note that the result is different if the instructions are executed serially.
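The difference can be checked with a few lines of plain C.  The Regs struct
and the two functions are hypothetical and exist only for this illustration.
They model the swap packet above, not any qemu data structure.

```c
#include <stdint.h>

/* Hypothetical two-register file used only in this illustration. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
} Regs;

/* Serial semantics: the second assignment sees the first one's result. */
Regs run_serial(Regs in)
{
    Regs out = in;
    out.r0 = out.r1;
    out.r1 = out.r0;
    return out;
}

/* Packet semantics: read all the sources first, then write all results. */
Regs run_packet(Regs in)
{
    Regs out = in;
    uint32_t new_r0 = in.r1;
    uint32_t new_r1 = in.r0;
    out.r0 = new_r0;
    out.r1 = new_r1;
    return out;
}
```

Starting from r0 = 1 and r1 = 2, run_serial leaves both registers holding 2,
while run_packet swaps them.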

Packet semantics dictate that we defer any changes of state until the entire
packet is committed.  We record the results of each instruction in a side data
structure, and update the visible processor state when we commit the packet.

The data structures are divided between the runtime state and the translation
context.

During the TCG generation (see translate.[ch]), we use the DisasContext to
track what needs to be done during packet commit.  Here are the relevant
fields

    reg_log            list of registers written
    reg_log_idx        index into reg_log
    pred_log           list of predicates written
    pred_log_idx       index into pred_log
    store_width        width of stores (indexed by slot)

During runtime, the following fields in CPUHexagonState (see cpu.h) are used

    new_value             new value of a given register
    reg_written           boolean indicating if register was written
    new_pred_value        new value of a predicate register
    pred_written          boolean indicating if predicate was written
    mem_log_stores        record of the stores (indexed by slot)
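The deferred-write idea can be modelled in a few lines.  This is a simplified
sketch, not the code in translate.c or cpu.h: ToyState, TOY_NUM_REGS,
toy_log_reg_write and toy_commit_packet are hypothetical, and only the field
names new_value and reg_written are borrowed from the list above.  The real
implementation decides what to commit at translation time using the
DisasContext logs.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_REGS 4  /* hypothetical, kept small for illustration */

typedef struct {
    uint32_t gpr[TOY_NUM_REGS];         /* visible register state */
    uint32_t new_value[TOY_NUM_REGS];   /* pending results */
    bool reg_written[TOY_NUM_REGS];     /* which results are pending */
} ToyState;

/* An instruction records its result instead of updating gpr directly. */
void toy_log_reg_write(ToyState *env, int rnum, uint32_t val)
{
    env->new_value[rnum] = val;
    env->reg_written[rnum] = true;
}

/* At the end of the packet, copy every pending result into gpr. */
void toy_commit_packet(ToyState *env)
{
    for (int i = 0; i < TOY_NUM_REGS; i++) {
        if (env->reg_written[i]) {
            env->gpr[i] = env->new_value[i];
            env->reg_written[i] = false;
        }
    }
}
```

With this model, the swap packet logs r1's old value for r0 and r0's old value
for r1, and neither register changes until toy_commit_packet runs.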

For Hexagon Vector eXtensions (HVX), the following fields are used
    VRegs                       Vector registers
    future_VRegs                Registers to be stored during packet commit
    tmp_VRegs                   Temporary registers *not* stored during commit
    VRegs_updated               Mask of predicated vector writes
    QRegs                       Q (vector predicate) registers
    future_QRegs                Registers to be stored during packet commit
    QRegs_updated               Mask of predicated vector writes

*** Debugging ***

You can turn on a lot of debugging by changing the HEX_DEBUG macro to 1 in
internal.h.  This will stream a lot of information as it generates TCG and
executes the code.

To track down nasty issues with Hexagon->TCG generation, we compare the
execution results with actual hardware running on a Hexagon Linux target.
Run qemu with the "-d cpu" option.  Then, we can diff the results and figure
out where qemu and hardware behave differently.

The stacks are located at different addresses in qemu and on hardware.  We
handle this by changing env->stack_adjust in translate.c.  First, set this to
zero and run qemu.  Then, change env->stack_adjust to the difference between
the two stack locations.  Then rebuild qemu and run again.  That will produce
a very clean diff.

Here are some handy places to set breakpoints

    At the call to gen_start_packet for a given PC (note that the line number
        might change in the future)
        br translate.c:602 if ctx->base.pc_next == 0xdeadbeef
    The helper function for each instruction is named helper_<TAG>, so here's
        an example that will set a breakpoint at the start
        br helper_A2_add
    If you have the HEX_DEBUG macro set, the following will be useful
        At the start of execution of a packet for a given PC
            br helper_debug_start_packet if env->gpr[41] == 0xdeadbeef
        At the end of execution of a packet for a given PC
            br helper_debug_commit_end if env->this_PC == 0xdeadbeef