========================
Decodetree Specification
========================

A *decodetree* is built from instruction *patterns*.  A pattern may
represent a single architectural instruction or a group of same, depending
on what is convenient for further processing.

Each pattern has both *fixedbits* and *fixedmask*, the combination of which
describes the condition under which the pattern is matched::

  (insn & fixedmask) == fixedbits

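The match condition can be modelled in a few lines of Python. This is only
an illustrative sketch: the ``matches`` helper and the mask and instruction
values are hypothetical and are not part of the generated decoder.

```python
def matches(insn: int, fixedmask: int, fixedbits: int) -> bool:
    """Return True when every bit selected by fixedmask equals fixedbits."""
    return (insn & fixedmask) == fixedbits


# Hypothetical 32-bit pattern: the top six bits must be 010000.
FIXEDMASK = 0xFC000000
FIXEDBITS = 0x40000000

hit = matches(0x40221003, FIXEDMASK, FIXEDBITS)   # top bits are 010000
miss = matches(0x44221003, FIXEDMASK, FIXEDBITS)  # top bits are 010001
```

Bits outside ``fixedmask`` never influence the result, which is what leaves
them free to carry fields.
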
Each pattern may have *fields*, which are extracted from the insn and
passed along to the translator.  Examples of such are registers,
immediates, and sub-opcodes.

In support of patterns, one may declare *fields*, *argument sets*, and
*formats*, each of which may be re-used to simplify further definitions.

Fields
======

Syntax::

  field_def     := '%' identifier ( field )* ( !function=identifier )?
  field         := unnamed_field | named_field
  unnamed_field := number ':' ( 's' )? number
  named_field   := identifier ':' ( 's' )? number

For *unnamed_field*, the first number is the least-significant bit position
of the field and the second number is the length of the field.  If the 's' is
present, the field is considered signed.

A *named_field* refers to some other field in the instruction pattern
or format. Regardless of the length of the other field where it is
defined, it will be inserted into this field with the specified
signedness and bit width.

Field definitions that involve loops (i.e. where a field is defined
directly or indirectly in terms of itself) are errors.

A format can include fields that refer to named fields that are
defined in the instruction pattern(s) that use the format.
Conversely, an instruction pattern can include fields that refer to
named fields that are defined in the format it uses. However, you
cannot currently do both at once (i.e. pattern P uses format F; F has
a field A that refers to a named field B that is defined in P, and P
has a field C that refers to a named field D that is defined in F).

If multiple ``fields`` are present, they are concatenated.
In this way one can define disjoint fields.

If ``!function`` is specified, the concatenated result is passed through the
named function, taking and returning an integral value.

One may use ``!function`` with zero ``fields``.  This case is called
a *parameter*, and the named function is only passed the ``DisasContext``
and returns an integral value extracted from there.

A field with no ``fields`` and no ``!function`` is an error.

Field examples:

+---------------------------+---------------------------------------------+
| Input                     | Generated code                              |
+===========================+=============================================+
| %disp   0:s16             | sextract(i, 0, 16)                          |
+---------------------------+---------------------------------------------+
| %imm9   16:6 10:3         | extract(i, 16, 6) << 3 | extract(i, 10, 3)  |
+---------------------------+---------------------------------------------+
| %disp12 0:s1 1:1 2:10     | sextract(i, 0, 1) << 11 |                   |
|                           |    extract(i, 1, 1) << 10 |                 |
|                           |    extract(i, 2, 10)                        |
+---------------------------+---------------------------------------------+
| %shimm8 5:s8 13:1         | expand_shimm8(sextract(i, 5, 8) << 1 |      |
|   !function=expand_shimm8 |               extract(i, 13, 1))            |
+---------------------------+---------------------------------------------+
| %sz_imm 10:2 sz:3         | expand_sz_imm(extract(i, 10, 2) << 3 |      |
|   !function=expand_sz_imm |               extract(a->sz, 0, 3))         |
+---------------------------+---------------------------------------------+

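To make the concatenation and sign-extension rules concrete, the first three
table rows can be modelled in Python. This is a sketch only: ``extract`` and
``sextract`` below are stand-in reimplementations written for this example
(the real decoder is generated as C), and the helper names ``disp``, ``imm9``
and ``disp12`` are hypothetical.

```python
def extract(value: int, pos: int, length: int) -> int:
    """Unsigned bitfield: 'length' bits of 'value' starting at bit 'pos'."""
    return (value >> pos) & ((1 << length) - 1)


def sextract(value: int, pos: int, length: int) -> int:
    """Signed bitfield: as extract(), then sign-extend from the top bit."""
    field = extract(value, pos, length)
    sign = 1 << (length - 1)
    return (field ^ sign) - sign


def disp(insn: int) -> int:
    # %disp 0:s16
    return sextract(insn, 0, 16)


def imm9(insn: int) -> int:
    # %imm9 16:6 10:3 -- the first sub-field ends up most significant
    return extract(insn, 16, 6) << 3 | extract(insn, 10, 3)


def disp12(insn: int) -> int:
    # %disp12 0:s1 1:1 2:10 -- only the leading sub-field carries the sign
    return (sextract(insn, 0, 1) << 11
            | extract(insn, 1, 1) << 10
            | extract(insn, 2, 10))
```

Each sub-field is shifted left by the combined width of the sub-fields that
follow it, so sub-fields listed earlier occupy the more significant bits of
the result.
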
Argument Sets
=============

Syntax::

  args_def    := '&' identifier ( args_elt )+ ( !extern )?
  args_elt    := identifier (':' identifier)?

Each *args_elt* defines an argument within the argument set.
If the form of the *args_elt* contains a colon, the first
identifier is the argument name and the second identifier is
the argument type.  If the colon is missing, the argument
type will be ``int``.

Each argument set will be rendered as a C structure "arg_$name"
with each of the fields being one of the member arguments.

If ``!extern`` is specified, the backing structure is assumed
to have been already declared, typically via a second decoder.

Argument sets are useful when one wants to define helper functions
for the translator functions that can perform operations on a common
set of arguments.  This can ensure, for instance, that the ``AND``
pattern and the ``OR`` pattern put their operands into the same named
structure, so that a common ``gen_logic_insn`` may be able to handle
the operations common between the two.

Argument set examples::

  &reg3       ra rb rc
  &loadstore  reg base offset
  &longldst   reg base offset:int64_t


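The benefit of a shared argument set can be illustrated with a toy model in
Python. Everything here is an assumption made for the example: ``ArgReg3`` is
a Python stand-in for the C structure ``arg_reg3``, the register file is a
plain list, and the helper computes the result directly instead of emitting
code as a real translator would.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ArgReg3:
    """Stand-in for the C structure arg_reg3 rendered from '&reg3 ra rb rc'."""
    ra: int
    rb: int
    rc: int


def gen_logic_insn(regs: List[int], a: ArgReg3,
                   op: Callable[[int, int], int]) -> bool:
    """Common helper: works for any pattern that decodes into ArgReg3."""
    regs[a.rc] = op(regs[a.ra], regs[a.rb])
    return True


def trans_and(regs: List[int], a: ArgReg3) -> bool:
    return gen_logic_insn(regs, a, lambda x, y: x & y)


def trans_or(regs: List[int], a: ArgReg3) -> bool:
    return gen_logic_insn(regs, a, lambda x, y: x | y)


regs = [0] * 8
regs[1], regs[2] = 0b1100, 0b1010
trans_and(regs, ArgReg3(ra=1, rb=2, rc=3))
trans_or(regs, ArgReg3(ra=1, rb=2, rc=4))
```

Because both patterns fill in the same structure, the two translator
functions differ only in the operation they pass to the helper.
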
Formats
=======

Syntax::

  fmt_def      := '@' identifier ( fmt_elt )+
  fmt_elt      := fixedbit_elt | field_elt | field_ref | args_ref
  fixedbit_elt := [01.-]+
  field_elt    := identifier ':' 's'? number
  field_ref    := '%' identifier | identifier '=' '%' identifier
  args_ref     := '&' identifier

Defining a format is a handy way to avoid replicating groups of fields
across many instruction patterns.

A *fixedbit_elt* describes a contiguous sequence of bits that must
be 1, 0, or don't care.  The difference between '.' and '-'
is that '.' means that the bit will be covered with a field or a
final 0 or 1 from the pattern, and '-' means that the bit is really
ignored by the cpu and will not be specified.

A *field_elt* describes a simple field only given a width; the position of
the field is implied by its position with respect to other *fixedbit_elt*
and *field_elt*.

If any *fixedbit_elt* or *field_elt* appear, then all bits must be defined.
Padding with a *fixedbit_elt* of all '.' is an easy way to accomplish that.

A *field_ref* incorporates a field by reference.  This is the only way to
add a complex field to a format.  A field may be renamed in the process
via assignment to another identifier.  This is intended to allow the
same argument set to be used with disjoint named fields.

A single *args_ref* may specify an argument set to use for the format.
The set of fields in the format must be a subset of the arguments in
the argument set.  If an argument set is not specified, one will be
inferred from the set of fields.

It is recommended, but not required, that all *field_ref* and *args_ref*
appear at the end of the line, not interleaving with *fixedbit_elt* or
*field_elt*.

Format examples::

  @opr    ...... ra:5 rb:5 ... 0 ....... rc:5
  @opi    ...... ra:5 lit:8    1 ....... rc:5

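How positions follow from element order can be shown with a small Python
model that walks a format line from the most significant bit downwards.
This is a simplified sketch under stated assumptions, not the real
generator: ``parse_elements`` is a hypothetical helper, it assumes a 32-bit
instruction, and it handles only *fixedbit_elt* and *field_elt*.

```python
def parse_elements(elements: str, width: int = 32):
    """Return (fixedmask, fixedbits, fields) for a space-separated list.

    'fields' maps each field name to (lsb, length, signed).
    """
    pos = width              # next unassigned bit, counting down from the MSB
    mask = bits = 0
    fields = {}
    for tok in elements.split():
        if ":" in tok:       # field_elt, e.g. "ra:5" or "disp:s16"
            name, spec = tok.split(":")
            signed = spec.startswith("s")
            length = int(spec[1:] if signed else spec)
            pos -= length
            fields[name] = (pos, length, signed)
        else:                # fixedbit_elt, e.g. "......" or "0"
            for ch in tok:
                pos -= 1
                if ch in "01":
                    mask |= 1 << pos
                    bits |= int(ch) << pos
                elif ch not in ".-":
                    raise ValueError("bad fixedbit character: " + ch)
    if pos != 0:
        raise ValueError("elements must define exactly %d bits" % width)
    return mask, bits, fields


opr = parse_elements("...... ra:5 rb:5 ... 0 ....... rc:5")
opi = parse_elements("...... ra:5 lit:8    1 ....... rc:5")
```

In this model both formats fix only bit 12, which is 0 for ``@opr`` and 1
for ``@opi``; every '.' position is left for the pattern to fill in.
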
Patterns
========

Syntax::

  pat_def      := identifier ( pat_elt )+
  pat_elt      := fixedbit_elt | field_elt | field_ref | args_ref | fmt_ref | const_elt
  fmt_ref      := '@' identifier
  const_elt    := identifier '=' number

The *fixedbit_elt* and *field_elt* specifiers are unchanged from formats.
A pattern that does not specify a named format will have one inferred
from a referenced argument set (if present) and the set of fields.

A *const_elt* allows an argument to be set to a constant value.  This may
come in handy when fields overlap between patterns and one has to
include the values in the *fixedbit_elt* instead.

The decoder will call a translator function for each pattern matched.

Pattern examples::

  addl_r   010000 ..... ..... .... 0000000 ..... @opr
  addl_i   010000 ..... ..... .... 0000000 ..... @opi

which will, in part, invoke::

  trans_addl_r(ctx, &arg_opr, insn)

and::

  trans_addl_i(ctx, &arg_opi, insn)

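The way a pattern combines with its format can be sketched in Python. The
masks below were worked out by hand from the two example patterns and the
``@opr`` and ``@opi`` formats, assuming a 32-bit instruction; the ``decode``
function, the dictionary-based argument sets and the ``trans`` callback table
are hypothetical simplifications of the generated C decoder.

```python
def extract(value: int, pos: int, length: int) -> int:
    return (value >> pos) & ((1 << length) - 1)


# Pattern bits 31..26 and 11..5, plus bit 12 contributed by the format.
ADDL_MASK = 0xFC001FE0
ADDL_R_BITS = 0x40000000   # opcode 010000, bit 12 == 0 (from @opr)
ADDL_I_BITS = 0x40001000   # opcode 010000, bit 12 == 1 (from @opi)


def decode(ctx, insn: int, trans) -> bool:
    """Match insn against both patterns and call the translator function."""
    if (insn & ADDL_MASK) == ADDL_R_BITS:
        a = {"ra": extract(insn, 21, 5), "rb": extract(insn, 16, 5),
             "rc": extract(insn, 0, 5)}
        return trans["addl_r"](ctx, a)
    if (insn & ADDL_MASK) == ADDL_I_BITS:
        a = {"ra": extract(insn, 21, 5), "lit": extract(insn, 13, 8),
             "rc": extract(insn, 0, 5)}
        return trans["addl_i"](ctx, a)
    return False               # no pattern matched


log = []
trans = {
    "addl_r": lambda ctx, a: ctx.append(("addl_r", a)) or True,
    "addl_i": lambda ctx, a: ctx.append(("addl_i", a)) or True,
}
insn_r = 0x40000000 | 1 << 21 | 2 << 16 | 3
insn_i = 0x40001000 | 1 << 21 | 200 << 13 | 3
decode(log, insn_r, trans)
decode(log, insn_i, trans)
```

The two patterns have identical fixed bits of their own; it is the single
bit supplied by the format that keeps them from overlapping.
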
Pattern Groups
==============

Syntax::

  group            := overlap_group | no_overlap_group
  overlap_group    := '{' ( pat_def | group )+ '}'
  no_overlap_group := '[' ( pat_def | group )+ ']'

A *group* begins with a lone open-brace or open-bracket, with all
subsequent lines indented two spaces, and ending with a lone
close-brace or close-bracket.  Groups may be nested, increasing the
required indentation of the lines within the nested group to two
spaces per nesting level.

Patterns within overlap groups are allowed to overlap.  Conflicts are
resolved by selecting the patterns in order.  If all of the fixedbits
for a pattern match, its translate function will be called.  If the
translate function returns false, then subsequent patterns within the
group will be matched.

Patterns within no-overlap groups are not allowed to overlap, just
the same as ungrouped patterns.  Thus no-overlap groups are intended
to be nested inside overlap groups.

The following example from PA-RISC shows specialization of the *or*
instruction::

  {
    {
      nop   000010 ----- ----- 0000 001001 0 00000
      copy  000010 00000 r1:5  0000 001001 0 rt:5
    }
    or      000010 rt2:5 r1:5  cf:4 001001 0 rt:5
  }

When the *cf* field is zero, the instruction has no side effects,
and may be specialized.  When the *rt* field is zero, the output
is discarded and so the instruction has no effect.  When the *rt2*
field is zero, the operation is ``reg[r1] | 0`` and so encodes
the canonical register copy operation.

The output from the generator might look like::

  switch (insn & 0xfc000fe0) {
  case 0x08000240:
    /* 000010.. ........ ....0010 010..... */
    if ((insn & 0x0000f000) == 0x00000000) {
        /* 000010.. ........ 00000010 010..... */
        if ((insn & 0x0000001f) == 0x00000000) {
            /* 000010.. ........ 00000010 01000000 */
            extract_decode_Fmt_0(&u.f_decode0, insn);
            if (trans_nop(ctx, &u.f_decode0)) return true;
        }
        if ((insn & 0x03e00000) == 0x00000000) {
            /* 00001000 000..... 00000010 010..... */
            extract_decode_Fmt_1(&u.f_decode1, insn);
            if (trans_copy(ctx, &u.f_decode1)) return true;
        }
    }
    extract_decode_Fmt_2(&u.f_decode2, insn);
    if (trans_or(ctx, &u.f_decode2)) return true;
    return false;
  }
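
The in-order, fall-through behaviour of the overlap group can be modelled in
Python. This is a sketch that mirrors the masks of the PA-RISC example; the
``decode`` and ``encode`` helpers, the ``trans`` callback table and the
returned pattern names are hypothetical and exist only for illustration.

```python
def extract(value: int, pos: int, length: int) -> int:
    return (value >> pos) & ((1 << length) - 1)


def decode(insn: int, trans):
    """Return the name of the pattern whose translate function accepted."""
    if (insn & 0xFC000FE0) != 0x08000240:
        return None
    rt2, r1 = extract(insn, 21, 5), extract(insn, 16, 5)
    cf, rt = extract(insn, 12, 4), extract(insn, 0, 5)
    if cf == 0:
        # Inner group: more specific patterns are tried first, in order.
        if rt == 0 and trans["nop"]({}):
            return "nop"
        if rt2 == 0 and trans["copy"]({"r1": r1, "rt": rt}):
            return "copy"
    # Reached when nothing above matched or a translate function declined.
    if trans["or"]({"rt2": rt2, "r1": r1, "cf": cf, "rt": rt}):
        return "or"
    return None


def encode(rt2: int, r1: int, cf: int, rt: int) -> int:
    return 0x08000240 | rt2 << 21 | r1 << 16 | cf << 12 | rt


accept = {name: (lambda a: True) for name in ("nop", "copy", "or")}
picky = dict(accept, copy=lambda a: False)   # copy declines the instruction

results = [
    decode(encode(5, 6, 0, 0), accept),   # rt == 0
    decode(encode(0, 6, 0, 7), accept),   # rt2 == 0
    decode(encode(0, 6, 1, 7), accept),   # cf != 0, no specialization
    decode(encode(0, 6, 0, 7), picky),    # copy returns False
]
```

In the last case the instruction still reaches the generic *or* handler
because the more specific translate function returned false.
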