========================
Decodetree Specification
========================

A *decodetree* is built from instruction *patterns*.  A pattern may
represent a single architectural instruction or a group of such
instructions, depending on what is convenient for further processing.

Each pattern has both *fixedbits* and *fixedmask*, the combination of which
describes the condition under which the pattern is matched::

  (insn & fixedmask) == fixedbits

Each pattern may have *fields*, which are extracted from the insn and
passed along to the translator.  Examples of such are registers,
immediates, and sub-opcodes.

In support of patterns, one may declare *fields*, *argument sets*, and
*formats*, each of which may be re-used to simplify further definitions.

Fields
======

Syntax::

  field_def     := '%' identifier ( unnamed_field )+ ( !function=identifier )?
  unnamed_field := number ':' ( 's' )? number

For *unnamed_field*, the first number is the least-significant bit position
of the field and the second number is the length of the field.  If the 's' is
present, the field is considered signed.  If multiple *unnamed_field* elements
are present, they are concatenated, with the first one listed occupying the
most significant bits of the result.  In this way one can define disjoint
fields.

If ``!function`` is specified, the concatenated result is passed through the
named function, taking and returning an integral value.
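
The extraction and concatenation rules above can be modelled in a few lines
of Python.  This is only an illustrative sketch, not the generator's
implementation: ``extract`` and ``sextract`` mirror the helper names that
appear in the generated-code column of the field examples, while
``decode_field`` and its tuple layout are hypothetical names introduced here.

```python
def extract(insn, pos, length):
    """Unsigned bitfield: `length` bits starting at bit `pos`."""
    return (insn >> pos) & ((1 << length) - 1)


def sextract(insn, pos, length):
    """Signed bitfield: as extract(), then sign-extend from the top bit."""
    value = extract(insn, pos, length)
    sign_bit = 1 << (length - 1)
    return (value ^ sign_bit) - sign_bit


def decode_field(insn, subfields, function=None):
    """Concatenate sub-fields, most significant first.

    `subfields` is a list of (pos, length, signed) tuples, a hypothetical
    stand-in for the parsed `unnamed_field` elements.  `function` models
    the optional !function hook.
    """
    result = 0
    for pos, length, signed in subfields:
        part = sextract(insn, pos, length) if signed else extract(insn, pos, length)
        # Shift what has been gathered so far and append the new part.
        result = (result << length) | part
    return function(result) if function else result
```

For instance, ``decode_field(insn, [(16, 6, False), (10, 3, False)])`` models
``%imm9 16:6 10:3``.  As in the generated expressions, only a signed *first*
sub-field yields a sensibly sign-extended result.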

FIXME: the fields of the structure into which this result will be stored
are restricted to ``int``, which means that we cannot expand 64-bit items.

Field examples:

+---------------------------+---------------------------------------------+
| Input                     | Generated code                              |
+===========================+=============================================+
| %disp   0:s16             | sextract(i, 0, 16)                          |
+---------------------------+---------------------------------------------+
| %imm9   16:6 10:3         | extract(i, 16, 6) << 3 | extract(i, 10, 3)  |
+---------------------------+---------------------------------------------+
| %disp12 0:s1 1:1 2:10     | sextract(i, 0, 1) << 11 |                   |
|                           |    extract(i, 1, 1) << 10 |                 |
|                           |    extract(i, 2, 10)                        |
+---------------------------+---------------------------------------------+
| %shimm8 5:s8 13:1         | expand_shimm8(sextract(i, 5, 8) << 1 |      |
|   !function=expand_shimm8 |               extract(i, 13, 1))            |
+---------------------------+---------------------------------------------+

Argument Sets
=============

Syntax::

  args_def    := '&' identifier ( args_elt )+ ( !extern )?
  args_elt    := identifier

Each *args_elt* defines an argument within the argument set.
Each argument set will be rendered as a C structure ``arg_$name``
with each of the fields being one of the member arguments.

If ``!extern`` is specified, the backing structure is assumed
to have been already declared, typically via a second decoder.

Argument sets are useful when one wants to define helper functions
for the translator functions that can perform operations on a common
set of arguments.  This can ensure, for instance, that the ``AND``
pattern and the ``OR`` pattern put their operands into the same named
structure, so that a common ``gen_logic_insn`` may be able to handle
the operations common between the two.

Argument set examples::

  &reg3       ra rb rc
  &loadstore  reg base offset
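
The benefit of a shared argument set can be illustrated outside of C.  In
the sketch below, the ``ArgReg3`` dataclass stands in for the structure that
would be generated for ``&reg3``.  The register-file list and the bodies of
``gen_logic_insn``, ``trans_and`` and ``trans_or`` are assumptions made purely
for illustration; a real translator would emit code instead of computing a
value.

```python
import operator
from dataclasses import dataclass


@dataclass
class ArgReg3:
    """Stand-in for the C structure arg_reg3 rendered from '&reg3 ra rb rc'."""
    ra: int
    rb: int
    rc: int


def gen_logic_insn(regs, a, op):
    """Common helper: both patterns hand over the same argument structure."""
    regs[a.rc] = op(regs[a.ra], regs[a.rb])
    return True


def trans_and(regs, a):
    return gen_logic_insn(regs, a, operator.and_)


def trans_or(regs, a):
    return gen_logic_insn(regs, a, operator.or_)
```

Because both translator functions receive an ``ArgReg3``, the helper never
needs to know which pattern produced the operands.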


Formats
=======

Syntax::

  fmt_def      := '@' identifier ( fmt_elt )+
  fmt_elt      := fixedbit_elt | field_elt | field_ref | args_ref
  fixedbit_elt := [01.-]+
  field_elt    := identifier ':' 's'? number
  field_ref    := '%' identifier | identifier '=' '%' identifier
  args_ref     := '&' identifier

Defining a format is a handy way to avoid replicating groups of fields
across many instruction patterns.

A *fixedbit_elt* describes a contiguous sequence of bits that must
be 1, 0, or don't care.  The difference between '.' and '-'
is that '.' means that the bit will be covered with a field or a
final 0 or 1 from the pattern, and '-' means that the bit is really
ignored by the cpu and will not be specified.

A *field_elt* describes a simple field only given a width; the position of
the field is implied by its position with respect to other *fixedbit_elt*
and *field_elt*.

If any *fixedbit_elt* or *field_elt* appear, then all bits must be defined.
Padding with a *fixedbit_elt* of all '.' is an easy way to accomplish that.

A *field_ref* incorporates a field by reference.  This is the only way to
add a complex field to a format.  A field may be renamed in the process
via assignment to another identifier.  This is intended to allow the
same argument set to be used with disjoint named fields.

A single *args_ref* may specify an argument set to use for the format.
The set of fields in the format must be a subset of the arguments in
the argument set.  If an argument set is not specified, one will be
inferred from the set of fields.

It is recommended, but not required, that all *field_ref* and *args_ref*
appear at the end of the line, not interleaving with *fixedbit_elt* or
*field_elt*.

Format examples::

  @opr    ...... ra:5 rb:5 ... 0 ....... rc:5
  @opi    ...... ra:5 lit:8    1 ....... rc:5
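
How a *field_elt* obtains its position purely from its place on the line can
be sketched as follows.  ``layout_format`` is a hypothetical helper written
for this illustration; it handles only *fixedbit_elt* and *field_elt* tokens
(no *field_ref* or *args_ref*) and assumes a 32-bit instruction word.

```python
def layout_format(spec, width=32):
    """Map each field_elt name to (lsb position, length, signed).

    Tokens are consumed left to right, starting at the most significant
    bit, so a field's position follows from everything written before it.
    """
    fields = {}
    pos = width
    for tok in spec.split():
        if ":" in tok:
            # field_elt: identifier ':' 's'? number
            name, size = tok.split(":")
            signed = size.startswith("s")
            length = int(size[1:] if signed else size)
            pos -= length
            fields[name] = (pos, length, signed)
        else:
            # fixedbit_elt: only '0', '1', '.' and '-' are allowed
            if set(tok) - set("01.-"):
                raise ValueError(f"unexpected token {tok!r}")
            pos -= len(tok)
    if pos != 0:
        raise ValueError("all bits must be defined")
    return fields
```

Applied to the ``@opr`` line above, this places ``ra`` at bit 21, ``rb`` at
bit 16 and ``rc`` at bit 0, each five bits wide.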

Patterns
========

Syntax::

  pat_def      := identifier ( pat_elt )+
  pat_elt      := fixedbit_elt | field_elt | field_ref | args_ref | fmt_ref | const_elt
  fmt_ref      := '@' identifier
  const_elt    := identifier '=' number

The *fixedbit_elt* and *field_elt* specifiers are unchanged from formats.
A pattern that does not specify a named format will have one inferred
from a referenced argument set (if present) and the set of fields.

A *const_elt* allows an argument to be set to a constant value.  This may
come in handy when fields overlap between patterns and one has to
include the values in the *fixedbit_elt* instead.

The decoder will call a translator function for each pattern matched.

Pattern examples::

  addl_r   010000 ..... ..... .... 0000000 ..... @opr
  addl_i   010000 ..... ..... .... 0000000 ..... @opi

which will, in part, invoke::

  trans_addl_r(ctx, &arg_opr, insn)

and::

  trans_addl_i(ctx, &arg_opi, insn)
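
To make the matching condition ``(insn & fixedmask) == fixedbits`` concrete,
the following sketch derives both values from the fixed-bit text of the
``addl_r`` and ``addl_i`` patterns and the formats they reference.  The helper
names ``parse_fixedbits``, ``combine`` and ``matches`` are hypothetical, and
the way pattern and format bits are merged here is an assumption about the
intended semantics rather than a description of the generator's internals.

```python
def parse_fixedbits(spec, width=32):
    """Return (fixedbits, fixedmask) for a string of '0', '1', '.' and '-'."""
    chars = spec.replace(" ", "")
    if len(chars) != width:
        raise ValueError(f"expected {width} bits, got {len(chars)}")
    fixedbits = fixedmask = 0
    for c in chars:  # the leftmost character is the most significant bit
        fixedbits <<= 1
        fixedmask <<= 1
        if c in "01":
            fixedmask |= 1
            fixedbits |= int(c)
        elif c not in ".-":
            raise ValueError(f"unexpected character {c!r}")
    return fixedbits, fixedmask


def combine(pattern, fmt):
    """Merge the fixed bits of a pattern with those of its format."""
    pbits, pmask = pattern
    fbits, fmask = fmt
    if (pbits ^ fbits) & pmask & fmask:
        raise ValueError("pattern and format disagree on a fixed bit")
    return pbits | fbits, pmask | fmask


def matches(insn, fixedbits, fixedmask):
    return (insn & fixedmask) == fixedbits


# Field elements such as ra:5 are written as dots here: they fix no bits.
ADDL_PAT = parse_fixedbits("010000 ..... ..... .... 0000000 .....")
OPR_FMT = parse_fixedbits("...... ..... ..... ... 0 ....... .....")
OPI_FMT = parse_fixedbits("...... ..... ........ 1 ....... .....")
ADDL_R = combine(ADDL_PAT, OPR_FMT)
ADDL_I = combine(ADDL_PAT, OPI_FMT)
```

Under these assumptions the two patterns share their opcode bits and differ
only in bit 12, which each format fixes to 0 or 1 respectively.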

Pattern Groups
==============

Syntax::

  group    := '{' ( pat_def | group )+ '}'

A *group* begins with a lone open-brace, with all subsequent lines
indented two spaces, and ending with a lone close-brace.  Groups
may be nested, increasing the required indentation of the lines
within the nested group to two spaces per nesting level.

Unlike ungrouped patterns, grouped patterns are allowed to overlap.
Conflicts are resolved by selecting the patterns in order.  If all
of the fixedbits for a pattern match, its translate function will
be called.  If the translate function returns false, then subsequent
patterns within the group will be matched.

The following example from PA-RISC shows specialization of the *or*
instruction::

  {
    {
      nop   000010 ----- ----- 0000 001001 0 00000
      copy  000010 00000 r1:5  0000 001001 0 rt:5
    }
    or      000010 rt2:5 r1:5  cf:4 001001 0 rt:5
  }

When the *cf* field is zero, the instruction has no side effects,
and may be specialized.  When the *rt* field is zero, the output
is discarded and so the instruction has no effect.  When the *rt2*
field is zero, the operation is ``reg[r1] | 0`` and so encodes
the canonical register copy operation.

The output from the generator might look like::

  switch (insn & 0xfc000fe0) {
  case 0x08000240:
    /* 000010.. ........ ....0010 010..... */
    if ((insn & 0x0000f000) == 0x00000000) {
        /* 000010.. ........ 00000010 010..... */
        if ((insn & 0x0000001f) == 0x00000000) {
            /* 000010.. ........ 00000010 01000000 */
            extract_decode_Fmt_0(&u.f_decode0, insn);
            if (trans_nop(ctx, &u.f_decode0)) return true;
        }
        if ((insn & 0x03e00000) == 0x00000000) {
            /* 00001000 000..... 00000010 010..... */
            extract_decode_Fmt_1(&u.f_decode1, insn);
            if (trans_copy(ctx, &u.f_decode1)) return true;
        }
    }
    extract_decode_Fmt_2(&u.f_decode2, insn);
    if (trans_or(ctx, &u.f_decode2)) return true;
    return false;
  }
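
The in-order matching with fallback shown in the generated code can also be
modelled compactly.  The sketch below flattens the nested PA-RISC group into
an ordered list, which is a simplification, and uses masks derived from the
generated code above.  ``decode_group``, ``make_translator`` and
``pa_risc_or_group`` are hypothetical names, and the translators merely record
which pattern was chosen.

```python
def decode_group(insn, patterns, ctx):
    """Try overlapping patterns in order.

    A pattern is taken only if its fixed bits match AND its translator
    returns True; otherwise the next pattern in the group is tried.
    """
    for fixedmask, fixedbits, trans in patterns:
        if (insn & fixedmask) == fixedbits and trans(ctx, insn):
            return True
    return False


def make_translator(name, accept=True):
    """Build a stand-in translator that logs its name into ctx (a list)."""
    def trans(ctx, insn):
        ctx.append(name)
        return accept
    return trans


def pa_risc_or_group(copy_accepts=True):
    """nop, copy and or from the example, most specific pattern first."""
    return [
        (0xfc00ffff, 0x08000240, make_translator("nop")),   # cf == 0, rt == 0
        (0xffe0ffe0, 0x08000240, make_translator("copy", copy_accepts)),  # cf == 0, rt2 == 0
        (0xfc000fe0, 0x08000240, make_translator("or")),    # general case
    ]
```

With ``copy_accepts=False`` the stand-in ``copy`` translator declines, and the
same instruction word falls through to ``or``, which is the behaviour
described for translate functions that return false.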