========================
Decodetree Specification
========================

A *decodetree* is built from instruction *patterns*.  A pattern may
represent a single architectural instruction or a group of such
instructions, depending on what is convenient for further processing.

Each pattern has both *fixedbits* and *fixedmask*, the combination of which
describes the condition under which the pattern is matched::

  (insn & fixedmask) == fixedbits

Each pattern may have *fields*, which are extracted from the insn and
passed along to the translator.  Examples of such are registers,
immediates, and sub-opcodes.

In support of patterns, one may declare *fields*, *argument sets*, and
*formats*, each of which may be re-used to simplify further definitions.

Fields
======

Syntax::

  field_def     := '%' identifier ( unnamed_field )+ ( !function=identifier )?
  unnamed_field := number ':' ( 's' )? number

For *unnamed_field*, the first number is the least-significant bit position
of the field and the second number is the length of the field.  If the 's' is
present, the field is considered signed.  If multiple ``unnamed_fields`` are
present, they are concatenated.  In this way one can define disjoint fields.

If ``!function`` is specified, the concatenated result is passed through the
named function, taking and returning an integral value.

FIXME: the fields of the structure into which this result will be stored
are restricted to ``int``, which means that we cannot expand 64-bit items.
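
As a rough illustration of the extraction and concatenation rules, the
following C sketch builds two disjoint fields by hand.  It assumes 32-bit
instructions and assumes that ``extract`` and ``sextract`` perform plain
zero-extending and sign-extending bitfield extraction.  The names ``concat``,
``field_imm9`` and ``field_disp12`` are hypothetical; they model the
definitions ``%imm9 16:6 10:3`` and ``%disp12 0:s1 1:1 2:10``.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extended bitfield: 'length' bits of 'insn' starting at bit 'start'. */
static inline uint32_t extract(uint32_t insn, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (insn >> start) & (UINT32_MAX >> (32 - length));
}

/* Sign-extended bitfield: the top bit of the field is treated as the sign. */
static inline int32_t sextract(uint32_t insn, int start, int length)
{
    uint32_t v = extract(insn, start, length);
    uint32_t sign = (uint32_t)1 << (length - 1);

    /* Flip the sign bit, then subtract its weight; done in 64 bits so that
     * no step overflows or depends on implementation-defined behaviour. */
    return (int32_t)((int64_t)(v ^ sign) - (int64_t)sign);
}

/* Place 'hi' above the 'len' low bits 'lo'.  Multiplication is used in
 * place of '<<' because left-shifting a negative value is undefined in C. */
static inline int32_t concat(int32_t hi, uint32_t lo, int len)
{
    return hi * (int32_t)((uint32_t)1 << len) + (int32_t)lo;
}

/* Hypothetical model of: %imm9 16:6 10:3 */
static int32_t field_imm9(uint32_t insn)
{
    return concat((int32_t)extract(insn, 16, 6), extract(insn, 10, 3), 3);
}

/* Hypothetical model of: %disp12 0:s1 1:1 2:10 */
static int32_t field_disp12(uint32_t insn)
{
    int32_t v = sextract(insn, 0, 1);       /* most-significant part, signed */

    v = concat(v, extract(insn, 1, 1), 1);
    v = concat(v, extract(insn, 2, 10), 10);
    return v;
}
```

Only the first (most-significant) component of ``%disp12`` is signed, so a set
bit 0 in the instruction makes the whole 12-bit result negative.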

Field examples:

+---------------------------+---------------------------------------------+
| Input                     | Generated code                              |
+===========================+=============================================+
| %disp   0:s16             | sextract(i, 0, 16)                          |
+---------------------------+---------------------------------------------+
| %imm9   16:6 10:3         | extract(i, 16, 6) << 3 | extract(i, 10, 3)  |
+---------------------------+---------------------------------------------+
| %disp12 0:s1 1:1 2:10     | sextract(i, 0, 1) << 11 |                   |
|                           |    extract(i, 1, 1) << 10 |                 |
|                           |    extract(i, 2, 10)                        |
+---------------------------+---------------------------------------------+
| %shimm8 5:s8 13:1         | expand_shimm8(sextract(i, 5, 8) << 1 |      |
|   !function=expand_shimm8 |               extract(i, 13, 1))            |
+---------------------------+---------------------------------------------+

Argument Sets
=============

Syntax::

  args_def    := '&' identifier ( args_elt )+ ( !extern )?
  args_elt    := identifier

Each *args_elt* defines an argument within the argument set.
Each argument set will be rendered as a C structure "arg_$name"
with each of the fields being one of the member arguments.

If ``!extern`` is specified, the backing structure is assumed
to have been already declared, typically via a second decoder.

Argument sets are useful when one wants to define helper functions
for the translator functions that can perform operations on a common
set of arguments.  This can ensure, for instance, that the ``AND``
pattern and the ``OR`` pattern put their operands into the same named
structure, so that a common ``gen_logic_insn`` may be able to handle
the operations common between the two.

Argument set examples::

  &reg3       ra rb rc
  &loadstore  reg base offset


Formats
=======

Syntax::

  fmt_def      := '@' identifier ( fmt_elt )+
  fmt_elt      := fixedbit_elt | field_elt | field_ref | args_ref
  fixedbit_elt := [01.-]+
  field_elt    := identifier ':' 's'? number
  field_ref    := '%' identifier | identifier '=' '%' identifier
  args_ref     := '&' identifier

Defining a format is a handy way to avoid replicating groups of fields
across many instruction patterns.

A *fixedbit_elt* describes a contiguous sequence of bits that must
be 1, 0, or don't care.  The difference between '.' and '-'
is that '.' means that the bit will be covered with a field or a
final 0 or 1 from the pattern, and '-' means that the bit is really
ignored by the cpu and will not be specified.

A *field_elt* describes a simple field only given a width; the position of
the field is implied by its position with respect to other *fixedbit_elt*
and *field_elt*.

If any *fixedbit_elt* or *field_elt* appear, then all bits must be defined.
Padding with a *fixedbit_elt* of all '.' is an easy way to accomplish that.

A *field_ref* incorporates a field by reference.  This is the only way to
add a complex field to a format.  A field may be renamed in the process
via assignment to another identifier.  This is intended to allow the
same argument set to be used with disjoint named fields.

A single *args_ref* may specify an argument set to use for the format.
The set of fields in the format must be a subset of the arguments in
the argument set.  If an argument set is not specified, one will be
inferred from the set of fields.

It is recommended, but not required, that all *field_ref* and *args_ref*
appear at the end of the line, not interleaving with *fixedbit_elt* or
*field_elt*.

Format examples::

  @opr    ...... ra:5 rb:5 ... 0 ....... rc:5
  @opi    ...... ra:5 lit:8    1 ....... rc:5

Patterns
========

Syntax::

  pat_def      := identifier ( pat_elt )+
  pat_elt      := fixedbit_elt | field_elt | field_ref | args_ref | fmt_ref | const_elt
  fmt_ref      := '@' identifier
  const_elt    := identifier '=' number

The *fixedbit_elt* and *field_elt* specifiers are unchanged from formats.
A pattern that does not specify a named format will have one inferred
from a referenced argument set (if present) and the set of fields.

A *const_elt* allows an argument to be set to a constant value.  This may
come in handy when fields overlap between patterns and one has to
include the values in the *fixedbit_elt* instead.

The decoder will call a translator function for each pattern matched.

Pattern examples::

  addl_r   010000 ..... ..... .... 0000000 ..... @opr
  addl_i   010000 ..... ..... .... 0000000 ..... @opi

which will, in part, invoke::

  trans_addl_r(ctx, &arg_opr, insn)

and::

  trans_addl_i(ctx, &arg_opi, insn)

Pattern Groups
==============

Syntax::

  group := '{' ( pat_def | group )+ '}'

A *group* begins with a lone open-brace, with all subsequent lines
indented two spaces, and ending with a lone close-brace.  Groups
may be nested, increasing the required indentation of the lines
within the nested group to two spaces per nesting level.

Unlike ungrouped patterns, grouped patterns are allowed to overlap.
Conflicts are resolved by selecting the patterns in order.  If all
of the fixedbits for a pattern match, its translate function will
be called.  If the translate function returns false, then subsequent
patterns within the group will be matched.
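
The ordering rule can be sketched in C as a simple loop.  This only
illustrates the semantics and is not the code the generator emits; the
``Pattern`` structure, ``decode_group``, ``example_group`` and the two
``trans_*`` callbacks are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry describing one pattern of a group. */
typedef struct {
    uint32_t fixedmask;
    uint32_t fixedbits;
    bool (*trans)(uint32_t insn);   /* returns false to decline the insn */
} Pattern;

/* Try the patterns in order; the first translator to return true wins. */
static bool decode_group(const Pattern *pats, size_t n, uint32_t insn)
{
    assert(pats != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if ((insn & pats[i].fixedmask) == pats[i].fixedbits
            && pats[i].trans(insn)) {
            return true;
        }
    }
    return false;   /* no pattern in the group accepted the insn */
}

static int last_handler;    /* records which translator accepted the insn */

/* A specialised translator that declines instructions with bit 0 set. */
static bool trans_special(uint32_t insn)
{
    if (insn & 1) {
        return false;       /* decline; later patterns get a chance */
    }
    last_handler = 1;
    return true;
}

/* The general translator accepts everything that reaches it. */
static bool trans_general(uint32_t insn)
{
    (void)insn;
    last_handler = 2;
    return true;
}

/* Overlapping patterns: everything the first matches, the second matches. */
static const Pattern example_group[] = {
    { 0xff0000f0u, 0x12000000u, trans_special },
    { 0xff000000u, 0x12000000u, trans_general },
};
```

With this table, ``0x12000000`` is handled by ``trans_special``, while
``0x12000001`` matches the first pattern's fixedbits but is declined by its
translator and so falls through to ``trans_general``.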

The following example from PA-RISC shows specialization of the *or*
instruction::

  {
    {
      nop   000010 ----- ----- 0000 001001 0 00000
      copy  000010 00000 r1:5  0000 001001 0 rt:5
    }
    or      000010 rt2:5 r1:5  cf:4 001001 0 rt:5
  }

When the *cf* field is zero, the instruction has no side effects,
and may be specialized.  When the *rt* field is zero, the output
is discarded and so the instruction has no effect.  When the *rt2*
field is zero, the operation is ``reg[r1] | 0`` and so encodes
the canonical register copy operation.

The output from the generator might look like::

  switch (insn & 0xfc000fe0) {
  case 0x08000240:
    /* 000010.. ........ ....0010 010..... */
    if ((insn & 0x0000f000) == 0x00000000) {
        /* 000010.. ........ 00000010 010..... */
        if ((insn & 0x0000001f) == 0x00000000) {
            /* 000010.. ........ 00000010 01000000 */
            extract_decode_Fmt_0(&u.f_decode0, insn);
            if (trans_nop(ctx, &u.f_decode0)) return true;
        }
        if ((insn & 0x03e00000) == 0x00000000) {
            /* 00001000 000..... 00000010 010..... */
            extract_decode_Fmt_1(&u.f_decode1, insn);
            if (trans_copy(ctx, &u.f_decode1)) return true;
        }
    }
    extract_decode_Fmt_2(&u.f_decode2, insn);
    if (trans_or(ctx, &u.f_decode2)) return true;
    return false;
  }
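
To see where constants such as ``0xfc000fe0`` and ``0x08000240`` come from,
the sketch below derives a mask and a value from a pattern's bit string.  It
is a simplification that assumes 32-bit instructions and a string in which
every field has already been written out as a run of '.' characters.
``FixedBits`` and ``parse_fixedbits`` are hypothetical names and are not part
of the generator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t fixedmask;   /* 1 for every bit written as '0' or '1' */
    uint32_t fixedbits;   /* required value of those bits */
} FixedBits;

/*
 * Parse a 32-bit pattern written most-significant bit first using
 * '0', '1', '.' and '-'; spaces are ignored.  Returns false on an
 * unexpected character or if the pattern is not exactly 32 bits wide.
 */
static bool parse_fixedbits(const char *s, FixedBits *out)
{
    uint32_t mask = 0, bits = 0;
    int width = 0;

    assert(s != NULL && out != NULL);
    for (; *s != '\0'; s++) {
        if (*s == ' ') {
            continue;
        }
        if (*s != '0' && *s != '1' && *s != '.' && *s != '-') {
            return false;
        }
        if (++width > 32) {
            return false;
        }
        /* '.' and '-' both leave the mask bit clear. */
        mask = (mask << 1) | (uint32_t)(*s == '0' || *s == '1');
        bits = (bits << 1) | (uint32_t)(*s == '1');
    }
    if (width != 32) {
        return false;
    }
    out->fixedmask = mask;
    out->fixedbits = bits;
    return true;
}
```

Feeding it the *or* pattern with its fields replaced by dots,
``"000010 ..... ..... .... 001001 0 ....."``, gives the mask and case value of
the outer ``switch``.  The *nop* line additionally fixes the *cf* and *rt*
bits, which correspond to the two nested ``if`` tests.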