========================
Decodetree Specification
========================

A *decodetree* is built from instruction *patterns*.  A pattern may
represent a single architectural instruction or a group of such
instructions, depending on what is convenient for further processing.

Each pattern has both *fixedbits* and *fixedmask*, the combination of which
describes the condition under which the pattern is matched::

  (insn & fixedmask) == fixedbits

Each pattern may have *fields*, which are extracted from the insn and
passed along to the translator.  Examples of such are registers,
immediates, and sub-opcodes.

In support of patterns, one may declare *fields*, *argument sets*, and
*formats*, each of which may be re-used to simplify further definitions.

Fields
======

Syntax::

  field_def     := '%' identifier ( unnamed_field )* ( !function=identifier )?
  unnamed_field := number ':' ( 's' )? number

For *unnamed_field*, the first number is the least-significant bit position
of the field and the second number is the length of the field.  If the 's' is
present, the field is considered signed.  If multiple ``unnamed_fields`` are
present, they are concatenated.  In this way one can define disjoint fields.

If ``!function`` is specified, the concatenated result is passed through the
named function, taking and returning an integral value.

One may use ``!function`` with zero ``unnamed_fields``.  This case is called
a *parameter*, and the named function is only passed the ``DisasContext``
and returns an integral value extracted from there.

A field with no ``unnamed_fields`` and no ``!function`` is in error.

FIXME: the fields of the structure into which this result will be stored
are restricted to ``int``, which means that we cannot expand 64-bit items.

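As a rough illustration of the extraction and concatenation rules above,
the following C sketch hand-writes the extraction for two example field
definitions, ``%disp 0:s16`` and ``%imm9 16:6 10:3``.  The helper names
``extract_bits`` and ``sextract_bits`` and the ``field_*`` functions are
assumptions made for this sketch; they are stand-ins, not the code that
the generator actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an unsigned extraction helper: returns
 * 'len' bits of 'insn' starting at least-significant bit 'pos'. */
static uint32_t extract_bits(uint32_t insn, int pos, int len)
{
    assert(pos >= 0 && len > 0 && len <= 32 - pos);
    return (insn >> pos) & (~0u >> (32 - len));
}

/* Hypothetical stand-in for a signed extraction helper: as above,
 * but the top bit of the field is treated as a sign bit. */
static int32_t sextract_bits(uint32_t insn, int pos, int len)
{
    uint32_t v = extract_bits(insn, pos, len);
    uint32_t sign = 1u << (len - 1);

    /* Flip and subtract the sign bit to sign-extend the field. */
    return (int32_t)((v ^ sign) - sign);
}

/* %disp 0:s16 -- a single signed field. */
static int field_disp(uint32_t insn)
{
    return sextract_bits(insn, 0, 16);
}

/* %imm9 16:6 10:3 -- two disjoint pieces; the first one listed
 * ends up in the most-significant bits of the result. */
static int field_imm9(uint32_t insn)
{
    return (int)(extract_bits(insn, 16, 6) << 3 | extract_bits(insn, 10, 3));
}
```

In this sketch the result type is ``int``, matching the restriction noted
in the FIXME above.
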
Field examples:

+---------------------------+---------------------------------------------+
| Input                     | Generated code                              |
+===========================+=============================================+
| %disp   0:s16             | sextract(i, 0, 16)                          |
+---------------------------+---------------------------------------------+
| %imm9   16:6 10:3         | extract(i, 16, 6) << 3 | extract(i, 10, 3)  |
+---------------------------+---------------------------------------------+
| %disp12 0:s1 1:1 2:10     | sextract(i, 0, 1) << 11 |                   |
|                           |    extract(i, 1, 1) << 10 |                 |
|                           |    extract(i, 2, 10)                        |
+---------------------------+---------------------------------------------+
| %shimm8 5:s8 13:1         | expand_shimm8(sextract(i, 5, 8) << 1 |      |
|   !function=expand_shimm8 |               extract(i, 13, 1))            |
+---------------------------+---------------------------------------------+

Argument Sets
=============

Syntax::

  args_def := '&' identifier ( args_elt )+ ( !extern )?
  args_elt := identifier

Each *args_elt* defines an argument within the argument set.
Each argument set will be rendered as a C structure "arg_$name"
with each of the fields being one of the member arguments.

If ``!extern`` is specified, the backing structure is assumed
to have been already declared, typically via a second decoder.

Argument sets are useful when one wants to define helper functions
for the translator functions that can perform operations on a common
set of arguments.  This can ensure, for instance, that the ``AND``
pattern and the ``OR`` pattern put their operands into the same named
structure, so that a common ``gen_logic_insn`` may be able to handle
the operations common between the two.

Argument set examples::

  &reg3       ra rb rc
  &loadstore  reg base offset


Formats
=======

Syntax::

  fmt_def      := '@' identifier ( fmt_elt )+
  fmt_elt      := fixedbit_elt | field_elt | field_ref | args_ref
  fixedbit_elt := [01.-]+
  field_elt    := identifier ':' 's'? number
  field_ref    := '%' identifier | identifier '=' '%' identifier
  args_ref     := '&' identifier

Defining a format is a handy way to avoid replicating groups of fields
across many instruction patterns.

A *fixedbit_elt* describes a contiguous sequence of bits that must
be 1, 0, or don't care.  The difference between '.' and '-'
is that '.' means that the bit will be covered with a field or a
final 0 or 1 from the pattern, and '-' means that the bit is really
ignored by the cpu and will not be specified.

A *field_elt* describes a simple field only given a width; the position of
the field is implied by its position with respect to other *fixedbit_elt*
and *field_elt*.

If any *fixedbit_elt* or *field_elt* appear, then all bits must be defined.
Padding with a *fixedbit_elt* of all '.' is an easy way to accomplish that.

A *field_ref* incorporates a field by reference.  This is the only way to
add a complex field to a format.  A field may be renamed in the process
via assignment to another identifier.  This is intended to allow the
same argument set to be used with disjoint named fields.

A single *args_ref* may specify an argument set to use for the format.
The set of fields in the format must be a subset of the arguments in
the argument set.  If an argument set is not specified, one will be
inferred from the set of fields.

It is recommended, but not required, that all *field_ref* and *args_ref*
appear at the end of the line, not interleaving with *fixedbit_elt* or
*field_elt*.

Format examples::

  @opr    ...... ra:5 rb:5 ... 0 ....... rc:5
  @opi    ...... ra:5 lit:8    1 ....... rc:5

Patterns
========

Syntax::

  pat_def   := identifier ( pat_elt )+
  pat_elt   := fixedbit_elt | field_elt | field_ref | args_ref | fmt_ref | const_elt
  fmt_ref   := '@' identifier
  const_elt := identifier '=' number

The *fixedbit_elt* and *field_elt* specifiers are unchanged from formats.
A pattern that does not specify a named format will have one inferred
from a referenced argument set (if present) and the set of fields.

A *const_elt* allows an argument to be set to a constant value.  This may
come in handy when fields overlap between patterns and one has to
include the values in the *fixedbit_elt* instead.

The decoder will call a translator function for each pattern matched.

Pattern examples::

  addl_r   010000 ..... ..... .... 0000000 ..... @opr
  addl_i   010000 ..... ..... .... 0000000 ..... @opi

which will, in part, invoke::

  trans_addl_r(ctx, &arg_opr, insn)

and::

  trans_addl_i(ctx, &arg_opi, insn)

Pattern Groups
==============

Syntax::

  group            := overlap_group | no_overlap_group
  overlap_group    := '{' ( pat_def | group )+ '}'
  no_overlap_group := '[' ( pat_def | group )+ ']'

A *group* begins with a lone open-brace or open-bracket, with all
subsequent lines indented two spaces, and ending with a lone
close-brace or close-bracket.  Groups may be nested, increasing the
required indentation of the lines within the nested group to two
spaces per nesting level.

Patterns within overlap groups are allowed to overlap.  Conflicts are
resolved by selecting the patterns in order.  If all of the fixedbits
for a pattern match, its translate function will be called.  If the
translate function returns false, then subsequent patterns within the
group will be matched.

Patterns within no-overlap groups are not allowed to overlap, just
the same as ungrouped patterns.  Thus no-overlap groups are intended
to be nested inside overlap groups.

The following example from PA-RISC shows specialization of the *or*
instruction::

  {
    {
      nop   000010 ----- ----- 0000 001001 0 00000
      copy  000010 00000 r1:5  0000 001001 0 rt:5
    }
    or      000010 rt2:5 r1:5  cf:4 001001 0 rt:5
  }

When the *cf* field is zero, the instruction has no side effects,
and may be specialized.  When the *rt* field is zero, the output
is discarded and so the instruction has no effect.  When the *rt2*
field is zero, the operation is ``reg[r1] | 0`` and so encodes
the canonical register copy operation.

The output from the generator might look like::

  switch (insn & 0xfc000fe0) {
  case 0x08000240:
    /* 000010.. ........ ....0010 010..... */
    if ((insn & 0x0000f000) == 0x00000000) {
        /* 000010.. ........ 00000010 010..... */
        if ((insn & 0x0000001f) == 0x00000000) {
            /* 000010.. ........ 00000010 01000000 */
            extract_decode_Fmt_0(&u.f_decode0, insn);
            if (trans_nop(ctx, &u.f_decode0)) return true;
        }
        if ((insn & 0x03e00000) == 0x00000000) {
            /* 00001000 000..... 00000010 010..... */
            extract_decode_Fmt_1(&u.f_decode1, insn);
            if (trans_copy(ctx, &u.f_decode1)) return true;
        }
    }
    extract_decode_Fmt_2(&u.f_decode2, insn);
    if (trans_or(ctx, &u.f_decode2)) return true;
    return false;
  }

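
The in-order fall-through behaviour of overlap groups can be tried out
with a small self-contained sketch modelled on the PA-RISC *or* example.
Everything named here (``DemoCtx``, the ``stub_*`` translators,
``make_insn``, ``demo_decode``, ``classify``) is hypothetical and exists
only for illustration.  The stubs take no argument structure, unlike real
``trans_*`` functions, and the decoder is hand-written rather than
generated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Which stub translator accepted the instruction (hypothetical). */
enum accepted { ACC_NONE, ACC_NOP, ACC_COPY, ACC_OR };

/* Hypothetical stand-in for DisasContext. */
typedef struct {
    enum accepted last;
    bool reject_copy;           /* force stub_copy() to return false */
} DemoCtx;

static bool stub_nop(DemoCtx *ctx) { ctx->last = ACC_NOP; return true; }
static bool stub_or(DemoCtx *ctx)  { ctx->last = ACC_OR;  return true; }

static bool stub_copy(DemoCtx *ctx)
{
    if (ctx->reject_copy) {
        return false;           /* decline; the decoder tries the next pattern */
    }
    ctx->last = ACC_COPY;
    return true;
}

/* Assemble 000010 rt2:5 r1:5 cf:4 001001 0 rt:5. */
static uint32_t make_insn(uint32_t rt2, uint32_t r1, uint32_t cf, uint32_t rt)
{
    assert(rt2 < 32 && r1 < 32 && cf < 16 && rt < 32);
    return 0x08000240u | rt2 << 21 | r1 << 16 | cf << 12 | rt;
}

/* Hand-written decoder: patterns are tried in the order they are listed. */
static bool demo_decode(DemoCtx *ctx, uint32_t insn)
{
    if ((insn & 0xfc000fe0u) != 0x08000240u) {
        return false;                           /* not in this group at all */
    }
    if ((insn & 0x0000f000u) == 0) {            /* cf == 0 */
        if ((insn & 0x0000001fu) == 0 && stub_nop(ctx)) {
            return true;                        /* rt == 0: nop */
        }
        if ((insn & 0x03e00000u) == 0 && stub_copy(ctx)) {
            return true;                        /* rt2 == 0: copy */
        }
    }
    return stub_or(ctx);                        /* general case */
}

/* Decode one instruction and report which stub accepted it. */
static enum accepted classify(uint32_t insn, bool reject_copy)
{
    DemoCtx ctx = { ACC_NONE, reject_copy };

    demo_decode(&ctx, insn);
    return ctx.last;
}
```

With ``reject_copy`` set, an instruction that matches the *copy* pattern
falls through to the *or* stub, mirroring the rule that a translate
function returning false lets later patterns in the group be tried.
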