========================
Decodetree Specification
========================

A *decodetree* is built from instruction *patterns*. A pattern may
represent a single architectural instruction or a group of such
instructions, depending on what is convenient for further processing.

Each pattern has both *fixedbits* and *fixedmask*, the combination of which
describes the condition under which the pattern is matched::

    (insn & fixedmask) == fixedbits

Each pattern may have *fields*, which are extracted from the insn and
passed along to the translator. Examples of such are registers,
immediates, and sub-opcodes.

In support of patterns, one may declare *fields*, *argument sets*, and
*formats*, each of which may be re-used to simplify further definitions.

Fields
======

Syntax::

    field_def     := '%' identifier ( field )* ( !function=identifier )?
    field         := unnamed_field | named_field
    unnamed_field := number ':' ( 's' )? number
    named_field   := identifier ':' ( 's' )? number

For *unnamed_field*, the first number is the least-significant bit position
of the field and the second number is the length of the field. If the 's' is
present, the field is considered signed.

A *named_field* refers to some other field in the instruction pattern
or format. Regardless of the length of the other field where it is
defined, it will be inserted into this field with the specified
signedness and bit width.

Field definitions that involve loops (i.e. where a field is defined
directly or indirectly in terms of itself) are errors.

A format can include fields that refer to named fields that are
defined in the instruction pattern(s) that use the format.
Conversely, an instruction pattern can include fields that refer to
named fields that are defined in the format it uses. However, you
cannot currently do both at once (i.e. pattern P uses format F; F has
a field A that refers to a named field B that is defined in P, and P
has a field C that refers to a named field D that is defined in F).

If multiple ``fields`` are present, they are concatenated.
In this way one can define disjoint fields.

If ``!function`` is specified, the concatenated result is passed through the
named function, taking and returning an integral value.

One may use ``!function`` with zero ``fields``. This case is called
a *parameter*, and the named function is only passed the ``DisasContext``
and returns an integral value extracted from there.

A field with no ``fields`` and no ``!function`` is in error.

Field examples:

+---------------------------+---------------------------------------------+
| Input                     | Generated code                              |
+===========================+=============================================+
| %disp   0:s16             | sextract(i, 0, 16)                          |
+---------------------------+---------------------------------------------+
| %imm9   16:6 10:3         | extract(i, 16, 6) << 3 | extract(i, 10, 3)  |
+---------------------------+---------------------------------------------+
| %disp12 0:s1 1:1 2:10     | sextract(i, 0, 1) << 11 |                   |
|                           |    extract(i, 1, 1) << 10 |                 |
|                           |    extract(i, 2, 10)                        |
+---------------------------+---------------------------------------------+
| %shimm8 5:s8 13:1         | expand_shimm8(sextract(i, 5, 8) << 1 |      |
|   !function=expand_shimm8 |               extract(i, 13, 1))            |
+---------------------------+---------------------------------------------+
| %sz_imm 10:2 sz:3         | expand_sz_imm(extract(i, 10, 2) << 3 |      |
|   !function=expand_sz_imm |               extract(a->sz, 0, 3))         |
+---------------------------+---------------------------------------------+

Argument Sets
=============

Syntax::

    args_def := '&' identifier ( args_elt )+ ( !extern )?
    args_elt := identifier (':' identifier)?

Each *args_elt* defines an argument within the argument set.
If the form of the *args_elt* contains a colon, the first
identifier is the argument name and the second identifier is
the argument type. If the colon is missing, the argument
type will be ``int``.

Each argument set will be rendered as a C structure "arg_$name"
with each of the fields being one of the member arguments.

If ``!extern`` is specified, the backing structure is assumed
to have been already declared, typically via a second decoder.

Argument sets are useful when one wants to define helper functions
for the translator functions that can perform operations on a common
set of arguments. This can ensure, for instance, that the ``AND``
pattern and the ``OR`` pattern put their operands into the same named
structure, so that a common ``gen_logic_insn`` may be able to handle
the operations common between the two.

Argument set examples::

    &reg3       ra rb rc
    &loadstore  reg base offset
    &longldst   reg base offset:int64_t


Formats
=======

Syntax::

    fmt_def      := '@' identifier ( fmt_elt )+
    fmt_elt      := fixedbit_elt | field_elt | field_ref | args_ref
    fixedbit_elt := [01.-]+
    field_elt    := identifier ':' 's'? number
    field_ref    := '%' identifier | identifier '=' '%' identifier
    args_ref     := '&' identifier

Defining a format is a handy way to avoid replicating groups of fields
across many instruction patterns.

A *fixedbit_elt* describes a contiguous sequence of bits that must
be 1, 0, or don't care. The difference between '.' and '-'
is that '.' means that the bit will be covered with a field or a
final 0 or 1 from the pattern, and '-' means that the bit is really
ignored by the CPU and will not be specified.

A *field_elt* describes a simple field only given a width; the position of
the field is implied by its position with respect to other *fixedbit_elt*
and *field_elt*.
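
To illustrate how the position of a *field_elt* is implied, here is a
minimal C sketch that walks a format body from the most-significant bit
downwards and reports where a named field lands. The function name
``implied_field_pos`` and its ``-1`` error convention are assumptions made
for this example; they are not part of decodetree itself, and *field_ref*
and *args_ref* elements are deliberately not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch only: return the least-significant bit position implied for the
 * field `name` in the format body `fmt`, or -1 if the field is absent,
 * the text is malformed, or the elements do not add up to insn_width bits.
 */
static int implied_field_pos(const char *fmt, int insn_width, const char *name)
{
    int bit = insn_width;   /* bits not yet assigned, counting down from the MSB */
    int found = -1;

    assert(insn_width > 0);
    while (*fmt != '\0') {
        if (isspace((unsigned char)*fmt)) {
            fmt++;
        } else if (strchr("01.-", *fmt) != NULL) {
            bit--;          /* fixedbit_elt: one bit per character */
            fmt++;
        } else {
            /* field_elt: identifier ':' 's'? number */
            size_t k = 0;
            const char *num;
            char *end;
            long len;

            while (isalnum((unsigned char)fmt[k]) || fmt[k] == '_') {
                k++;
            }
            if (k == 0 || fmt[k] != ':') {
                return -1;
            }
            num = fmt + k + 1;
            if (*num == 's') {
                num++;      /* signedness does not change the position */
            }
            len = strtol(num, &end, 10);
            if (end == num || len <= 0 || len > bit) {
                return -1;
            }
            bit -= (int)len;            /* field occupies [bit, bit + len) */
            if (strlen(name) == k && strncmp(fmt, name, k) == 0) {
                found = bit;
            }
            fmt = end;
        }
    }
    /* Every bit of the instruction must be accounted for. */
    return bit == 0 ? found : -1;
}
```

For the 32-bit format body ``...... ra:5 rb:5 ... 0 ....... rc:5`` this
yields bit 21 for ``ra``, bit 16 for ``rb`` and bit 0 for ``rc``. A body
that does not cover all 32 bits is rejected.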

If any *fixedbit_elt* or *field_elt* appear, then all bits must be defined.
Padding with a *fixedbit_elt* of all '.' is an easy way to accomplish that.

A *field_ref* incorporates a field by reference. This is the only way to
add a complex field to a format. A field may be renamed in the process
via assignment to another identifier. This is intended to allow the
same argument set to be used with disjoint named fields.

A single *args_ref* may specify an argument set to use for the format.
The set of fields in the format must be a subset of the arguments in
the argument set. If an argument set is not specified, one will be
inferred from the set of fields.

It is recommended, but not required, that all *field_ref* and *args_ref*
appear at the end of the line, not interleaving with *fixedbit_elt* or
*field_elt*.

Format examples::

    @opr    ...... ra:5 rb:5 ... 0 ....... rc:5
    @opi    ...... ra:5 lit:8    1 ....... rc:5

Patterns
========

Syntax::

    pat_def      := identifier ( pat_elt )+
    pat_elt      := fixedbit_elt | field_elt | field_ref | args_ref | fmt_ref | const_elt
    fmt_ref      := '@' identifier
    const_elt    := identifier '=' number

The *fixedbit_elt* and *field_elt* specifiers are unchanged from formats.
A pattern that does not specify a named format will have one inferred
from a referenced argument set (if present) and the set of fields.

A *const_elt* allows an argument to be set to a constant value. This may
come in handy when fields overlap between patterns and one has to
include the values in the *fixedbit_elt* instead.

The decoder will call a translator function for each pattern matched.

Pattern examples::

    addl_r   010000 ..... ..... .... 0000000 ..... @opr
    addl_i   010000 ..... ..... .... 0000000 ..... @opi

which will, in part, invoke::

    trans_addl_r(ctx, &arg_opr, insn)

and::

    trans_addl_i(ctx, &arg_opi, insn)

Pattern Groups
==============

Syntax::

    group            := overlap_group | no_overlap_group
    overlap_group    := '{' ( pat_def | group )+ '}'
    no_overlap_group := '[' ( pat_def | group )+ ']'

A *group* begins with a lone open-brace or open-bracket, with all
subsequent lines indented two spaces, and ending with a lone
close-brace or close-bracket. Groups may be nested, increasing the
required indentation of the lines within the nested group to two
spaces per nesting level.

Patterns within overlap groups are allowed to overlap. Conflicts are
resolved by selecting the patterns in order. If all of the fixedbits
for a pattern match, its translate function will be called. If the
translate function returns false, then subsequent patterns within the
group will be matched.

Patterns within no-overlap groups are not allowed to overlap, just
the same as ungrouped patterns. Thus no-overlap groups are intended
to be nested inside overlap groups.

The following example from PA-RISC shows specialization of the *or*
instruction::

    {
      {
        nop   000010 ----- ----- 0000 001001 0 00000
        copy  000010 00000 r1:5  0000 001001 0 rt:5
      }
      or      000010 rt2:5 r1:5  cf:4 001001 0 rt:5
    }

When the *cf* field is zero, the instruction has no side effects,
and may be specialized. When the *rt* field is zero, the output
is discarded and so the instruction has no effect. When the *rt2*
field is zero, the operation is ``reg[r1] | 0`` and so encodes
the canonical register copy operation.

The output from the generator might look like::

    switch (insn & 0xfc000fe0) {
    case 0x08000240:
      /* 000010.. ........ ....0010 010..... */
      if ((insn & 0x0000f000) == 0x00000000) {
          /* 000010.. ........ 00000010 010..... */
          if ((insn & 0x0000001f) == 0x00000000) {
              /* 000010.. ........ 00000010 01000000 */
              extract_decode_Fmt_0(&u.f_decode0, insn);
              if (trans_nop(ctx, &u.f_decode0)) return true;
          }
          if ((insn & 0x03e00000) == 0x00000000) {
              /* 00001000 000..... 00000010 010..... */
              extract_decode_Fmt_1(&u.f_decode1, insn);
              if (trans_copy(ctx, &u.f_decode1)) return true;
          }
      }
      extract_decode_Fmt_2(&u.f_decode2, insn);
      if (trans_or(ctx, &u.f_decode2)) return true;
      return false;
    }
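
The fall-through behaviour of the overlap group can be tried out with a
small, self-contained C sketch. It reuses the masks from the output above,
but the ``DemoContext`` type, the ``PAT_*`` constants, the single-argument
``trans_*`` stubs and the ``accepted_pattern`` wrapper are assumptions made
for this example. A real decoder also extracts the argument structures and
passes them to the translator functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical identifiers for this sketch; not produced by the generator. */
enum { PAT_NONE, PAT_NOP, PAT_COPY, PAT_OR };

typedef struct {
    int accepted;         /* which pattern's hook accepted the insn */
    bool copy_declines;   /* when set, the copy hook returns false */
} DemoContext;

static bool trans_nop(DemoContext *ctx)
{
    ctx->accepted = PAT_NOP;
    return true;
}

static bool trans_copy(DemoContext *ctx)
{
    if (ctx->copy_declines) {
        return false;     /* declined: decoding continues with "or" */
    }
    ctx->accepted = PAT_COPY;
    return true;
}

static bool trans_or(DemoContext *ctx)
{
    ctx->accepted = PAT_OR;
    return true;
}

/* Same mask tests as the generated output, without argument extraction. */
static bool decode(DemoContext *ctx, uint32_t insn)
{
    if ((insn & 0xfc000fe0) != 0x08000240) {
        return false;                                       /* other opcode */
    }
    if ((insn & 0x0000f000) == 0) {                         /* cf == 0 */
        if ((insn & 0x0000001f) == 0 && trans_nop(ctx)) {   /* rt == 0 */
            return true;
        }
        if ((insn & 0x03e00000) == 0 && trans_copy(ctx)) {  /* rt2 == 0 */
            return true;
        }
    }
    return trans_or(ctx);
}

/* Decode one insn and report which pattern accepted it. */
static int accepted_pattern(uint32_t insn, bool copy_declines)
{
    DemoContext ctx = { PAT_NONE, copy_declines };
    bool ok = decode(&ctx, insn);

    /* decode() succeeds exactly when some hook has recorded itself. */
    assert(ok == (ctx.accepted != PAT_NONE));
    return ctx.accepted;
}
```

With these stubs, ``accepted_pattern(0x08030245, false)`` selects ``copy``
(``cf`` and ``rt2`` are zero, ``rt`` is 5), while the same instruction with
``copy_declines`` set falls through to ``or``.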