========================
Decodetree Specification
========================

A *decodetree* is built from instruction *patterns*.  A pattern may
represent a single architectural instruction or a group of such
instructions, depending on what is convenient for further processing.

Each pattern has both *fixedbits* and *fixedmask*, the combination of which
describes the condition under which the pattern is matched::

  (insn & fixedmask) == fixedbits

Each pattern may have *fields*, which are extracted from the insn and
passed along to the translator.  Examples of such are registers,
immediates, and sub-opcodes.

In support of patterns, one may declare *fields*, *argument sets*, and
*formats*, each of which may be re-used to simplify further definitions.

Fields
======

Syntax::

  field_def     := '%' identifier ( field )* ( !function=identifier )?
  field         := unnamed_field | named_field
  unnamed_field := number ':' ( 's' )? number
  named_field   := identifier ':' ( 's' )? number

For *unnamed_field*, the first number is the least-significant bit position
of the field and the second number is the length of the field.
If the 's' is present, the field is considered signed.

A *named_field* refers to some other field in the instruction pattern
or format. Regardless of the length of the other field where it is
defined, it will be inserted into this field with the specified
signedness and bit width.

Field definitions that involve loops (i.e. where a field is defined
directly or indirectly in terms of itself) are errors.

A format can include fields that refer to named fields that are
defined in the instruction pattern(s) that use the format.
Conversely, an instruction pattern can include fields that refer to
named fields that are defined in the format it uses. However you
cannot currently do both at once (i.e. pattern P uses format F; F has
a field A that refers to a named field B that is defined in P, and P
has a field C that refers to a named field D that is defined in F).

If multiple ``fields`` are present, they are concatenated.
In this way one can define disjoint fields.

If ``!function`` is specified, the concatenated result is passed through the
named function, taking and returning an integral value.

One may use ``!function`` with zero ``fields``.  This case is called
a *parameter*, and the named function is only passed the ``DisasContext``
and returns an integral value extracted from there.
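
The extraction and concatenation rules above can be sketched in C. This is
a minimal illustration, not the code that decodetree emits: the helpers
``extract`` and ``sextract`` are simplified stand-ins, and the field
``%imm 20:s4 8:4`` is a hypothetical example chosen for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned extraction of `len` bits whose least-significant bit is `pos`. */
static uint32_t extract(uint32_t insn, int pos, int len)
{
    assert(pos >= 0 && len > 0 && len <= 32 - pos);
    return (insn >> pos) & (~0u >> (32 - len));
}

/* Signed extraction: the top bit of the extracted value is its sign bit. */
static int32_t sextract(uint32_t insn, int pos, int len)
{
    uint32_t v = extract(insn, pos, len);
    uint32_t sign = 1u << (len - 1);
    /* Flip the sign bit, then subtract it; done in 64 bits to stay in range. */
    return (int32_t)((int64_t)(v ^ sign) - (int64_t)sign);
}

/*
 * Hypothetical field "%imm 20:s4 8:4": the first sub-field is the most
 * significant part and carries the sign, the second supplies the low bits.
 */
static int32_t field_imm(uint32_t insn)
{
    int32_t hi = sextract(insn, 20, 4);
    int32_t lo = (int32_t)extract(insn, 8, 4);
    /* Equivalent to (hi << 4) | lo, written without shifting a negative value. */
    return hi * 16 + lo;
}
```

For the instruction word ``0x00F00300``, bits 23:20 hold ``0xF`` (read as -1)
and bits 11:8 hold ``0x3``, so the sketch yields -13, which is the 8-bit value
``0xF3`` interpreted as signed.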

A field with no ``fields`` and no ``!function`` is in error.

Field examples:

+---------------------------+---------------------------------------------+
| Input                     | Generated code                              |
+===========================+=============================================+
| %disp   0:s16             | sextract(i, 0, 16)                          |
+---------------------------+---------------------------------------------+
| %imm9   16:6 10:3         | extract(i, 16, 6) << 3 | extract(i, 10, 3)  |
+---------------------------+---------------------------------------------+
| %disp12 0:s1 1:1 2:10     | sextract(i, 0, 1) << 11 |                   |
|                           |    extract(i, 1, 1) << 10 |                 |
|                           |    extract(i, 2, 10)                        |
+---------------------------+---------------------------------------------+
| %shimm8 5:s8 13:1         | expand_shimm8(sextract(i, 5, 8) << 1 |      |
|   !function=expand_shimm8 |               extract(i, 13, 1))            |
+---------------------------+---------------------------------------------+
| %sz_imm 10:2 sz:3         | expand_sz_imm(extract(i, 10, 2) << 3 |      |
|   !function=expand_sz_imm |               extract(a->sz, 0, 3))         |
+---------------------------+---------------------------------------------+

Argument Sets
=============

Syntax::

  args_def    := '&' identifier ( args_elt )+ ( !extern )?
  args_elt    := identifier (':' identifier)?

Each *args_elt* defines an argument within the argument set.
If the form of the *args_elt* contains a colon, the first
identifier is the argument name and the second identifier is
the argument type.  If the colon is missing, the argument
type will be ``int``.

Each argument set will be rendered as a C structure "arg_$name"
with each of the fields being one of the member arguments.

If ``!extern`` is specified, the backing structure is assumed
to have been already declared, typically via a second decoder.

Argument sets are useful when one wants to define helper functions
for the translator functions that can perform operations on a common
set of arguments.  This can ensure, for instance, that the ``AND``
pattern and the ``OR`` pattern put their operands into the same named
structure, so that a common ``gen_logic_insn`` may be able to handle
the operations common between the two.
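
As a rough illustration of the rendering and of the shared-helper idea
described above, the sketch below shows what the structures for two argument
sets might look like and how ``AND`` and ``OR`` could share one helper. The
argument set ``&logic rd ra rb``, the toy ``DisasContext`` with a register
file, and the ``op_and``/``op_or`` callbacks are assumptions made for this
example; a real translator would emit code rather than compute results
directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NREGS 8

/* Hypothetical stand-in for the translator context: a toy register file. */
typedef struct DisasContext {
    uint32_t regs[NREGS];
} DisasContext;

/* Possible rendering of "&logic rd ra rb": untyped arguments become int. */
typedef struct {
    int rd;
    int ra;
    int rb;
} arg_logic;

/* Possible rendering of "&longldst reg base offset:int64_t". */
typedef struct {
    int reg;
    int base;
    int64_t offset;
} arg_longldst;

typedef uint32_t (*LogicFn)(uint32_t x, uint32_t y);

static uint32_t op_and(uint32_t x, uint32_t y) { return x & y; }
static uint32_t op_or(uint32_t x, uint32_t y) { return x | y; }

/* One helper serves both patterns because they share the arg_logic layout. */
static bool gen_logic_insn(DisasContext *ctx, const arg_logic *a, LogicFn fn)
{
    assert(a->rd >= 0 && a->rd < NREGS);
    assert(a->ra >= 0 && a->ra < NREGS);
    assert(a->rb >= 0 && a->rb < NREGS);
    ctx->regs[a->rd] = fn(ctx->regs[a->ra], ctx->regs[a->rb]);
    return true;
}

static bool trans_and(DisasContext *ctx, arg_logic *a)
{
    return gen_logic_insn(ctx, a, op_and);
}

static bool trans_or(DisasContext *ctx, arg_logic *a)
{
    return gen_logic_insn(ctx, a, op_or);
}
```

Because both translator functions receive the same structure type, neither
needs to repack its operands before calling the shared helper.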

Argument set examples::

  &reg3       ra rb rc
  &loadstore  reg base offset
  &longldst   reg base offset:int64_t


Formats
=======

Syntax::

  fmt_def      := '@' identifier ( fmt_elt )+
  fmt_elt      := fixedbit_elt | field_elt | field_ref | args_ref
  fixedbit_elt := [01.-]+
  field_elt    := identifier ':' 's'? number
  field_ref    := '%' identifier | identifier '=' '%' identifier
  args_ref     := '&' identifier

Defining a format is a handy way to avoid replicating groups of fields
across many instruction patterns.

A *fixedbit_elt* describes a contiguous sequence of bits that must
be 1, 0, or don't care.  The difference between '.' and '-'
is that '.' means that the bit will be covered with a field or a
final 0 or 1 from the pattern, and '-' means that the bit is really
ignored by the cpu and will not be specified.

A *field_elt* describes a simple field only given a width; the position of
the field is implied by its position with respect to other *fixedbit_elt*
and *field_elt*.
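
To make the implied positions concrete, the sketch below walks the elements
of a format from the most significant bit downward and assigns each one its
least-significant bit position. The ``FmtElt`` type, the ``assign_positions``
helper, and the format ``@rrr 000000 rd:5 ra:5 rb:5 ...........`` are
hypothetical, and a 32-bit instruction width is assumed; this is not how the
decodetree script itself is written.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;   /* field name, or NULL for a run of fixed bits */
    int width;          /* number of bits the element occupies */
    int pos;            /* computed least-significant bit position */
} FmtElt;

/*
 * Elements are listed left to right, so the first one occupies the most
 * significant bits.  Returns the number of bits left uncovered: 0 when the
 * elements fill the instruction exactly, negative when they overflow it.
 */
static int assign_positions(FmtElt *elts, size_t n, int insn_width)
{
    int next = insn_width;

    for (size_t i = 0; i < n; i++) {
        assert(elts[i].width > 0);
        next -= elts[i].width;
        elts[i].pos = next;
    }
    return next;
}
```

For the hypothetical ``@rrr`` format the sketch places ``rd`` at bit 21,
``ra`` at bit 16 and ``rb`` at bit 11, and returns 0 because the widths add
up to 32.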

If any *fixedbit_elt* or *field_elt* appear, then all bits must be defined.
Padding with a *fixedbit_elt* of all '.' is an easy way to accomplish that.

A *field_ref* incorporates a field by reference.  This is the only way to
add a complex field to a format.  A field may be renamed in the process
via assignment to another identifier.  This is intended to allow the
same argument set to be used with disjoint named fields.

A single *args_ref* may specify an argument set to use for the format.
The set of fields in the format must be a subset of the arguments in
the argument set.  If an argument set is not specified, one will be
inferred from the set of fields.

It is recommended, but not required, that all *field_ref* and *args_ref*
appear at the end of the line, not interleaving with *fixedbit_elt* or
*field_elt*.

Format examples::

  @opr    ...... ra:5 rb:5 ... 0 ....... rc:5
  @opi    ...... ra:5 lit:8    1 ....... rc:5

Patterns
========

Syntax::

  pat_def      := identifier ( pat_elt )+
  pat_elt      := fixedbit_elt | field_elt | field_ref | args_ref | fmt_ref | const_elt
  fmt_ref      := '@' identifier
  const_elt    := identifier '=' number

The *fixedbit_elt* and *field_elt* specifiers are unchanged from formats.
A pattern that does not specify a named format will have one inferred
from a referenced argument set (if present) and the set of fields.

A *const_elt* allows an argument to be set to a constant value.  This may
come in handy when fields overlap between patterns and one has to
include the values in the *fixedbit_elt* instead.

The decoder will call a translator function for each pattern matched.

Pattern examples::

  addl_r   010000 ..... ..... .... 0000000 ..... @opr
  addl_i   010000 ..... ..... .... 0000000 ..... @opi

which will, in part, invoke::

  trans_addl_r(ctx, &arg_opr, insn)

and::

  trans_addl_i(ctx, &arg_opi, insn)

Pattern Groups
==============

Syntax::

  group            := overlap_group | no_overlap_group
  overlap_group    := '{' ( pat_def | group )+ '}'
  no_overlap_group := '[' ( pat_def | group )+ ']'

A *group* begins with a lone open-brace or open-bracket, with all
subsequent lines indented two spaces, and ending with a lone
close-brace or close-bracket.  Groups may be nested, increasing the
required indentation of the lines within the nested group to two
spaces per nesting level.

Patterns within overlap groups are allowed to overlap.  Conflicts are
resolved by selecting the patterns in order.  If all of the fixedbits
for a pattern match, its translate function will be called.  If the
translate function returns false, then subsequent patterns within the
group will be matched.

Patterns within no-overlap groups are not allowed to overlap, just
the same as ungrouped patterns.
Thus no-overlap groups are intended
to be nested inside overlap groups.

The following example from PA-RISC shows specialization of the *or*
instruction::

  {
    {
      nop   000010 ----- ----- 0000 001001 0 00000
      copy  000010 00000 r1:5  0000 001001 0 rt:5
    }
    or      000010 rt2:5 r1:5  cf:4 001001 0 rt:5
  }

When the *cf* field is zero, the instruction has no side effects,
and may be specialized.  When the *rt* field is zero, the output
is discarded and so the instruction has no effect.  When the *rt2*
field is zero, the operation is ``reg[r1] | 0`` and so encodes
the canonical register copy operation.

The output from the generator might look like::

  switch (insn & 0xfc000fe0) {
  case 0x08000240:
    /* 000010.. ........ ....0010 010..... */
    if ((insn & 0x0000f000) == 0x00000000) {
        /* 000010.. ........ 00000010 010..... */
        if ((insn & 0x0000001f) == 0x00000000) {
            /* 000010.. ........ 00000010 01000000 */
            extract_decode_Fmt_0(&u.f_decode0, insn);
            if (trans_nop(ctx, &u.f_decode0)) return true;
        }
        if ((insn & 0x03e00000) == 0x00000000) {
            /* 00001000 000..... 00000010 010..... */
            extract_decode_Fmt_1(&u.f_decode1, insn);
            if (trans_copy(ctx, &u.f_decode1)) return true;
        }
    }
    extract_decode_Fmt_2(&u.f_decode2, insn);
    if (trans_or(ctx, &u.f_decode2)) return true;
    return false;
  }
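
The ordering rule for overlap groups can also be sketched as a linear search.
The example below flattens the PA-RISC group into the ordered list nop, copy,
or, with masks derived from the patterns above. The ``Pattern`` table, the
``Which`` enum, the ``reject_copy`` flag, and the toy ``DisasContext`` are
assumptions for illustration; the real generator emits nested
``switch``/``if`` code rather than a table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { NONE, NOP, COPY, OR } Which;

/* Hypothetical context: records which translator accepted the insn. */
typedef struct DisasContext {
    Which last;
    bool reject_copy;   /* makes trans_copy decline, to show fall-through */
} DisasContext;

typedef bool (*TransFn)(DisasContext *ctx, uint32_t insn);

typedef struct {
    uint32_t fixedmask;
    uint32_t fixedbits;
    TransFn trans;
} Pattern;

static bool trans_nop(DisasContext *ctx, uint32_t insn)
{
    (void)insn;
    ctx->last = NOP;
    return true;
}

static bool trans_copy(DisasContext *ctx, uint32_t insn)
{
    (void)insn;
    if (ctx->reject_copy) {
        return false;   /* declining lets later patterns in the group try */
    }
    ctx->last = COPY;
    return true;
}

static bool trans_or(DisasContext *ctx, uint32_t insn)
{
    (void)insn;
    ctx->last = OR;
    return true;
}

/* Patterns in group order: most specialized first, general form last. */
static const Pattern or_group[] = {
    { 0xfc00ffff, 0x08000240, trans_nop  },   /* cf == 0 and rt == 0  */
    { 0xffe0ffe0, 0x08000240, trans_copy },   /* cf == 0 and rt2 == 0 */
    { 0xfc000fe0, 0x08000240, trans_or   },   /* any cf, rt2, r1, rt  */
};

static bool decode_group(DisasContext *ctx, uint32_t insn,
                         const Pattern *pats, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* fixedbits may only set bits that fixedmask actually tests. */
        assert((pats[i].fixedbits & ~pats[i].fixedmask) == 0);
        if ((insn & pats[i].fixedmask) == pats[i].fixedbits
            && pats[i].trans(ctx, insn)) {
            return true;
        }
    }
    return false;
}
```

With ``reject_copy`` set, an encoding that matches the ``copy`` pattern is
handed on to ``trans_or``, which mirrors the rule that a translate function
returning false lets subsequent patterns in the group be tried.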