|
|	round.sa 3.4 7/29/91
|
|	handle rounding and normalization tasks
|
|
|
|		Copyright (C) Motorola, Inc. 1990
|			All Rights Reserved
|
|       For details on the license for this file, please see the
|       file, README, in this same directory.

|ROUND	idnt    2,1 | Motorola 040 Floating Point Software Package

	|section	8

#include "fpsp.h"

|
|	round --- round result according to precision/mode
|
|	a0 points to the input operand in the internal extended format
|	d1(high word) contains rounding precision:
|		ext = $0000xxxx
|		sgl = $0001xxxx
|		dbl = $0002xxxx
|	d1(low word) contains rounding mode:
|		RN  = $xxxx0000
|		RZ  = $xxxx0001
|		RM  = $xxxx0002
|		RP  = $xxxx0003
|	d0{31:29} contains the g,r,s bits (extended)
|
|	On return the value pointed to by a0 is correctly rounded,
|	a0 is preserved and the g-r-s bits in d0 are cleared.
|	The result is not typed - the tag field is invalid.  The
|	result is still in the internal extended format.
|
|	The INEX bit of USER_FPSR will be set if the rounded result was
|	inexact (i.e. if any of the g-r-s bits were set).
|

	.global	round
round:
| If g=r=s=0 then result is exact and round is done, else set
| the inex flag in status reg and continue.
|
	bsrs	ext_grs			|this subroutine looks at the
|					;rounding precision and sets
|					;the appropriate g-r-s bits.
	tstl	%d0			|if grs are zero, go force
	bne	rnd_cont		|lower bits to zero for size

	swap	%d1			|set up d1.w for round prec.
	bra	truncate

rnd_cont:
|
| Use rounding mode as an index into a jump table for these modes.
|
	orl	#inx2a_mask,USER_FPSR(%a6)	|set inex2/ainex
	lea	mode_tab,%a1
	movel	(%a1,%d1.w*4),%a1
	jmp	(%a1)
|
| Jump table indexed by rounding mode in d1.w.  All following assumes
| grs != 0.
|
mode_tab:
	.long	rnd_near
	.long	rnd_zero
	.long	rnd_mnus
	.long	rnd_plus
|
|	ROUND PLUS INFINITY
|
|	If sign of fp number = 0 (positive), then add 1 to l.
|
rnd_plus:
	swap	%d1			|set up d1 for round prec.
	tstb	LOCAL_SGN(%a0)		|check for sign
	bmi	truncate		|if negative then truncate
	movel	#0xffffffff,%d0		|force g,r,s to be all f's
	lea	add_to_l,%a1
	movel	(%a1,%d1.w*4),%a1
	jmp	(%a1)
|
|	ROUND MINUS INFINITY
|
|	If sign of fp number = 1 (negative), then add 1 to l.
|
rnd_mnus:
	swap	%d1			|set up d1 for round prec.
	tstb	LOCAL_SGN(%a0)		|check for sign
	bpl	truncate		|if positive then truncate
	movel	#0xffffffff,%d0		|force g,r,s to be all f's
	lea	add_to_l,%a1
	movel	(%a1,%d1.w*4),%a1
	jmp	(%a1)
|
|	ROUND ZERO
|
|	Always truncate.
rnd_zero:
	swap	%d1			|set up d1 for round prec.
	bra	truncate
|
|
|	ROUND NEAREST
|
|	If (g=1), then add 1 to l and if (r=s=0), then clear l
|	Note that this will round to even in case of a tie.
|
rnd_near:
	swap	%d1			|set up d1 for round prec.
	asll	#1,%d0			|shift g-bit to c-bit
	bcc	truncate		|if (g=1) then
	lea	add_to_l,%a1
	movel	(%a1,%d1.w*4),%a1
	jmp	(%a1)

|
|	ext_grs --- extract guard, round and sticky bits
|
| Input:	d1 =		PREC:ROUND
| Output:	d0{31:29}=	guard, round, sticky
|
| ext_grs extracts the guard/round/sticky bits according to the
| selected rounding precision. It is called by the round subroutine
| only.  All registers except d0 are kept intact. On return,
| d0{31:29} holds the updated guard, round and sticky bits.
|
| Notes: ext_grs uses the round PREC, and therefore has to swap d1
|	 prior to usage, and needs to restore d1 to original.
|
ext_grs:
	swap	%d1			|have d1.w point to round precision
	cmpiw	#0,%d1
	bnes	sgl_or_dbl
	bras	end_ext_grs

sgl_or_dbl:
	moveml	%d2/%d3,-(%a7)		|make some temp registers
	cmpiw	#1,%d1
	bnes	grs_dbl
grs_sgl:
	bfextu	LOCAL_HI(%a0){#24:#2},%d3	|sgl prec. g-r are 2 bits right
	movel	#30,%d2			|of the sgl prec. limits
	lsll	%d2,%d3			|shift g-r bits to MSB of d3
	movel	LOCAL_HI(%a0),%d2	|get word 2 for s-bit test
	andil	#0x0000003f,%d2		|s bit is the or of all other
	bnes	st_stky			|bits to the right of g-r
	tstl	LOCAL_LO(%a0)		|test lower mantissa
	bnes	st_stky			|if any are set, set sticky
	tstl	%d0			|test original g,r,s
	bnes	st_stky			|if any are set, set sticky
	bras	end_sd			|if words 3 and 4 are clr, exit
grs_dbl:
	bfextu	LOCAL_LO(%a0){#21:#2},%d3	|dbl-prec. g-r are 2 bits right
	movel	#30,%d2			|of the dbl prec. limits
	lsll	%d2,%d3			|shift g-r bits to the MSB of d3
	movel	LOCAL_LO(%a0),%d2	|get lower mantissa for s-bit test
	andil	#0x000001ff,%d2		|s bit is the or-ing of all
	bnes	st_stky			|other bits to the right of g-r
	tstl	%d0			|test word original g,r,s
	bnes	st_stky			|if any are set, set sticky
	bras	end_sd			|if clear, exit
st_stky:
	bset	#rnd_stky_bit,%d3
end_sd:
	movel	%d3,%d0			|return grs to d0
	moveml	(%a7)+,%d2/%d3		|restore scratch registers
end_ext_grs:
	swap	%d1			|restore d1 to original
	rts

|******************* Local Equates
	.set	ad_1_sgl,0x00000100	| constant to add 1 to l-bit in sgl prec
	.set	ad_1_dbl,0x00000800	| constant to add 1 to l-bit in dbl prec


|Jump table for adding 1 to the l-bit indexed by rnd prec

add_to_l:
	.long	add_ext
	.long	add_sgl
	.long	add_dbl
	.long	add_dbl
|
|	ADD SINGLE
|
add_sgl:
	addl	#ad_1_sgl,LOCAL_HI(%a0)
	bccs	scc_clr			|no mantissa overflow
	roxrw	LOCAL_HI(%a0)		|shift v-bit back in
	roxrw	LOCAL_HI+2(%a0)		|shift v-bit back in
	addw	#0x1,LOCAL_EX(%a0)	|and incr exponent
scc_clr:
	tstl	%d0			|test for rs = 0
	bnes	sgl_done
	andiw	#0xfe00,LOCAL_HI+2(%a0)	|clear the l-bit
sgl_done:
	andil	#0xffffff00,LOCAL_HI(%a0)	|truncate bits beyond sgl limit
	clrl	LOCAL_LO(%a0)		|clear ls mant
	rts

|
|	ADD EXTENDED
|
add_ext:
	addql	#1,LOCAL_LO(%a0)	|add 1 to l-bit
	bccs	xcc_clr			|test for carry out
	addql	#1,LOCAL_HI(%a0)	|propagate carry
	bccs	xcc_clr
	roxrw	LOCAL_HI(%a0)		|mant is 0 so restore v-bit
	roxrw	LOCAL_HI+2(%a0)		|mant is 0 so restore v-bit
	roxrw	LOCAL_LO(%a0)
	roxrw	LOCAL_LO+2(%a0)
	addw	#0x1,LOCAL_EX(%a0)	|and inc exp
xcc_clr:
	tstl	%d0			|test rs = 0
	bnes	add_ext_done
	andib	#0xfe,LOCAL_LO+3(%a0)	|clear the l bit
add_ext_done:
	rts
|
|	ADD DOUBLE
|
add_dbl:
	addl	#ad_1_dbl,LOCAL_LO(%a0)
	bccs	dcc_clr
	addql	#1,LOCAL_HI(%a0)	|propagate carry
	bccs	dcc_clr
	roxrw	LOCAL_HI(%a0)		|mant is 0 so restore v-bit
	roxrw	LOCAL_HI+2(%a0)		|mant is 0 so restore v-bit
	roxrw	LOCAL_LO(%a0)
	roxrw	LOCAL_LO+2(%a0)
	addw	#0x1,LOCAL_EX(%a0)	|incr exponent
dcc_clr:
	tstl	%d0			|test for rs = 0
	bnes	dbl_done
	andiw	#0xf000,LOCAL_LO+2(%a0)	|clear the l-bit

dbl_done:
	andil	#0xfffff800,LOCAL_LO(%a0)	|truncate bits beyond dbl limit
	rts

error:
	rts
|
| Truncate all other bits
|
trunct:
	.long	end_rnd
	.long	sgl_done
	.long	dbl_done
	.long	dbl_done

truncate:
	lea	trunct,%a1
	movel	(%a1,%d1.w*4),%a1
	jmp	(%a1)

end_rnd:
	rts

|
|	NORMALIZE
|
| These routines (nrm_zero & nrm_set) normalize the unnorm.  This
| is done by shifting the mantissa left while decrementing the
| exponent.
|
| NRM_SET shifts and decrements until there is a 1 set in the integer
| bit of the mantissa (msb in d1).
|
| NRM_ZERO shifts and decrements until there is a 1 set in the integer
| bit of the mantissa (msb in d1) unless this would mean the exponent
| would go less than 0.  In that case the number becomes a denorm - the
| exponent (d0) is set to 0 and the mantissa (d1 & d2) is not
| normalized.
|
| Note that both routines have been optimized (for the worst case) and
| therefore do not have the easy to follow decrement/shift loop.
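|
| For comparison, the plain loop that nrm_set is equivalent to could
| be sketched as follows.  This is pseudocode added for illustration
| only, not code from the original package; "exp" and "mant" are
| hypothetical names for LOCAL_EX and the 64-bit LOCAL_HI:LOCAL_LO
| mantissa, and mant is assumed to be nonzero:
|
|	while (bit 63 of mant == 0) {
|		mant = mant << 1
|		exp  = exp - 1
|	}
|
| The optimized code reaches the same result by counting the leading
| zeros once with bfffo and shifting by that count in a single step.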
|
|	NRM_ZERO
|
|	Distance to first 1 bit in mantissa = X
|	Distance to 0 from exponent = Y
|	If X < Y
|	Then
|	  nrm_set
|	Else
|	  shift mantissa by Y
|	  set exponent = 0
|
|input:
|	FP_SCR1 = exponent, ms mantissa part, ls mantissa part
|output:
|	L_SCR1{4} = fpte15 or ete15 bit
|
	.global	nrm_zero
nrm_zero:
	movew	LOCAL_EX(%a0),%d0
	cmpw	#64,%d0		|see if exp >= 64
	bmis	d0_less
	bsr	nrm_set		|exp >= 64 so exp won't go past 0
	rts
d0_less:
	moveml	%d2/%d3/%d5/%d6,-(%a7)
	movel	LOCAL_HI(%a0),%d1
	movel	LOCAL_LO(%a0),%d2

	bfffo	%d1{#0:#32},%d3	|get the distance to the first 1
|				;in ms mant
	beqs	ms_clr		|branch if no bits were set
	cmpw	%d3,%d0		|if X>Y
	bmis	greater		|then exp will go past 0 (neg) if
|				;it is just shifted
	bsr	nrm_set		|else exp won't go past 0
	moveml	(%a7)+,%d2/%d3/%d5/%d6
	rts
greater:
	movel	%d2,%d6		|save ls mant in d6
	lsll	%d0,%d2		|shift ls mant by count
	lsll	%d0,%d1		|shift ms mant by count
	movel	#32,%d5
	subl	%d0,%d5		|make op a denorm by shifting bits
	lsrl	%d5,%d6		|by the number in the exp, then
|				;set exp = 0.
	orl	%d6,%d1		|shift the ls mant bits into the ms mant
	movel	#0,%d0		|same as if decremented exp to 0
|				;while shifting
	movew	%d0,LOCAL_EX(%a0)
	movel	%d1,LOCAL_HI(%a0)
	movel	%d2,LOCAL_LO(%a0)
	moveml	(%a7)+,%d2/%d3/%d5/%d6
	rts
ms_clr:
	bfffo	%d2{#0:#32},%d3	|check if any bits set in ls mant
	beqs	all_clr		|branch if none set
	addw	#32,%d3
	cmpw	%d3,%d0		|if X>Y
	bmis	greater		|then branch
	bsr	nrm_set		|else exp won't go past 0
	moveml	(%a7)+,%d2/%d3/%d5/%d6
	rts
all_clr:
	movew	#0,LOCAL_EX(%a0)	|no mantissa bits set. Set exp = 0.
	moveml	(%a7)+,%d2/%d3/%d5/%d6
	rts
|
|	NRM_SET
|
	.global	nrm_set
nrm_set:
	movel	%d7,-(%a7)
	bfffo	LOCAL_HI(%a0){#0:#32},%d7	|find first 1 in ms mant to d7
	beqs	lower		|branch if ms mant is all 0's

	movel	%d6,-(%a7)

	subw	%d7,LOCAL_EX(%a0)	|sub exponent by count
	movel	LOCAL_HI(%a0),%d0	|d0 has ms mant
	movel	LOCAL_LO(%a0),%d1	|d1 has ls mant

	lsll	%d7,%d0		|shift first 1 to j bit position
	movel	%d1,%d6		|copy ls mant into d6
	lsll	%d7,%d6		|shift ls mant by count
	movel	%d6,LOCAL_LO(%a0)	|store ls mant into memory
	moveql	#32,%d6
	subl	%d7,%d6		|continue shift
	lsrl	%d6,%d1		|shift off all bits but those that will
|				;be shifted into ms mant
	orl	%d1,%d0		|shift the ls mant bits into the ms mant
	movel	%d0,LOCAL_HI(%a0)	|store ms mant into memory
	moveml	(%a7)+,%d7/%d6	|restore registers
	rts

|
| We get here if ms mant was = 0, and we assume ls mant has bits
| set (otherwise this would have been tagged a zero not a denorm).
|
lower:
	movew	LOCAL_EX(%a0),%d0	|d0 has exponent
	movel	LOCAL_LO(%a0),%d1	|d1 has ls mant
	subw	#32,%d0		|account for ms mant being all zeros
	bfffo	%d1{#0:#32},%d7	|find first 1 in ls mant to d7
	subw	%d7,%d0		|subtract shift count from exp
	lsll	%d7,%d1		|shift first 1 to integer bit in ms mant
	movew	%d0,LOCAL_EX(%a0)	|store exp
	movel	%d1,LOCAL_HI(%a0)	|store ms mant
	clrl	LOCAL_LO(%a0)	|clear ls mant
	movel	(%a7)+,%d7
	rts
|
|	denorm --- denormalize an intermediate result
|
|	Used by underflow.
|
| Input:
|	a0	 points to the operand to be denormalized
|		 (in the internal extended format)
|
|	d0:	 rounding precision
| Output:
|	a0	 points to the denormalized result
|		 (in the internal extended format)
|
|	d0	is guard,round,sticky
|
| d0 comes into this routine with the rounding precision. It
| is then loaded with the denormalized exponent threshold for the
| rounding precision.
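|
| A sketch of the control flow of denorm, in pseudocode.  It is added
| for illustration only; "thresh" and "diff" are hypothetical names,
| not symbols defined in this file:
|
|	if (precision is ext) {
|		thresh = 0
|		call dnrm_lp, set inex2/ainex if it returns d1.b != 0
|	} else {
|		thresh = sgl_thresh for sgl, dbl_thresh otherwise
|		diff   = thresh - exponent
|		if (diff >= 67) {
|			if any mantissa bit is set:
|				set sticky in d0 and inex2/ainex
|			exponent = thresh, mantissa = 0
|		} else {
|			call dnrm_lp, set inex2/ainex if d1.b != 0
|		}
|	}
|
| The constant 67 is the 64 mantissa bits plus the 3 g,r,s bits: a
| shift of that size or more leaves nothing but a possible sticky bit.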
|

	.global	denorm
denorm:
	btstb	#6,LOCAL_EX(%a0)	|check for exponents between $7fff-$4000
	beqs	no_sgn_ext
	bsetb	#7,LOCAL_EX(%a0)	|sign extend if it is so
no_sgn_ext:

	cmpib	#0,%d0		|if 0 then extended precision
	bnes	not_ext		|else branch

	clrl	%d1		|load d1 with ext threshold
	clrl	%d0		|clear the sticky flag
	bsr	dnrm_lp		|denormalize the number
	tstb	%d1		|check for inex
	beq	no_inex		|if clr, no inex
	bras	dnrm_inex	|if set, set inex

not_ext:
	cmpil	#1,%d0		|if 1 then single precision
	beqs	load_sgl	|else must be 2, double prec

load_dbl:
	movew	#dbl_thresh,%d1	|put copy of threshold in d1
	movel	%d1,%d0		|copy d1 into d0
	subw	LOCAL_EX(%a0),%d0	|diff = threshold - exp
	cmpw	#67,%d0		|if diff >= 67 (mant + grs bits)
	bpls	chk_stky	|then branch (all bits would be
|				; shifted off in denorm routine)
	clrl	%d0		|else clear the sticky flag
	bsr	dnrm_lp		|denormalize the number
	tstb	%d1		|check flag
	beqs	no_inex		|if clr, no inex
	bras	dnrm_inex	|if set, set inex

load_sgl:
	movew	#sgl_thresh,%d1	|put copy of threshold in d1
	movel	%d1,%d0		|copy d1 into d0
	subw	LOCAL_EX(%a0),%d0	|diff = threshold - exp
	cmpw	#67,%d0		|if diff >= 67 (mant + grs bits)
	bpls	chk_stky	|then branch (all bits would be
|				; shifted off in denorm routine)
	clrl	%d0		|else clear the sticky flag
	bsr	dnrm_lp		|denormalize the number
	tstb	%d1		|check flag
	beqs	no_inex		|if clr, no inex
	bras	dnrm_inex	|if set, set inex

chk_stky:
	tstl	LOCAL_HI(%a0)	|check for any bits set
	bnes	set_stky
	tstl	LOCAL_LO(%a0)	|check for any bits set
	bnes	set_stky
	bras	clr_mant
set_stky:
	orl	#inx2a_mask,USER_FPSR(%a6)	|set inex2/ainex
	movel	#0x20000000,%d0	|set sticky bit in return value
clr_mant:
	movew	%d1,LOCAL_EX(%a0)	|load exp with threshold
	movel	#0,LOCAL_HI(%a0)	|set d1 = 0 (ms mantissa)
	movel	#0,LOCAL_LO(%a0)	|set d2 = 0 (ls mantissa)
	rts
dnrm_inex:
	orl	#inx2a_mask,USER_FPSR(%a6)	|set inex2/ainex
no_inex:
	rts

|
|	dnrm_lp --- denormalize exponent/mantissa to specified threshold
|
| Input:
|	a0		points to the operand to be denormalized
|	d0{31:29}	initial guard,round,sticky
|	d1{15:0}	denormalization threshold
| Output:
|	a0		points to the denormalized operand
|	d0{31:29}	final guard,round,sticky
|	d1.b		inexact flag:  all ones means inexact result
|
| The LOCAL_LO and LOCAL_GRS parts of the value are copied to FP_SCR2
| so that bfext can be used to extract the new low part of the mantissa.
| Dnrm_lp can be called with a0 pointing to ETEMP or WBTEMP and there
| is no LOCAL_GRS scratch word following it on the fsave frame.
|
	.global	dnrm_lp
dnrm_lp:
	movel	%d2,-(%sp)		|save d2 for temp use
	btstb	#E3,E_BYTE(%a6)		|test for type E3 exception
	beqs	not_E3			|not type E3 exception
	bfextu	WBTEMP_GRS(%a6){#6:#3},%d2	|extract guard,round, sticky bit
	movel	#29,%d0
	lsll	%d0,%d2			|shift g,r,s to their positions
	movel	%d2,%d0
not_E3:
	movel	(%sp)+,%d2		|restore d2
	movel	LOCAL_LO(%a0),FP_SCR2+LOCAL_LO(%a6)
	movel	%d0,FP_SCR2+LOCAL_GRS(%a6)
	movel	%d1,%d0			|copy the denorm threshold
	subw	LOCAL_EX(%a0),%d1	|d1 = threshold - uns exponent
	bles	no_lp			|d1 <= 0
	cmpw	#32,%d1
	blts	case_1			|0 < d1 < 32
	cmpw	#64,%d1
	blts	case_2			|32 <= d1 < 64
	bra	case_3			|d1 >= 64
|
| No denormalization necessary
|
no_lp:
	clrb	%d1			|set no inex2 reported
	movel	FP_SCR2+LOCAL_GRS(%a6),%d0	|restore original g,r,s
	rts
|
| case (0<d1<32)
|
case_1:
	movel	%d2,-(%sp)
	movew	%d0,LOCAL_EX(%a0)	|exponent = denorm threshold
	movel	#32,%d0
	subw	%d1,%d0			|d0 = 32 - d1
	bfextu	LOCAL_EX(%a0){%d0:#32},%d2
	bfextu	%d2{%d1:%d0},%d2	|d2 = new LOCAL_HI
	bfextu	LOCAL_HI(%a0){%d0:#32},%d1	|d1 = new LOCAL_LO
	bfextu	FP_SCR2+LOCAL_LO(%a6){%d0:#32},%d0	|d0 = new G,R,S
	movel	%d2,LOCAL_HI(%a0)	|store new LOCAL_HI
	movel	%d1,LOCAL_LO(%a0)	|store new LOCAL_LO
	clrb	%d1
	bftst	%d0{#2:#30}
	beqs	c1nstky
	bsetl	#rnd_stky_bit,%d0
	st	%d1
c1nstky:
	movel	FP_SCR2+LOCAL_GRS(%a6),%d2	|restore original g,r,s
	andil	#0xe0000000,%d2		|clear all but G,R,S
	tstl	%d2			|test if original G,R,S are clear
	beqs	grs_clear
	orl	#0x20000000,%d0		|set sticky bit in d0
grs_clear:
	andil	#0xe0000000,%d0		|clear all but G,R,S
	movel	(%sp)+,%d2
	rts
|
| case (32<=d1<64)
|
case_2:
	movel	%d2,-(%sp)
	movew	%d0,LOCAL_EX(%a0)	|unsigned exponent = threshold
	subw	#32,%d1			|d1 now between 0 and 32
	movel	#32,%d0
	subw	%d1,%d0			|d0 = 32 - d1
	bfextu	LOCAL_EX(%a0){%d0:#32},%d2
	bfextu	%d2{%d1:%d0},%d2	|d2 = new LOCAL_LO
	bfextu	LOCAL_HI(%a0){%d0:#32},%d1	|d1 = new G,R,S
	bftst	%d1{#2:#30}
	bnes	c2_sstky		|bra if sticky bit to be set
	bftst	FP_SCR2+LOCAL_LO(%a6){%d0:#32}
	bnes	c2_sstky		|bra if sticky bit to be set
	movel	%d1,%d0
	clrb	%d1
	bras	end_c2
c2_sstky:
	movel	%d1,%d0
	bsetl	#rnd_stky_bit,%d0
	st	%d1
end_c2:
	clrl	LOCAL_HI(%a0)		|store LOCAL_HI = 0
	movel	%d2,LOCAL_LO(%a0)	|store LOCAL_LO
	movel	FP_SCR2+LOCAL_GRS(%a6),%d2	|restore original g,r,s
	andil	#0xe0000000,%d2		|clear all but G,R,S
	tstl	%d2			|test if original G,R,S are clear
	beqs	clear_grs
	orl	#0x20000000,%d0		|set sticky bit in d0
clear_grs:
	andil	#0xe0000000,%d0		|get rid of all but G,R,S
	movel	(%sp)+,%d2
	rts
|
| d1 >= 64 Force the exponent to be the denorm threshold with the
| correct sign.
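|
| The g,r,s result that case_3 produces for each shift count can be
| summarized as follows.  This table is an added reading aid derived
| from the code in case_3; "hi" is shorthand for LOCAL_HI, not a
| symbol in this file:
|
|	d1 = 64:  g = hi{31}   r = hi{30}   s = OR of remaining bits
|	d1 = 65:  g = 0        r = hi{31}   s = OR of remaining bits
|	d1 > 65:  g = 0        r = 0        s = 1
|
| "Remaining bits" means the rest of LOCAL_HI, all of LOCAL_LO and
| the original g,r,s byte saved at FP_SCR2+LOCAL_GRS.  In every case
| the mantissa itself is returned as zero.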
|
case_3:
	movew	%d0,LOCAL_EX(%a0)
	tstw	LOCAL_SGN(%a0)
	bges	c3con
c3neg:
	orl	#0x80000000,LOCAL_EX(%a0)
c3con:
	cmpw	#64,%d1
	beqs	sixty_four
	cmpw	#65,%d1
	beqs	sixty_five
|
| Shift value is out of range.  Set d1 for inex2 flag and
| return a zero with the given threshold.
|
	clrl	LOCAL_HI(%a0)
	clrl	LOCAL_LO(%a0)
	movel	#0x20000000,%d0
	st	%d1
	rts

sixty_four:
	movel	LOCAL_HI(%a0),%d0
	bfextu	%d0{#2:#30},%d1
	andil	#0xc0000000,%d0
	bras	c3com

sixty_five:
	movel	LOCAL_HI(%a0),%d0
	bfextu	%d0{#1:#31},%d1
	andil	#0x80000000,%d0
	lsrl	#1,%d0			|shift high bit into R bit

c3com:
	tstl	%d1
	bnes	c3ssticky
	tstl	LOCAL_LO(%a0)
	bnes	c3ssticky
	tstb	FP_SCR2+LOCAL_GRS(%a6)
	bnes	c3ssticky
	clrb	%d1
	bras	c3end

c3ssticky:
	bsetl	#rnd_stky_bit,%d0
	st	%d1
c3end:
	clrl	LOCAL_HI(%a0)
	clrl	LOCAL_LO(%a0)
	rts

	|end