|
|	stanh.sa 3.1 12/10/90
|
|	The entry point sTanh computes the hyperbolic tangent of
|	an input argument; sTanhd does the same except for denormalized
|	input.
|
|	Input: Double-extended number X in location pointed to
|		by address register a0.
|
|	Output: The value tanh(X) returned in floating-point register Fp0.
|
|	Accuracy and Monotonicity: The returned result is within 3 ulps in
|		64 significant bits, i.e. within 0.5001 ulp to 53 bits if the
|		result is subsequently rounded to double precision. The
|		result is provably monotonic in double precision.
|
|	Speed: The program stanh takes approximately 270 cycles.
|
|	Algorithm:
|
|	TANH
|	1. If |X| >= (5/2) log2 or |X| <= 2**(-40), go to 3.
|
|	2. (2**(-40) < |X| < (5/2) log2) Calculate tanh(X) by
|		sgn := sign(X), y := 2|X|, z := expm1(y), and
|		tanh(X) = sgn*( z/(2+z) ).
|		Exit.
|
|	3. (|X| <= 2**(-40) or |X| >= (5/2) log2). If |X| < 1,
|		go to 7.
|
|	4. (|X| >= (5/2) log2) If |X| >= 50 log2, go to 6.
|
|	5. ((5/2) log2 <= |X| < 50 log2) Calculate tanh(X) by
|		sgn := sign(X), y := 2|X|, z := exp(y),
|		tanh(X) = sgn - [ sgn*2/(1+z) ].
|		Exit.
|
|	6. (|X| >= 50 log2) Tanh(X) = +-1 (round to nearest). Thus, we
|		calculate Tanh(X) by
|		sgn := sign(X), Tiny := 2**(-126),
|		tanh(X) := sgn - sgn*Tiny.
|		Exit.
|
|	7. (|X| <= 2**(-40)). Tanh(X) = X.	Exit.
|

|		Copyright (C) Motorola, Inc. 1990
|			All Rights Reserved
|
|	For details on the license for this file, please see the
|	file, README, in this same directory.

|STANH	idnt	2,1 | Motorola 040 Floating Point Software Package

	|section	8

#include "fpsp.h"

	.set	X,FP_SCR5
	.set	XDCARE,X+2
	.set	XFRAC,X+4

	.set	SGN,L_SCR3

	.set	V,FP_SCR6

BOUNDS1:	.long 0x3FD78000,0x3FFFDDCE  | ... 2^(-40), (5/2)LOG2

	|xref	t_frcinx
	|xref	t_extdnrm
	|xref	setox
	|xref	setoxm1

	.global	stanhd
stanhd:
|--TANH(X) = X FOR DENORMALIZED X

	bra		t_extdnrm

	.global	stanh
stanh:
	fmovex		(%a0),%fp0	| ...LOAD INPUT

	fmovex		%fp0,X(%a6)
	movel		(%a0),%d0
	movew		4(%a0),%d0
	movel		%d0,X(%a6)
	andl		#0x7FFFFFFF,%d0
	cmp2l		BOUNDS1(%pc),%d0	| ...2**(-40) < |X| < (5/2)LOG2 ?
	bcss		TANHBORS

|--THIS IS THE USUAL CASE
|--Y = 2|X|, Z = EXPM1(Y), TANH(X) = SIGN(X) * Z / (Z+2).

	movel		X(%a6),%d0
	movel		%d0,SGN(%a6)
	andl		#0x7FFF0000,%d0
	addl		#0x00010000,%d0	| ...EXPONENT OF 2|X|
	movel		%d0,X(%a6)
	andl		#0x80000000,SGN(%a6)
	fmovex		X(%a6),%fp0		| ...FP0 IS Y = 2|X|

	movel		%d1,-(%a7)
	clrl		%d1
	fmovemx	%fp0-%fp0,(%a0)
	bsr		setoxm1		| ...FP0 IS Z = EXPM1(Y)
	movel		(%a7)+,%d1

	fmovex		%fp0,%fp1
	fadds		#0x40000000,%fp1	| ...Z+2
	movel		SGN(%a6),%d0
	fmovex		%fp1,V(%a6)
	eorl		%d0,V(%a6)

	fmovel		%d1,%FPCR		|restore user's exceptions
	fdivx		V(%a6),%fp0
	bra		t_frcinx

TANHBORS:
	cmpl		#0x3FFF8000,%d0
	blt		TANHSM

	cmpl		#0x40048AA1,%d0
	bgt		TANHHUGE

|-- (5/2) LOG2 < |X| < 50 LOG2,
|--TANH(X) = 1 - (2/[EXP(2X)+1]). LET Y = 2|X|, SGN = SIGN(X),
|--TANH(X) = SGN -	SGN*2/[EXP(Y)+1].

	movel		X(%a6),%d0
	movel		%d0,SGN(%a6)
	andl		#0x7FFF0000,%d0
	addl		#0x00010000,%d0	| ...EXPO OF 2|X|
	movel		%d0,X(%a6)		| ...Y = 2|X|
	andl		#0x80000000,SGN(%a6)
	movel		SGN(%a6),%d0
	fmovex		X(%a6),%fp0		| ...Y = 2|X|

	movel		%d1,-(%a7)
	clrl		%d1
	fmovemx	%fp0-%fp0,(%a0)
	bsr		setox		| ...FP0 IS EXP(Y)
	movel		(%a7)+,%d1
	movel		SGN(%a6),%d0
	fadds		#0x3F800000,%fp0	| ...EXP(Y)+1

	eorl		#0xC0000000,%d0	| ...-SIGN(X)*2
	fmoves		%d0,%fp1		| ...-SIGN(X)*2 IN SGL FMT
	fdivx		%fp0,%fp1		| ...-SIGN(X)*2 / [EXP(Y)+1]

	movel		SGN(%a6),%d0
	orl		#0x3F800000,%d0	| ...SGN
	fmoves		%d0,%fp0		| ...SGN IN SGL FMT

	fmovel		%d1,%FPCR		|restore user's exceptions
	faddx		%fp1,%fp0

	bra		t_frcinx

TANHSM:
	movew		#0x0000,XDCARE(%a6)

	fmovel		%d1,%FPCR		|restore user's exceptions
	fmovex		X(%a6),%fp0		|last inst - possible exception set

	bra		t_frcinx

TANHHUGE:
|---RETURN SGN(X) - SGN(X)*EPS
	movel		X(%a6),%d0
	andl		#0x80000000,%d0
	orl		#0x3F800000,%d0
	fmoves		%d0,%fp0
	andl		#0x80000000,%d0
	eorl		#0x80800000,%d0	| ...-SIGN(X)*EPS

	fmovel		%d1,%FPCR		|restore user's exceptions
	fadds		%d0,%fp0

	bra		t_frcinx

	|end
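
The branch structure in the TANH algorithm comment (steps 1 to 7) can be checked against a small reference model. What follows is a minimal sketch in Python, not part of the FPSP package. It assumes finite, non-NaN inputs and works in IEEE double precision rather than the 68040's extended precision. The names `tanh_model`, `SMALL`, `MID`, `HUGE` and `TINY` are hypothetical, chosen only for this illustration. The assembly compares packed exponent and high-mantissa bits against `BOUNDS1`, whereas the sketch compares the floating-point values directly, so behaviour exactly at the thresholds may differ slightly.

```python
import math

LOG2 = math.log(2.0)
SMALL = 2.0 ** -40      # at or below this magnitude, tanh(x) is returned as x
MID = 2.5 * LOG2        # (5/2) log2: switch from the expm1 form to the exp form
HUGE = 50.0 * LOG2      # 50 log2: beyond this, tanh(x) rounds to +-1
TINY = 2.0 ** -126      # subtracted so the original result is flagged inexact


def tanh_model(x: float) -> float:
    """Reference model of the branch structure; finite inputs only."""
    ax = abs(x)
    sgn = math.copysign(1.0, x)
    if ax <= SMALL:
        # Step 7: tanh(x) = x to working precision.
        return x
    if ax < MID:
        # Step 2: y = 2|x|, z = expm1(y), tanh(x) = sgn * z / (2 + z).
        z = math.expm1(2.0 * ax)
        return sgn * (z / (2.0 + z))
    if ax < HUGE:
        # Step 5: y = 2|x|, z = exp(y), tanh(x) = sgn - sgn * 2 / (1 + z).
        z = math.exp(2.0 * ax)
        return sgn - sgn * 2.0 / (1.0 + z)
    # Step 6: sgn - sgn * Tiny, which rounds to +-1 in double precision.
    return sgn - sgn * TINY
```

The expm1 form is used for moderate arguments because computing exp(y) - 1 directly loses accuracy when y is small; the exp form takes over once exp(y) is large enough that the subtraction from sgn is well conditioned.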