vsprintf.c (725fe002d315c2501c110b7245d3eb4f4535f4d6) | vsprintf.c (133fd9f5cda2d86904126f4b9fa4e8f4330c9569) |
---|---|
1/* 2 * linux/lib/vsprintf.c 3 * 4 * Copyright (C) 1991, 1992 Linus Torvalds 5 */ 6 7/* vsprintf.c -- Lars Wirzenius & Linus Torvalds. */ 8/* --- 98 unchanged lines hidden (view full) --- 107 i = i*10 + *((*s)++) - '0'; 108 109 return i; 110} 111 112/* Decimal conversion is by far the most typical, and is used 113 * for /proc and /sys data. This directly impacts e.g. top performance 114 * with many processes running. We optimize it for speed | 1/* 2 * linux/lib/vsprintf.c 3 * 4 * Copyright (C) 1991, 1992 Linus Torvalds 5 */ 6 7/* vsprintf.c -- Lars Wirzenius & Linus Torvalds. */ 8/* --- 98 unchanged lines hidden (view full) --- 107 i = i*10 + *((*s)++) - '0'; 108 109 return i; 110} 111 112/* Decimal conversion is by far the most typical, and is used 113 * for /proc and /sys data. This directly impacts e.g. top performance 114 * with many processes running. We optimize it for speed |
115 * using code from 116 * http://www.cs.uiowa.edu/~jones/bcd/decimal.html 117 * (with permission from the author, Douglas W. Jones). */ | 115 * using ideas described at <http://www.cs.uiowa.edu/~jones/bcd/divide.html> 116 * (with permission from the author, Douglas W. Jones). 117 */ |
118 | 118 |
119/* Formats correctly any integer in [0,99999]. 120 * Outputs from one to five digits depending on input. 121 * On i386 gcc 4.1.2 -O2: ~250 bytes of code. */ | 119#if BITS_PER_LONG != 32 || BITS_PER_LONG_LONG != 64 120/* Formats correctly any integer in [0, 999999999] */ |
122static noinline_for_stack | 121static noinline_for_stack |
123char *put_dec_trunc(char *buf, unsigned q) | 122char *put_dec_full9(char *buf, unsigned q) |
124{ | 123{ |
125 unsigned d3, d2, d1, d0; 126 d1 = (q>>4) & 0xf; 127 d2 = (q>>8) & 0xf; 128 d3 = (q>>12); | 124 unsigned r; |
129 | 125 |
130 d0 = 6*(d3 + d2 + d1) + (q & 0xf); 131 q = (d0 * 0xcd) >> 11; 132 d0 = d0 - 10*q; 133 *buf++ = d0 + '0'; /* least significant digit */ 134 d1 = q + 9*d3 + 5*d2 + d1; 135 if (d1 != 0) { 136 q = (d1 * 0xcd) >> 11; 137 d1 = d1 - 10*q; 138 *buf++ = d1 + '0'; /* next digit */ | 126 /* 127 * Possible ways to approx. divide by 10 128 * (x * 0x1999999a) >> 32 x < 1073741829 (multiply must be 64-bit) 129 * (x * 0xcccd) >> 19 x < 81920 (x < 262149 when 64-bit mul) 130 * (x * 0x6667) >> 18 x < 43699 131 * (x * 0x3334) >> 17 x < 16389 132 * (x * 0x199a) >> 16 x < 16389 133 * (x * 0x0ccd) >> 15 x < 16389 134 * (x * 0x0667) >> 14 x < 2739 135 * (x * 0x0334) >> 13 x < 1029 136 * (x * 0x019a) >> 12 x < 1029 137 * (x * 0x00cd) >> 11 x < 1029 shorter code than * 0x67 (on i386) 138 * (x * 0x0067) >> 10 x < 179 139 * (x * 0x0034) >> 9 x < 69 same 140 * (x * 0x001a) >> 8 x < 69 same 141 * (x * 0x000d) >> 7 x < 69 same, shortest code (on i386) 142 * (x * 0x0007) >> 6 x < 19 143 * See <http://www.cs.uiowa.edu/~jones/bcd/divide.html> 144 */ 145 r = (q * (uint64_t)0x1999999a) >> 32; 146 *buf++ = (q - 10 * r) + '0'; /* 1 */ 147 q = (r * (uint64_t)0x1999999a) >> 32; 148 *buf++ = (r - 10 * q) + '0'; /* 2 */ 149 r = (q * (uint64_t)0x1999999a) >> 32; 150 *buf++ = (q - 10 * r) + '0'; /* 3 */ 151 q = (r * (uint64_t)0x1999999a) >> 32; 152 *buf++ = (r - 10 * q) + '0'; /* 4 */ 153 r = (q * (uint64_t)0x1999999a) >> 32; 154 *buf++ = (q - 10 * r) + '0'; /* 5 */ 155 /* Now value is under 10000, can avoid 64-bit multiply */ 156 q = (r * 0x199a) >> 16; 157 *buf++ = (r - 10 * q) + '0'; /* 6 */ 158 r = (q * 0xcd) >> 11; 159 *buf++ = (q - 10 * r) + '0'; /* 7 */ 160 q = (r * 0xcd) >> 11; 161 *buf++ = (r - 10 * q) + '0'; /* 8 */ 162 *buf++ = q + '0'; /* 9 */ 163 return buf; 164} 165#endif |
139 | 166 |
140 d2 = q + 2*d2; 141 if ((d2 != 0) || (d3 != 0)) { 142 q = (d2 * 0xd) >> 7; 143 d2 = d2 - 10*q; 144 *buf++ = d2 + '0'; /* next digit */ | 167/* Similar to above but do not pad with zeros. 168 * Code can be easily arranged to print 9 digits too, but our callers 169 * always call put_dec_full9() instead when the number has 9 decimal digits. 170 */ 171static noinline_for_stack 172char *put_dec_trunc8(char *buf, unsigned r) 173{ 174 unsigned q; |
145 | 175 |
146 d3 = q + 4*d3; 147 if (d3 != 0) { 148 q = (d3 * 0xcd) >> 11; 149 d3 = d3 - 10*q; 150 *buf++ = d3 + '0'; /* next digit */ 151 if (q != 0) 152 *buf++ = q + '0'; /* most sign. digit */ 153 } 154 } | 176 /* Copy of previous function's body with added early returns */ 177 q = (r * (uint64_t)0x1999999a) >> 32; 178 *buf++ = (r - 10 * q) + '0'; /* 2 */ 179 if (q == 0) 180 return buf; 181 r = (q * (uint64_t)0x1999999a) >> 32; 182 *buf++ = (q - 10 * r) + '0'; /* 3 */ 183 if (r == 0) 184 return buf; 185 q = (r * (uint64_t)0x1999999a) >> 32; 186 *buf++ = (r - 10 * q) + '0'; /* 4 */ 187 if (q == 0) 188 return buf; 189 r = (q * (uint64_t)0x1999999a) >> 32; 190 *buf++ = (q - 10 * r) + '0'; /* 5 */ 191 if (r == 0) 192 return buf; 193 q = (r * 0x199a) >> 16; 194 *buf++ = (r - 10 * q) + '0'; /* 6 */ 195 if (q == 0) 196 return buf; 197 r = (q * 0xcd) >> 11; 198 *buf++ = (q - 10 * r) + '0'; /* 7 */ 199 if (r == 0) 200 return buf; 201 q = (r * 0xcd) >> 11; 202 *buf++ = (r - 10 * q) + '0'; /* 8 */ 203 if (q == 0) 204 return buf; 205 *buf++ = q + '0'; /* 9 */ 206 return buf; 207} 208 209/* There are two algorithms to print larger numbers. 210 * One is generic: divide by 1000000000 and repeatedly print 211 * groups of (up to) 9 digits. It's conceptually simple, 212 * but requires a (unsigned long long) / 1000000000 division. 213 * 214 * Second algorithm splits 64-bit unsigned long long into 16-bit chunks, 215 * manipulates them cleverly and generates groups of 4 decimal digits. 216 * It so happens that it does NOT require long long division. 217 * 218 * If long is > 32 bits, division of 64-bit values is relatively easy, 219 * and we will use the first algorithm. 220 * If long long is > 64 bits (strange architecture with VERY large long long), 221 * second algorithm can't be used, and we again use the first one. 222 * 223 * Else (if long is 32 bits and long long is 64 bits) we use second one. 224 */ 225 226#if BITS_PER_LONG != 32 || BITS_PER_LONG_LONG != 64 227 228/* First algorithm: generic */ 229 230static 231char *put_dec(char *buf, unsigned long long n) 232{ 233 if (n >= 100*1000*1000) { 234 while (n >= 1000*1000*1000) 235 buf = put_dec_full9(buf, do_div(n, 1000*1000*1000)); 236 if (n >= 100*1000*1000) 237 return put_dec_full9(buf, n); |
155 } | 238 } |
239 return put_dec_trunc8(buf, n); 240} |
|
156 | 241 |
242#else 243 244/* Second algorithm: valid only for 64-bit long longs */ 245 246static noinline_for_stack 247char *put_dec_full4(char *buf, unsigned q) 248{ 249 unsigned r; 250 r = (q * 0xcccd) >> 19; 251 *buf++ = (q - 10 * r) + '0'; 252 q = (r * 0x199a) >> 16; 253 *buf++ = (r - 10 * q) + '0'; 254 r = (q * 0xcd) >> 11; 255 *buf++ = (q - 10 * r) + '0'; 256 *buf++ = r + '0'; |
|
157 return buf; 158} | 257 return buf; 258} |
159/* Same with if's removed. Always emits five digits */ 160static noinline_for_stack 161char *put_dec_full(char *buf, unsigned q) | 259 260/* Based on code by Douglas W. Jones found at 261 * <http://www.cs.uiowa.edu/~jones/bcd/decimal.html#sixtyfour> 262 * (with permission from the author). 263 * Performs no 64-bit division and hence should be fast on 32-bit machines. 264 */ 265static 266char *put_dec(char *buf, unsigned long long n) |
162{ | 267{ |
163 /* BTW, if q is in [0,9999], 8-bit ints will be enough, */ 164 /* but anyway, gcc produces better code with full-sized ints */ 165 unsigned d3, d2, d1, d0; 166 d1 = (q>>4) & 0xf; 167 d2 = (q>>8) & 0xf; 168 d3 = (q>>12); | 268 uint32_t d3, d2, d1, q, h; |
169 | 269 |
170 /* 171 * Possible ways to approx. divide by 10 172 * gcc -O2 replaces multiply with shifts and adds 173 * (x * 0xcd) >> 11: 11001101 - shorter code than * 0x67 (on i386) 174 * (x * 0x67) >> 10: 1100111 175 * (x * 0x34) >> 9: 110100 - same 176 * (x * 0x1a) >> 8: 11010 - same 177 * (x * 0x0d) >> 7: 1101 - same, shortest code (on i386) 178 */ 179 d0 = 6*(d3 + d2 + d1) + (q & 0xf); 180 q = (d0 * 0xcd) >> 11; 181 d0 = d0 - 10*q; 182 *buf++ = d0 + '0'; 183 d1 = q + 9*d3 + 5*d2 + d1; 184 q = (d1 * 0xcd) >> 11; 185 d1 = d1 - 10*q; 186 *buf++ = d1 + '0'; | 270 if (n < 100*1000*1000) 271 return put_dec_trunc8(buf, n); |
187 | 272 |
188 d2 = q + 2*d2; 189 q = (d2 * 0xd) >> 7; 190 d2 = d2 - 10*q; 191 *buf++ = d2 + '0'; | 273 d1 = ((uint32_t)n >> 16); /* implicit "& 0xffff" */ 274 h = (n >> 32); 275 d2 = (h ) & 0xffff; 276 d3 = (h >> 16); /* implicit "& 0xffff" */ |
192 | 277 |
193 d3 = q + 4*d3; 194 q = (d3 * 0xcd) >> 11; /* - shorter code */ 195 /* q = (d3 * 0x67) >> 10; - would also work */ 196 d3 = d3 - 10*q; 197 *buf++ = d3 + '0'; 198 *buf++ = q + '0'; | 278 q = 656 * d3 + 7296 * d2 + 5536 * d1 + ((uint32_t)n & 0xffff); |
199 | 279 |
280 buf = put_dec_full4(buf, q % 10000); 281 q = q / 10000; 282 283 d1 = q + 7671 * d3 + 9496 * d2 + 6 * d1; 284 buf = put_dec_full4(buf, d1 % 10000); 285 q = d1 / 10000; 286 287 d2 = q + 4749 * d3 + 42 * d2; 288 buf = put_dec_full4(buf, d2 % 10000); 289 q = d2 / 10000; 290 291 d3 = q + 281 * d3; 292 if (!d3) 293 goto done; 294 buf = put_dec_full4(buf, d3 % 10000); 295 q = d3 / 10000; 296 if (!q) 297 goto done; 298 buf = put_dec_full4(buf, q); 299 done: 300 while (buf[-1] == '0') 301 --buf; 302 |
|
200 return buf; 201} | 303 return buf; 304} |
202/* No inlining helps gcc to use registers better */ 203static noinline_for_stack 204char *put_dec(char *buf, unsigned long long num) 205{ 206 while (1) { 207 unsigned rem; 208 if (num < 100000) 209 return put_dec_trunc(buf, num); 210 rem = do_div(num, 100000); 211 buf = put_dec_full(buf, rem); 212 } 213} | |
214 | 305 |
306#endif 307 |
|
215/* 216 * Convert passed number to decimal string. 217 * Returns the length of string. On buffer overflow, returns 0. 218 * 219 * If speed is not important, use snprintf(). It's easy to read the code. 220 */ 221int num_to_str(char *buf, int size, unsigned long long num) 222{ | 308/* 309 * Convert passed number to decimal string. 310 * Returns the length of string. On buffer overflow, returns 0. 311 * 312 * If speed is not important, use snprintf(). It's easy to read the code. 313 */ 314int num_to_str(char *buf, int size, unsigned long long num) 315{ |
223 char tmp[21]; /* Enough for 2^64 in decimal */ | 316 char tmp[sizeof(num) * 3]; |
224 int idx, len; 225 | 317 int idx, len; 318 |
226 len = put_dec(tmp, num) - tmp; | 319 /* put_dec() may work incorrectly for num = 0 (generate "", not "0") */ 320 if (num <= 9) { 321 tmp[0] = '0' + num; 322 len = 1; 323 } else { 324 len = put_dec(tmp, num) - tmp; 325 } |
227 228 if (len > size) 229 return 0; 230 for (idx = 0; idx < len; ++idx) 231 buf[idx] = tmp[len - idx - 1]; | 326 327 if (len > size) 328 return 0; 329 for (idx = 0; idx < len; ++idx) 330 buf[idx] = tmp[len - idx - 1]; |
232 return len; | 331 return len; |
233} 234 235#define ZEROPAD 1 /* pad with zero */ 236#define SIGN 2 /* unsigned/signed long */ 237#define PLUS 4 /* show plus */ 238#define SPACE 8 /* space if plus */ 239#define LEFT 16 /* left justified */ 240#define SMALL 32 /* use lowercase in hex (must be 32 == 0x20) */ --- 68 unchanged lines hidden (view full) --- 309 if (spec.base == 16) 310 spec.field_width -= 2; 311 else if (!is_zero) 312 spec.field_width--; 313 } 314 315 /* generate full string in tmp[], in reverse order */ 316 i = 0; | 332} 333 334#define ZEROPAD 1 /* pad with zero */ 335#define SIGN 2 /* unsigned/signed long */ 336#define PLUS 4 /* show plus */ 337#define SPACE 8 /* space if plus */ 338#define LEFT 16 /* left justified */ 339#define SMALL 32 /* use lowercase in hex (must be 32 == 0x20) */ --- 68 unchanged lines hidden (view full) --- 408 if (spec.base == 16) 409 spec.field_width -= 2; 410 else if (!is_zero) 411 spec.field_width--; 412 } 413 414 /* generate full string in tmp[], in reverse order */ 415 i = 0; |
317 if (num == 0) 318 tmp[i++] = '0'; | 416 if (num < spec.base) 417 tmp[i++] = digits[num] | locase; |
319 /* Generic code, for any base: 320 else do { 321 tmp[i++] = (digits[do_div(num,base)] | locase); 322 } while (num != 0); 323 */ 324 else if (spec.base != 10) { /* 8 or 16 */ 325 int mask = spec.base - 1; 326 int shift = 3; --- 279 unchanged lines hidden (view full) --- 606 case 'b': 607 default: 608 index = 0; 609 step = 1; 610 break; 611 } 612 for (i = 0; i < 4; i++) { 613 char temp[3]; /* hold each IP quad in reverse order */ | 418 /* Generic code, for any base: 419 else do { 420 tmp[i++] = (digits[do_div(num,base)] | locase); 421 } while (num != 0); 422 */ 423 else if (spec.base != 10) { /* 8 or 16 */ 424 int mask = spec.base - 1; 425 int shift = 3; --- 279 unchanged lines hidden (view full) --- 705 case 'b': 706 default: 707 index = 0; 708 step = 1; 709 break; 710 } 711 for (i = 0; i < 4; i++) { 712 char temp[3]; /* hold each IP quad in reverse order */ |
614 int digits = put_dec_trunc(temp, addr[index]) - temp; | 713 int digits = put_dec_trunc8(temp, addr[index]) - temp; |
615 if (leading_zeros) { 616 if (digits < 3) 617 *p++ = '0'; 618 if (digits < 2) 619 *p++ = '0'; 620 } 621 /* reverse the digits in the quad */ 622 while (digits--) --- 1450 unchanged lines hidden --- | 714 if (leading_zeros) { 715 if (digits < 3) 716 *p++ = '0'; 717 if (digits < 2) 718 *p++ = '0'; 719 } 720 /* reverse the digits in the quad */ 721 while (digits--) --- 1450 unchanged lines hidden --- |