Searched hist:"4277 eedd7908a0ca8b66fad46ee76b0ad96e6ef2" (Results 1 – 1 of 1) sorted by relevance
/openbmc/linux/lib/ |
H A D | vsprintf.c | diff 4277eedd7908a0ca8b66fad46ee76b0ad96e6ef2 Mon Jul 16 01:41:56 CDT 2007 Denis Vlasenko <vda.linux@googlemail.com> vsprintf.c: optimizing, part 2: base 10 conversion speedup, v2
Optimize integer-to-string conversion in vsprintf.c for base 10. This is by far the most used conversion, and in some use cases it impacts performance. For example, top reads /proc/$PID/stat for every process, and with 4000 processes decimal conversion alone takes noticeable time.
Using code from
http://www.cs.uiowa.edu/~jones/bcd/decimal.html (with permission from the author, Douglas W. Jones)
binary-to-decimal-string conversion is done in groups of five digits at once, using only additions/subtractions/shifts (with -O2; -Os throws in some multiply instructions).
On i386 arch gcc 4.1.2 -O2 generates ~500 bytes of code.
This patch is run tested. Userspace benchmark/test is also attached. I tested it on PIII and AMD64 and new code is generally ~2.5 times faster. On AMD64:
# ./vsprintf_verify-O2 Original decimal conv: .......... 151 ns per iteration Patched decimal conv: .......... 62 ns per iteration Testing correctness 12895992590592 ok... [Ctrl-C] # ./vsprintf_verify-O2 Original decimal conv: .......... 151 ns per iteration Patched decimal conv: .......... 62 ns per iteration Testing correctness 26025406464 ok... [Ctrl-C]
More realistic test: top from busybox project was modified to report how many us it took to scan /proc (this does not account any processing done after that, like sorting process list), and then I test it with 4000 processes:
#!/bin/sh i=4000 while test $i != 0; do sleep 30 & let i-- done busybox top -b -n3 >/dev/null
on unpatched kernel:
top: 4120 processes took 102864 microseconds to scan top: 4120 processes took 91757 microseconds to scan top: 4120 processes took 92517 microseconds to scan top: 4120 processes took 92581 microseconds to scan
on patched kernel:
top: 4120 processes took 75460 microseconds to scan top: 4120 processes took 66451 microseconds to scan top: 4120 processes took 67267 microseconds to scan top: 4120 processes took 67618 microseconds to scan
The speedup comes from much faster generation of /proc/PID/stat by sprintf() calls inside the kernel.
Signed-off-by: Douglas W Jones <jones@cs.uiowa.edu> Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|