Searched hist:cf8fb553 (Results 1 – 1 of 1) sorted by relevance
/openbmc/linux/arch/powerpc/lib/ |
H A D | string_64.S | cf8fb553 Mon Jun 04 11:02:22 CDT 2012 Anton Blanchard <anton@samba.org> powerpc: Optimise the 64bit optimised __clear_user
I blame Mikey for this. He elevated my slightly dubious testcase:
to benchmark status. And naturally we need to be number 1 at creating zeros. So lets improve __clear_user some more.
As Paul suggests we can use dcbz for large lengths. This patch gets the destination cacheline aligned then uses dcbz on whole cachelines.
Before: 10485760000 bytes (10 GB) copied, 0.414744 s, 25.3 GB/s
After: 10485760000 bytes (10 GB) copied, 0.268597 s, 39.0 GB/s
39 GB/s, a new record.
Signed-off-by: Anton Blanchard <anton@samba.org> Tested-by: Olof Johansson <olof@lixom.net> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> cf8fb553 Mon Jun 04 11:02:22 CDT 2012 Anton Blanchard <anton@samba.org> powerpc: Optimise the 64bit optimised __clear_user I blame Mikey for this. He elevated my slightly dubious testcase: to benchmark status. And naturally we need to be number 1 at creating zeros. So lets improve __clear_user some more. As Paul suggests we can use dcbz for large lengths. This patch gets the destination cacheline aligned then uses dcbz on whole cachelines. Before: 10485760000 bytes (10 GB) copied, 0.414744 s, 25.3 GB/s After: 10485760000 bytes (10 GB) copied, 0.268597 s, 39.0 GB/s 39 GB/s, a new record. Signed-off-by: Anton Blanchard <anton@samba.org> Tested-by: Olof Johansson <olof@lixom.net> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|