/openbmc/linux/include/net/ |
dst.h | diff c7d4426a98a5f6654cd0b4b33d9dab2e77192c18 Mon Oct 04 00:17:54 CDT 2010 Eric Dumazet <eric.dumazet@gmail.com> net: introduce DST_NOCACHE flag
While doing stress tests with the IP route cache disabled and multiqueue devices, I noticed very high contention on one rwlock used in the neighbour code.
When many CPUs are trying to send frames (possibly using a high-performance multiqueue device) to the same neighbour, they fight for the neigh->lock rwlock in order to call neigh_hh_init(), and fight on hh->hh_refcnt (a pair of atomic_inc/atomic_dec_and_test()).
But we don't need to call neigh_hh_init() for a dst that is used only once. It costs at least four atomic operations, on two contended cache lines, plus the high contention on the neigh->lock rwlock.
Introduce a new dst flag, DST_NOCACHE, that is set when dst was not inserted in route cache.
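The idea can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code from the patch: struct model_dst, model_route_setup(), model_hh_init() and model_output() are hypothetical stand-ins, and the flag value is illustrative (the real definition lives in include/net/dst.h).

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative value only; see include/net/dst.h for the real definition. */
#define DST_NOCACHE 0x0010

/* Hypothetical, heavily simplified stand-in for struct dst_entry. */
struct model_dst {
    unsigned int flags;
    bool hh_ready;              /* models "a hardware header cache is attached" */
};

/* Counts how often the expensive path runs in this model. */
static int hh_init_calls;

/* Stand-in for neigh_hh_init(): in the kernel this is the contended step. */
static void model_hh_init(struct model_dst *dst)
{
    hh_init_calls++;
    dst->hh_ready = true;
}

/* Route setup: mark a dst that was NOT inserted in the route cache. */
static void model_route_setup(struct model_dst *dst, bool inserted_in_cache)
{
    dst->flags = inserted_in_cache ? 0 : DST_NOCACHE;
    dst->hh_ready = false;
}

/* Output path: skip header-cache setup for a dst that is used only once. */
static void model_output(struct model_dst *dst)
{
    if (!dst->hh_ready && !(dst->flags & DST_NOCACHE))
        model_hh_init(dst);
    /* ... the frame would be transmitted here ... */
}
```

In this model, a dst set up with inserted_in_cache == false never reaches model_hh_init(), while a cached dst pays the cost once and reuses the result on later sends.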
With the stress test bench, sending 160000000 frames to one neighbour, the results are:
Before patch:
real 2m28.406s user 0m11.781s sys 36m17.964s
After patch:
real 1m26.532s user 0m12.185s sys 20m3.903s
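To put the timings above in relative terms, the following sketch converts the time(1)-style values to seconds and computes the percentage reduction. The helper names to_seconds() and reduction_percent() are hypothetical; only the figures come from the commit message.

```c
#include <assert.h>

/* Convert a "XmY.ZZZs" timing into seconds. */
static double to_seconds(int minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

/* Percentage by which `after` is smaller than `before`. */
static double reduction_percent(double before, double after)
{
    return (before - after) / before * 100.0;
}

/* Wall-clock time: 2m28.406s before, 1m26.532s after. */
static double real_reduction(void)
{
    return reduction_percent(to_seconds(2, 28.406), to_seconds(1, 26.532));
}

/* System time summed over all CPUs: 36m17.964s before, 20m3.903s after. */
static double sys_reduction(void)
{
    return reduction_percent(to_seconds(36, 17.964), to_seconds(20, 3.903));
}
```

With the figures quoted above, wall-clock time drops by roughly 42 percent and system time by roughly 45 percent.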
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
/openbmc/linux/net/core/ |
neighbour.c | diff c7d4426a98a5f6654cd0b4b33d9dab2e77192c18 Mon Oct 04 00:17:54 CDT 2010 Eric Dumazet <eric.dumazet@gmail.com> net: introduce DST_NOCACHE flag
|
/openbmc/linux/net/ipv4/ |
route.c | diff c7d4426a98a5f6654cd0b4b33d9dab2e77192c18 Mon Oct 04 00:17:54 CDT 2010 Eric Dumazet <eric.dumazet@gmail.com> net: introduce DST_NOCACHE flag
|