Searched hist:d26b3a7c (Results 1 – 3 of 3) sorted by relevance
/openbmc/linux/include/net/ |
H A D | ip_fib.h | d26b3a7c Tue Jul 31 00:45:30 CDT 2012 Eric Dumazet <edumazet@google.com> ipv4: percpu nh_rth_output cache
Input path is mostly run under RCU and doesnt touch dst refcnt
But output path on forwarding or UDP workloads hits badly dst refcount, and we have lot of false sharing, for example in ipv4_mtu() when reading rt->rt_pmtu
Using a percpu cache for nh_rth_output gives a nice performance increase at a small cost.
24 udpflood test on my 24 cpu machine (dummy0 output device) (each process sends 1.000.000 udp frames, 24 processes are started)
before : 5.24 s after : 2.06 s For reference, time on linux-3.5 : 6.60 s
Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> d26b3a7c Tue Jul 31 00:45:30 CDT 2012 Eric Dumazet <edumazet@google.com> ipv4: percpu nh_rth_output cache Input path is mostly run under RCU and doesnt touch dst refcnt But output path on forwarding or UDP workloads hits badly dst refcount, and we have lot of false sharing, for example in ipv4_mtu() when reading rt->rt_pmtu Using a percpu cache for nh_rth_output gives a nice performance increase at a small cost. 24 udpflood test on my 24 cpu machine (dummy0 output device) (each process sends 1.000.000 udp frames, 24 processes are started) before : 5.24 s after : 2.06 s For reference, time on linux-3.5 : 6.60 s Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
/openbmc/linux/net/ipv4/ |
H A D | fib_semantics.c | d26b3a7c Tue Jul 31 00:45:30 CDT 2012 Eric Dumazet <edumazet@google.com> ipv4: percpu nh_rth_output cache
Input path is mostly run under RCU and doesnt touch dst refcnt
But output path on forwarding or UDP workloads hits badly dst refcount, and we have lot of false sharing, for example in ipv4_mtu() when reading rt->rt_pmtu
Using a percpu cache for nh_rth_output gives a nice performance increase at a small cost.
24 udpflood test on my 24 cpu machine (dummy0 output device) (each process sends 1.000.000 udp frames, 24 processes are started)
before : 5.24 s after : 2.06 s For reference, time on linux-3.5 : 6.60 s
Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> d26b3a7c Tue Jul 31 00:45:30 CDT 2012 Eric Dumazet <edumazet@google.com> ipv4: percpu nh_rth_output cache Input path is mostly run under RCU and doesnt touch dst refcnt But output path on forwarding or UDP workloads hits badly dst refcount, and we have lot of false sharing, for example in ipv4_mtu() when reading rt->rt_pmtu Using a percpu cache for nh_rth_output gives a nice performance increase at a small cost. 24 udpflood test on my 24 cpu machine (dummy0 output device) (each process sends 1.000.000 udp frames, 24 processes are started) before : 5.24 s after : 2.06 s For reference, time on linux-3.5 : 6.60 s Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
H A D | route.c | d26b3a7c Tue Jul 31 00:45:30 CDT 2012 Eric Dumazet <edumazet@google.com> ipv4: percpu nh_rth_output cache
Input path is mostly run under RCU and doesnt touch dst refcnt
But output path on forwarding or UDP workloads hits badly dst refcount, and we have lot of false sharing, for example in ipv4_mtu() when reading rt->rt_pmtu
Using a percpu cache for nh_rth_output gives a nice performance increase at a small cost.
24 udpflood test on my 24 cpu machine (dummy0 output device) (each process sends 1.000.000 udp frames, 24 processes are started)
before : 5.24 s after : 2.06 s For reference, time on linux-3.5 : 6.60 s
Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> d26b3a7c Tue Jul 31 00:45:30 CDT 2012 Eric Dumazet <edumazet@google.com> ipv4: percpu nh_rth_output cache Input path is mostly run under RCU and doesnt touch dst refcnt But output path on forwarding or UDP workloads hits badly dst refcount, and we have lot of false sharing, for example in ipv4_mtu() when reading rt->rt_pmtu Using a percpu cache for nh_rth_output gives a nice performance increase at a small cost. 24 udpflood test on my 24 cpu machine (dummy0 output device) (each process sends 1.000.000 udp frames, 24 processes are started) before : 5.24 s after : 2.06 s For reference, time on linux-3.5 : 6.60 s Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|