#
35d889d1 |
| 05-Mar-2018 |
Alexey Kodanev <alexey.kodanev@oracle.com> |
sch_netem: fix skb leak in netem_enqueue()
When we exceed current packets limit and we have more than one segment in the list returned by skb_gso_segment(), netem drops only the first one, skipping the rest, hence kmemleak reports:
unreferenced object 0xffff880b5d23b600 (size 1024):
  comm "softirq", pid 0, jiffies 4384527763 (age 2770.629s)
  hex dump (first 32 bytes):
    00 80 23 5d 0b 88 ff ff 00 00 00 00 00 00 00 00  ..#]............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000d8a19b9d>] __alloc_skb+0xc9/0x520
    [<000000001709b32f>] skb_segment+0x8c8/0x3710
    [<00000000c7b9bb88>] tcp_gso_segment+0x331/0x1830
    [<00000000c921cba1>] inet_gso_segment+0x476/0x1370
    [<000000008b762dd4>] skb_mac_gso_segment+0x1f9/0x510
    [<000000002182660a>] __skb_gso_segment+0x1dd/0x620
    [<00000000412651b9>] netem_enqueue+0x1536/0x2590 [sch_netem]
    [<0000000005d3b2a9>] __dev_queue_xmit+0x1167/0x2120
    [<00000000fc5f7327>] ip_finish_output2+0x998/0xf00
    [<00000000d309e9d3>] ip_output+0x1aa/0x2c0
    [<000000007ecbd3a4>] tcp_transmit_skb+0x18db/0x3670
    [<0000000042d2a45f>] tcp_write_xmit+0x4d4/0x58c0
    [<0000000056a44199>] tcp_tasklet_func+0x3d9/0x540
    [<0000000013d06d02>] tasklet_action+0x1ca/0x250
    [<00000000fcde0b8b>] __do_softirq+0x1b4/0x5a3
    [<00000000e7ed027c>] irq_exit+0x1e2/0x210
Fix it by adding the rest of the segments, if any, to skb 'to_free' list. Add new __qdisc_drop_all() and qdisc_drop_all() functions because they can be useful in the future if we need to drop segmented GSO packets in other places.
Fixes: 6071bd1aa13e ("netem: Segment GSO packets on enqueue")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
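The leak described in this commit can be modelled in user space. The following is a minimal sketch, not the kernel code: `struct pkt`, `drop_first_only()`, `drop_all()` and `list_len()` are hypothetical names, and `struct pkt` is only a stand-in for a chain of GSO segments linked through `next`. It contrasts moving only the head segment onto the `to_free` list with splicing the whole chain onto it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet buffer; NOT the kernel's struct sk_buff. */
struct pkt {
    struct pkt *next;
    int id;
};

/* Leaky pattern: only the head segment reaches the free list. Overwriting
 * seg->next drops the only pointer to the remaining segments. */
static size_t drop_first_only(struct pkt *seg, struct pkt **to_free)
{
    assert(seg != NULL && to_free != NULL);
    seg->next = *to_free;
    *to_free = seg;
    return 1;
}

/* Fixed pattern: find the tail of the segment chain and splice the whole
 * chain in front of the existing free list. Returns the segment count. */
static size_t drop_all(struct pkt *seg, struct pkt **to_free)
{
    struct pkt *tail = seg;
    size_t n = 1;

    assert(seg != NULL && to_free != NULL);
    while (tail->next != NULL) {
        tail = tail->next;
        n++;
    }
    tail->next = *to_free;
    *to_free = seg;
    return n;
}

/* Count the entries on a list; used to check what ended up on to_free. */
static size_t list_len(const struct pkt *p)
{
    size_t n = 0;

    for (; p != NULL; p = p->next)
        n++;
    return n;
}
```

With a three-segment chain and one entry already on the free list, `drop_all()` leaves four entries on `to_free`, while `drop_first_only()` on a two-segment chain leaves one entry and orphans the other.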
|
#
3a053b1a |
| 28-Feb-2018 |
Gal Pressman <galp@mellanox.com> |
net: Fix spelling mistake "greater then" -> "greater than"
Fix trivial spelling mistake "greater then" -> "greater than".
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
Revision tags: v4.15 |
|
#
48bfd55e |
| 25-Jan-2018 |
Cong Wang <xiyou.wangcong@gmail.com> |
net_sched: plug in qdisc ops change_tx_queue_len
Introduce a new qdisc ops ->change_tx_queue_len() so that each qdisc can decide how to implement this if it wants. Previously we simply read dev->tx_queue_len; after pfifo_fast switches to skb array, we need this API to resize the skb array when we change dev->tx_queue_len.
To avoid handling race conditions with TX BH, we need to deactivate all TX queues before changing the value and bring them back after we are done; this also makes the implementation easier.
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
715df5ec |
| 24-Jan-2018 |
Jakub Kicinski <jakub.kicinski@netronome.com> |
net: sched: propagate extack to cls->destroy callbacks
Propagate extack to cls->destroy callbacks when called from non-error paths. On error paths pass NULL to avoid overwriting the failure message.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
571acf21 |
| 18-Jan-2018 |
Alexander Aring <aring@mojatatu.com> |
net: sched: cls: add extack support for delete callback
This patch adds extack support for classifier delete callback api. This prepares to handle extack support inside each specific classifier implementation.
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7306db38 |
| 18-Jan-2018 |
Alexander Aring <aring@mojatatu.com> |
net: sched: cls: add extack support for change callback
This patch adds extack support for classifier change callback api. This prepares to handle extack support inside each specific classifier implementation.
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8865fdd4 |
| 18-Jan-2018 |
Alexander Aring <aring@mojatatu.com> |
net: sched: cls: fix code style issues
This patch changes some code style issues pointed out by checkpatch inside the TC cls subsystem.
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d47a6b0e |
| 17-Jan-2018 |
Jiri Pirko <jiri@mellanox.com> |
net: sched: introduce ingress/egress block index attributes for qdisc
Introduce two new attributes to be used for qdisc creation and dumping. One for ingress block, one for egress block. Introduce a set of ops that qdisc which supports block sharing would implement.
Passing block indexes in qdisc change is not supported yet; it is checked and forbidden.
In future, these attributes are to be reused for specifying block indexes for classes as well. As of this moment however, it is not supported so a check is in place to forbid it.
Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
caa72601 |
| 17-Jan-2018 |
Jiri Pirko <jiri@mellanox.com> |
net: sched: keep track of offloaded filters and check tc offload feature
During block bind, we need to check the tc offload feature. If it is disabled yet the block still contains offloaded filters, forbid the bind. Also forbid registering a callback for a block that already contains offloaded filters, as playback is not supported now. To keep track of offloaded filters, a new counter is introduced, along with a couple of helpers called from cls_* code. These helpers set and clear the TCA_CLS_FLAGS_IN_HW flag.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
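The counter-and-flag bookkeeping described in this commit can be sketched in user space as follows. This is an illustration under assumptions, not the kernel implementation: `struct sketch_block`, `SKETCH_FLAG_IN_HW` (a stand-in for TCA_CLS_FLAGS_IN_HW), the helper names, and the choice of `-EOPNOTSUPP` as the error code are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for TCA_CLS_FLAGS_IN_HW. */
#define SKETCH_FLAG_IN_HW 0x1u

/* Hypothetical block: only the offloaded-filter counter is modelled. */
struct sketch_block {
    unsigned int offloadcnt;
};

/* Mark a filter as offloaded; count it only on the 0 -> 1 flag transition. */
static void sketch_offload_inc(struct sketch_block *blk, unsigned int *flags)
{
    if (*flags & SKETCH_FLAG_IN_HW)
        return;
    *flags |= SKETCH_FLAG_IN_HW;
    blk->offloadcnt++;
}

/* Clear the offloaded mark; uncount only if the flag was actually set. */
static void sketch_offload_dec(struct sketch_block *blk, unsigned int *flags)
{
    if (!(*flags & SKETCH_FLAG_IN_HW))
        return;
    *flags &= ~SKETCH_FLAG_IN_HW;
    assert(blk->offloadcnt > 0);
    blk->offloadcnt--;
}

/* Bind check: refuse when the device has offload disabled but the block
 * still holds offloaded filters. The error code is an assumption. */
static int sketch_block_bind(const struct sketch_block *blk, bool offload_enabled)
{
    if (!offload_enabled && blk->offloadcnt > 0)
        return -EOPNOTSUPP;
    return 0;
}
```

Tying the counter to the flag transition keeps repeated set or clear calls on the same filter from skewing the count.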
|
#
edf6711c |
| 17-Jan-2018 |
Jiri Pirko <jiri@mellanox.com> |
net: sched: remove classid and q fields from tcf_proto
Both are no longer used, so remove them.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f36fe1c4 |
| 17-Jan-2018 |
Jiri Pirko <jiri@mellanox.com> |
net: sched: introduce block mechanism to handle netif_keep_dst calls
A couple of classifiers call netif_keep_dst directly on q->dev. That is not possible for shared blocks, where multiple qdiscs own the block. So introduce an infrastructure to keep track of the block owners in a list and use this list to implement a block variant of netif_keep_dst.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
48617387 |
| 17-Jan-2018 |
Jiri Pirko <jiri@mellanox.com> |
net: sched: introduce shared filter blocks infrastructure
Allow qdiscs to share filter blocks among them. Each qdisc type has to use block get/put extended modifications that enable sharing. Shared blocks are tracked within each net namespace and identified by u32 index. This index is passed from user during the qdisc creation. If user passes index that is not used by any other qdisc, new block is created. If user passes index that is already used, the existing block will be re-used.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
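The create-or-reuse rule for shared blocks (an unused index creates a new block, an index already in use returns the existing one) can be sketched in user space like this. It is an illustrative model only: `struct sketch_netns`, `struct sketch_block`, `sketch_block_get()` and `sketch_block_put()` are hypothetical names, and a plain linked list with a reference count stands in for whatever lookup structure the kernel uses per net namespace.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical shared block, identified by a u32 index. */
struct sketch_block {
    uint32_t index;
    unsigned int refcnt;
    struct sketch_block *next;
};

/* Hypothetical per-namespace registry of shared blocks. */
struct sketch_netns {
    struct sketch_block *blocks;
};

/* Return the block for 'index', creating it if no qdisc uses it yet.
 * Every successful call takes one reference. Returns NULL on allocation
 * failure. */
static struct sketch_block *sketch_block_get(struct sketch_netns *ns, uint32_t index)
{
    struct sketch_block *blk;

    assert(ns != NULL);
    for (blk = ns->blocks; blk != NULL; blk = blk->next) {
        if (blk->index == index) {
            blk->refcnt++;          /* index already used: reuse the block */
            return blk;
        }
    }

    blk = calloc(1, sizeof(*blk));  /* index unused: create a new block */
    if (blk == NULL)
        return NULL;
    blk->index = index;
    blk->refcnt = 1;
    blk->next = ns->blocks;
    ns->blocks = blk;
    return blk;
}

/* Drop one reference; unlink and free the block when the last owner leaves. */
static void sketch_block_put(struct sketch_netns *ns, struct sketch_block *blk)
{
    struct sketch_block **pp;

    assert(ns != NULL && blk != NULL && blk->refcnt > 0);
    if (--blk->refcnt > 0)
        return;
    for (pp = &ns->blocks; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == blk) {
            *pp = blk->next;
            break;
        }
    }
    free(blk);
}
```

Two qdiscs that pass the same index end up holding the same block pointer, and the block disappears only after both have released it.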
|
#
a9b19443 |
| 17-Jan-2018 |
Jiri Pirko <jiri@mellanox.com> |
net: sched: introduce support for multiple filter chain pointers registration
So far, it was only possible to register a single filter chain pointer to block->chain[0]. However, once blocks become shareable, we need to allow registration of multiple filter chain pointers.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
81d947e2 |
| 15-Jan-2018 |
Daniel Borkmann <daniel@iogearbox.net> |
net, sched: fix panic when updating miniq {b,q}stats
While working on fixing another bug, I ran into the following panic on arm64 by simply attaching clsact qdisc, adding a filter and running traffic on ingress to it:
[...]
[ 178.188591] Unable to handle kernel read from unreadable memory at virtual address 810fb501f000
[ 178.197314] Mem abort info:
[ 178.200121] ESR = 0x96000004
[ 178.203168] Exception class = DABT (current EL), IL = 32 bits
[ 178.209095] SET = 0, FnV = 0
[ 178.212157] EA = 0, S1PTW = 0
[ 178.215288] Data abort info:
[ 178.218175] ISV = 0, ISS = 0x00000004
[ 178.222019] CM = 0, WnR = 0
[ 178.224997] user pgtable: 4k pages, 48-bit VAs, pgd = 0000000023cb3f33
[ 178.231531] [0000810fb501f000] *pgd=0000000000000000
[ 178.236508] Internal error: Oops: 96000004 [#1] SMP
[...]
[ 178.311855] CPU: 73 PID: 2497 Comm: ping Tainted: G W 4.15.0-rc7+ #5
[ 178.319413] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017
[ 178.326887] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 178.331685] pc : __netif_receive_skb_core+0x49c/0xac8
[ 178.336728] lr : __netif_receive_skb+0x28/0x78
[ 178.341161] sp : ffff00002344b750
[ 178.344465] x29: ffff00002344b750 x28: ffff810fbdfd0580
[ 178.349769] x27: 0000000000000000 x26: ffff000009378000
[...]
[ 178.418715] x1 : 0000000000000054 x0 : 0000000000000000
[ 178.424020] Process ping (pid: 2497, stack limit = 0x000000009f0a3ff4)
[ 178.430537] Call trace:
[ 178.432976] __netif_receive_skb_core+0x49c/0xac8
[ 178.437670] __netif_receive_skb+0x28/0x78
[ 178.441757] process_backlog+0x9c/0x160
[ 178.445584] net_rx_action+0x2f8/0x3f0
[...]
Reason is that sch_ingress and sch_clsact are doing mini_qdisc_pair_init() which sets up miniq pointers to cpu_{b,q}stats from the underlying qdisc. Problem is that this cannot work since they are actually set up right after the qdisc ->init() callback in qdisc_create(), so first packet going into sch_handle_ingress() tries to call mini_qdisc_bstats_cpu_update() and we therefore panic.
In order to fix this, allocation of {b,q}stats needs to happen before we call into ->init(). In net-next, there's already such option through commit d59f5ffa59d8 ("net: sched: a dflt qdisc may be used with per cpu stats"). However, the bug needs to be fixed in net still for 4.15. Thus, include these bits to reduce any merge churn and reuse the static_flags field to set TCQ_F_CPUSTATS, and remove the allocation from qdisc_create() since there is no other user left. Prashant Bhole ran into the same issue but for net-next, thus adding him below as well as co-author. Same issue was also reported by Sandipan Das when using bcc.
Fixes: 46209401f8f6 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath")
Reference: https://lists.iovisor.org/pipermail/iovisor-dev/2018-January/001190.html
Reported-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Co-authored-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Co-authored-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a38a9882 |
| 20-Dec-2017 |
Alexander Aring <aring@mojatatu.com> |
net: sch: api: add extack support in qdisc_create_dflt
This patch adds extack support for the function qdisc_create_dflt, which is a commonly used function in the tc subsystem. Callers that are interested in the error can pass extack to get more detailed information about why qdisc_create_dflt failed. The function qdisc_create_dflt will also call an init callback, which can fail in any per-qdisc specific handling.
Cc: David Ahern <dsahern@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d0bd684d |
| 20-Dec-2017 |
Alexander Aring <aring@mojatatu.com> |
net: sch: api: add extack support in qdisc_alloc
This patch adds extack support for the function qdisc_alloc, which is a commonly used function in the tc subsystem. Callers that are interested in the error can pass extack to get more detailed information about why qdisc_alloc failed.
Cc: David Ahern <dsahern@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
653d6fd6 |
| 20-Dec-2017 |
Alexander Aring <aring@mojatatu.com> |
net: sched: sch: add extack for graft callback
This patch adds extack support for graft callback to prepare per-qdisc specific changes for extack.
Cc: David Ahern <dsahern@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
cbaacc4e |
| 20-Dec-2017 |
Alexander Aring <aring@mojatatu.com> |
net: sched: sch: add extack for block callback
This patch adds extack support for block callback to prepare per-qdisc specific changes for extack.
Cc: David Ahern <dsahern@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
793d81d6 |
| 20-Dec-2017 |
Alexander Aring <aring@mojatatu.com> |
net: sched: sch: add extack to change class
This patch adds extack support for class change callback api. This prepares to handle extack support inside each specific class implementation.
Cc: David Ahern <dsahern@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2030721c |
| 20-Dec-2017 |
Alexander Aring <aring@mojatatu.com> |
net: sched: sch: add extack for change qdisc ops
This patch adds extack support for the change callback of the qdisc ops structure to prepare per-qdisc specific changes for extack.
Cc: David Ahern <dsahern@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e63d7dfd |
| 20-Dec-2017 |
Alexander Aring <aring@mojatatu.com> |
net: sched: sch: add extack for init callback
This patch adds extack support for init callback to prepare per-qdisc specific changes for extack.
Cc: David Ahern <dsahern@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7a4fa291 |
| 14-Dec-2017 |
Yuval Mintz <yuvalm@mellanox.com> |
net: sched: Add TCA_HW_OFFLOAD
Qdiscs can be offloaded to HW, but current implementation isn't uniform. Instead, qdiscs either pass information about offload status via their TCA_OPTIONS or omit it altogether.
Introduce a new attribute - TCA_HW_OFFLOAD that would form a uniform uAPI for the offloading status of qdiscs.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7e66016f |
| 07-Dec-2017 |
John Fastabend <john.fastabend@gmail.com> |
net: sched: helpers to sum qlen and qlen for per cpu logic
Add qdisc qlen helper routines for lockless qdiscs to use.
The qdisc qlen is no longer used in the hotpath but it is reported via stats query on the qdisc so it still needs to be tracked. This adds the per cpu operations needed along with a helper to return the summation of per cpu stats.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
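The idea of per-CPU queue-length counters with a summing helper can be sketched in user space as below. All names (`struct sketch_qdisc`, `sketch_qlen_inc()`, `sketch_qlen_dec()`, `sketch_qlen_sum()`, `SKETCH_NR_CPUS`) are hypothetical, and a fixed-size array indexed by CPU number stands in for real per-CPU storage; the kernel helpers may differ in naming and types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU count for the sketch. */
#define SKETCH_NR_CPUS 4

/* Hypothetical per-CPU queue statistics; only qlen is modelled. */
struct sketch_qstats {
    int qlen;
};

struct sketch_qdisc {
    struct sketch_qstats cpu_qstats[SKETCH_NR_CPUS];
};

/* Enqueue accounting: each CPU touches only its own counter, so the
 * fast path needs no shared lock. */
static void sketch_qlen_inc(struct sketch_qdisc *q, size_t cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    q->cpu_qstats[cpu].qlen++;
}

/* Dequeue accounting: may run on a different CPU than the enqueue did. */
static void sketch_qlen_dec(struct sketch_qdisc *q, size_t cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    q->cpu_qstats[cpu].qlen--;
}

/* Stats query path: the reported queue length is the sum over all CPUs. */
static int sketch_qlen_sum(const struct sketch_qdisc *q)
{
    int sum = 0;
    size_t cpu;

    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        sum += q->cpu_qstats[cpu].qlen;
    return sum;
}
```

In this sketch a single counter can go negative when a packet is enqueued on one CPU and dequeued on another, so only the sum is meaningful.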
|
#
70e57d5e |
| 07-Dec-2017 |
John Fastabend <john.fastabend@gmail.com> |
net: sched: use skb list for skb_bad_tx
Similar to how gso is handled, use an skb list for skb_bad_tx. This is required with lockless qdiscs because we may have multiple cores attempting to push skbs into skb_bad_tx concurrently.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a53851e2 |
| 07-Dec-2017 |
John Fastabend <john.fastabend@gmail.com> |
net: sched: explicit locking in gso_cpu fallback
This work is preparing the qdisc layer to support egress lockless qdiscs. If we are running the egress qdisc lockless and we overrun the netdev, for whatever reason, the netdev returns a busy error code and the skb is parked on the gso_skb pointer. With many cores all hitting this case at once, it's possible to have multiple sk_buffs here, so we turn gso_skb into a queue.
This should be the edge case and if we see this frequently then the netdev/qdisc layer needs to back off.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|