#
fb3ceec1 |
| 30-Aug-2022 |
Wolfram Sang <wsa+renesas@sang-engineering.com> |
net: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by
net: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script.
Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220830201457.7984-1-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
2e9ca760 |
| 16-Aug-2022 |
Michael S. Tsirkin <mst@redhat.com> |
virtio_net: Revert "virtio_net: set the default max ring size by find_vqs()"
This reverts commit 762faee5a2678559d3dc09d95f8f2c54cd0466a7.
This has been reported to trip up guests on GCP (Google Cl
virtio_net: Revert "virtio_net: set the default max ring size by find_vqs()"
This reverts commit 762faee5a2678559d3dc09d95f8f2c54cd0466a7.
This has been reported to trip up guests on GCP (Google Cloud). The reason is that virtio_find_vqs_ctx_size is broken on legacy devices. We can in theory fix virtio_find_vqs_ctx_size but in fact the patch itself has several other issues:
- It treats unknown speed as < 10G - It leaves userspace no way to find out the ring size set by hypervisor - It tests speed when link is down - It ignores the virtio spec advice: Both \field{speed} and \field{duplex} can change, thus the driver is expected to re-read these values after receiving a configuration change notification. - It is not clear the performance impact has been tested properly
Revert the patch for now.
Reported-by: Andres Freund <andres@anarazel.de> Link: https://lore.kernel.org/r/20220814212610.GA3690074%40roeck-us.net Link: https://lore.kernel.org/r/20220815070203.plwjx7b3cyugpdt7%40awork3.anarazel.de Link: https://lore.kernel.org/r/3df6bb82-1951-455d-a768-e9e1513eb667%40www.fastmail.com Link: https://lore.kernel.org/r/FCDC5DDE-3CDD-4B8A-916F-CA7D87B547CE%40anarazel.de Fixes: 762faee5a267 ("virtio_net: set the default max ring size by find_vqs()") Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Andres Freund <andres@anarazel.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Message-Id: <20220816053602.173815-2-mst@redhat.com>
show more ...
|
#
95bb6330 |
| 11-Aug-2022 |
Michael S. Tsirkin <mst@redhat.com> |
virtio_net: fix endian-ness for RSS
Using native endian-ness for device supplied fields is wrong on BE platforms. Sparse warns about this.
Fixes: 91f41f01d219 ("drivers/net/virtio_net: Added RSS ha
virtio_net: fix endian-ness for RSS
Using native endian-ness for device supplied fields is wrong on BE platforms. Sparse warns about this.
Fixes: 91f41f01d219 ("drivers/net/virtio_net: Added RSS hash report.") Cc: "Andrew Melnychenko" <andrew@daynix.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
699b045a |
| 18-Jul-2022 |
Alvaro Karsz <alvaro.karsz@solid-run.com> |
net: virtio_net: notifications coalescing support
New VirtIO network feature: VIRTIO_NET_F_NOTF_COAL.
Control a Virtio network device notifications coalescing parameters using the control virtqueue
net: virtio_net: notifications coalescing support
New VirtIO network feature: VIRTIO_NET_F_NOTF_COAL.
Control a Virtio network device notifications coalescing parameters using the control virtqueue.
A device that supports this fetature can receive VIRTIO_NET_CTRL_NOTF_COAL control commands.
- VIRTIO_NET_CTRL_NOTF_COAL_TX_SET: Ask the network device to change the following parameters: - tx_usecs: Maximum number of usecs to delay a TX notification. - tx_max_packets: Maximum number of packets to send before a TX notification.
- VIRTIO_NET_CTRL_NOTF_COAL_RX_SET: Ask the network device to change the following parameters: - rx_usecs: Maximum number of usecs to delay a RX notification. - rx_max_packets: Maximum number of packets to receive before a RX notification.
VirtIO spec. patch: https://lists.oasis-open.org/archives/virtio-comment/202206/msg00100.html
Signed-off-by: Alvaro Karsz <alvaro.karsz@solid-run.com> Message-Id: <20220718091102.498774-1-alvaro.karsz@solid-run.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jason Wang <jasowang@redhat.com>
show more ...
|
#
a335b33f |
| 01-Aug-2022 |
Xuan Zhuo <xuanzhuo@linux.alibaba.com> |
virtio_net: support set_ringparam
Support set_ringparam based on virtio queue reset.
Users can use ethtool -G eth0 <ring_num> to modify the ring size of virtio-net.
Signed-off-by: Xuan Zhuo <xuanz
virtio_net: support set_ringparam
Support set_ringparam based on virtio queue reset.
Users can use ethtool -G eth0 <ring_num> to modify the ring size of virtio-net.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-43-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
ebcce492 |
| 01-Aug-2022 |
Xuan Zhuo <xuanzhuo@linux.alibaba.com> |
virtio_net: support tx queue resize
This patch implements the resize function of the tx queues. Based on this function, it is possible to modify the ring num of the queue.
Inludes fixup:
virtio_ne
virtio_net: support tx queue resize
This patch implements the resize function of the tx queues. Based on this function, it is possible to modify the ring num of the queue.
Inludes fixup:
virtio_net: fix for stuck when change tx ring size with dev down
When dev is set to DOWN state, napi has been disabled, if we modify the ring size at this time, we should not call napi_disable() again, which will cause stuck.
And all operations are under the protection of rtnl_lock, so there is no need to consider concurrency issues.
Message-Id: <20220801063902.129329-42-xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220811080258.79398-3-xuanzhuo@linux.alibaba.com> Reported-by: Kangjie Xu <kangjie.xu@linux.alibaba.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
6a4763e2 |
| 01-Aug-2022 |
Xuan Zhuo <xuanzhuo@linux.alibaba.com> |
virtio_net: support rx queue resize
This patch implements the resize function of the rx queues. Based on this function, it is possible to modify the ring num of the queue.
Includes fixup:
virtio_n
virtio_net: support rx queue resize
This patch implements the resize function of the rx queues. Based on this function, it is possible to modify the ring num of the queue.
Includes fixup:
virtio_net: fix for stuck when change rx ring size with dev down
When dev is set to DOWN state, napi has been disabled, if we modify the ring size at this time, we should not call napi_disable() again, which will cause stuck.
And all operations are under the protection of rtnl_lock, so there is no need to consider concurrency issues.
Message-Id: <20220801063902.129329-41-xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220811080258.79398-2-xuanzhuo@linux.alibaba.com> Reported-by: Kangjie Xu <kangjie.xu@linux.alibaba.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
6e345f8c |
| 01-Aug-2022 |
Xuan Zhuo <xuanzhuo@linux.alibaba.com> |
virtio_net: split free_unused_bufs()
This patch separates two functions for freeing sq buf and rq buf from free_unused_bufs().
When supporting the enable/disable tx/rq queue in the future, it is ne
virtio_net: split free_unused_bufs()
This patch separates two functions for freeing sq buf and rq buf from free_unused_bufs().
When supporting the enable/disable tx/rq queue in the future, it is necessary to support separate recovery of a sq buf or a rq buf.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-40-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
8597b5dd |
| 01-Aug-2022 |
Xuan Zhuo <xuanzhuo@linux.alibaba.com> |
virtio_net: get ringparam by virtqueue_get_vring_max_size()
Use virtqueue_get_vring_max_size() in virtnet_get_ringparam() to set tx,rx_max_pending.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.
virtio_net: get ringparam by virtqueue_get_vring_max_size()
Use virtqueue_get_vring_max_size() in virtnet_get_ringparam() to set tx,rx_max_pending.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-39-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
762faee5 |
| 01-Aug-2022 |
Xuan Zhuo <xuanzhuo@linux.alibaba.com> |
virtio_net: set the default max ring size by find_vqs()
Use virtio_find_vqs_ctx_size() to specify the maximum ring size of tx, rx at the same time.
| rx/tx ring size ------
virtio_net: set the default max ring size by find_vqs()
Use virtio_find_vqs_ctx_size() to specify the maximum ring size of tx, rx at the same time.
| rx/tx ring size ------------------------------------------- speed == UNKNOWN or < 10G| 1024 speed < 40G | 4096 speed >= 40G | 8192
Call virtnet_update_settings() once before calling init_vqs() to update speed.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-38-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
7a542bee |
| 04-Aug-2022 |
Xuan Zhuo <xuanzhuo@linux.alibaba.com> |
virtio_net: fix memory leak inside XPD_TX with mergeable
When we call xdp_convert_buff_to_frame() to get xdpf, if it returns NULL, we should check if xdp_page was allocated by xdp_linearize_page().
virtio_net: fix memory leak inside XPD_TX with mergeable
When we call xdp_convert_buff_to_frame() to get xdpf, if it returns NULL, we should check if xdp_page was allocated by xdp_linearize_page(). If it is newly allocated, it should be freed here alone. Just like any other "goto err_xdp".
Fixes: 44fa2dbd4759 ("xdp: transition into using xdp_frame for ndo_xdp_xmit") Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
5a159128 |
| 25-Jul-2022 |
Jason Wang <jasowang@redhat.com> |
virtio-net: fix the race between refill work and close
We try using cancel_delayed_work_sync() to prevent the work from enabling NAPI. This is insufficient since we don't disable the source of the r
virtio-net: fix the race between refill work and close
We try using cancel_delayed_work_sync() to prevent the work from enabling NAPI. This is insufficient since we don't disable the source of the refill work scheduling. This means an NAPI poll callback after cancel_delayed_work_sync() can schedule the refill work then can re-enable the NAPI that leads to use-after-free [1].
Since the work can enable NAPI, we can't simply disable NAPI before calling cancel_delayed_work_sync(). So fix this by introducing a dedicated boolean to control whether or not the work could be scheduled from NAPI.
[1] ================================================================== BUG: KASAN: use-after-free in refill_work+0x43/0xd4 Read of size 2 at addr ffff88810562c92e by task kworker/2:1/42
CPU: 2 PID: 42 Comm: kworker/2:1 Not tainted 5.19.0-rc1+ #480 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Workqueue: events refill_work Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0xbb/0x6ac ? _printk+0xad/0xde ? refill_work+0x43/0xd4 kasan_report+0xa8/0x130 ? refill_work+0x43/0xd4 refill_work+0x43/0xd4 process_one_work+0x43d/0x780 worker_thread+0x2a0/0x6f0 ? process_one_work+0x780/0x780 kthread+0x167/0x1a0 ? kthread_exit+0x50/0x50 ret_from_fork+0x22/0x30 </TASK> ...
Fixes: b2baed69e605c ("virtio_net: set/cancel work on ndo_open/ndo_stop") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
50c0ada6 |
| 17-Jun-2022 |
Jason Wang <jasowang@redhat.com> |
virtio-net: fix race between ndo_open() and virtio_device_ready()
We currently call virtio_device_ready() after netdev registration. Since ndo_open() can be called immediately after register_netdev,
virtio-net: fix race between ndo_open() and virtio_device_ready()
We currently call virtio_device_ready() after netdev registration. Since ndo_open() can be called immediately after register_netdev, this means there exists a race between ndo_open() and virtio_device_ready(): the driver may start to use the device before DRIVER_OK which violates the spec.
Fix this by switching to use register_netdevice() and protect the virtio_device_ready() with rtnl_lock() to make sure ndo_open() can only be called after virtio_device_ready().
Fixes: 4baf1e33d0842 ("virtio_net: enable VQs early") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220617072949.30734-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
8af52fe9 |
| 21-Jun-2022 |
Stephan Gerhold <stephan.gerhold@kernkonzept.com> |
virtio_net: fix xdp_rxq_info bug after suspend/resume
The following sequence currently causes a driver bug warning when using virtio_net:
# ip link set eth0 up # echo mem > /sys/power/state (or
virtio_net: fix xdp_rxq_info bug after suspend/resume
The following sequence currently causes a driver bug warning when using virtio_net:
# ip link set eth0 up # echo mem > /sys/power/state (or e.g. # rtcwake -s 10 -m mem) <resume> # ip link set eth0 down
Missing register, driver bug WARNING: CPU: 0 PID: 375 at net/core/xdp.c:138 xdp_rxq_info_unreg+0x58/0x60 Call trace: xdp_rxq_info_unreg+0x58/0x60 virtnet_close+0x58/0xac __dev_close_many+0xac/0x140 __dev_change_flags+0xd8/0x210 dev_change_flags+0x24/0x64 do_setlink+0x230/0xdd0 ...
This happens because virtnet_freeze() frees the receive_queue completely (including struct xdp_rxq_info) but does not call xdp_rxq_info_unreg(). Similarly, virtnet_restore() sets up the receive_queue again but does not call xdp_rxq_info_reg().
Actually, parts of virtnet_freeze_down() and virtnet_restore_up() are almost identical to virtnet_close() and virtnet_open(): only the calls to xdp_rxq_info_(un)reg() are missing. This means that we can fix this easily and avoid such problems in the future by just calling virtnet_close()/open() from the freeze/restore handlers.
Aside from adding the missing xdp_rxq_info calls the only difference is that the refill work is only cancelled if netif_running(). However, this should not make any functional difference since the refill work should only be active if the network interface is actually up.
Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info") Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20220621114845.3650258-1-stephan.gerhold@kernkonzept.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
d484735d |
| 06-May-2022 |
Jakub Kicinski <kuba@kernel.org> |
net: virtio: switch to netif_napi_add_weight()
virtio netdev driver uses a custom napi weight, switch to the new API for setting custom weight.
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Revie
net: virtio: switch to netif_napi_add_weight()
virtio netdev driver uses a custom napi weight, switch to the new API for setting custom weight.
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
8d602e1a |
| 04-May-2022 |
Jakub Kicinski <kuba@kernel.org> |
net: move snowflake callers to netif_napi_add_tx_weight()
Make the drivers with custom tx napi weight call netif_napi_add_tx_weight().
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: http
net: move snowflake callers to netif_napi_add_tx_weight()
Make the drivers with custom tx napi weight call netif_napi_add_tx_weight().
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20220504163725.550782-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
acb16b39 |
| 25-Apr-2022 |
Nikolay Aleksandrov <razor@blackwall.org> |
virtio_net: fix wrong buf address calculation when using xdp
We received a report[1] of kernel crashes when Cilium is used in XDP mode with virtio_net after updating to newer kernels. After investig
virtio_net: fix wrong buf address calculation when using xdp
We received a report[1] of kernel crashes when Cilium is used in XDP mode with virtio_net after updating to newer kernels. After investigating the reason it turned out that when using mergeable bufs with an XDP program which adjusts xdp.data or xdp.data_meta page_to_buf() calculates the build_skb address wrong because the offset can become less than the headroom so it gets the address of the previous page (-X bytes depending on how lower offset is): page_to_skb: page addr ffff9eb2923e2000 buf ffff9eb2923e1ffc offset 252 headroom 256
This is a pr_err() I added in the beginning of page_to_skb which clearly shows offset that is less than headroom by adding 4 bytes of metadata via an xdp prog. The calculations done are: receive_mergeable(): headroom = VIRTIO_XDP_HEADROOM; // VIRTIO_XDP_HEADROOM == 256 bytes offset = xdp.data - page_address(xdp_page) - vi->hdr_len - metasize;
page_to_skb(): p = page_address(page) + offset; ... buf = p - headroom;
Now buf goes -4 bytes from the page's starting address as can be seen above which is set as skb->head and skb->data by build_skb later. Depending on what's done with the skb (when it's freed most often) we get all kinds of corruptions and BUG_ON() triggers in mm[2]. We have to recalculate the new headroom after the xdp program has run, similar to how offset and len are recalculated. Headroom is directly related to data_hard_start, data and data_meta, so we use them to get the new size. The result is correct (similar pr_err() in page_to_skb, one case of xdp_page and one case of virtnet buf): a) Case with 4 bytes of metadata [ 115.949641] page_to_skb: page addr ffff8b4dcfad2000 offset 252 headroom 252 [ 121.084105] page_to_skb: page addr ffff8b4dcf018000 offset 20732 headroom 252 b) Case of pushing data +32 bytes [ 153.181401] page_to_skb: page addr ffff8b4dd0c4d000 offset 288 headroom 288 [ 158.480421] page_to_skb: page addr ffff8b4dd00b0000 offset 24864 headroom 288 c) Case of pushing data -33 bytes [ 835.906830] page_to_skb: page addr ffff8b4dd3270000 offset 223 headroom 223 [ 840.839910] page_to_skb: page addr ffff8b4dcdd68000 offset 12511 headroom 223
Offset and headroom are equal because offset points to the start of reserved bytes for the virtio_net header which are at buf start + headroom, while data points at buf start + vnet hdr size + headroom so when data or data_meta are adjusted by the xdp prog both the headroom size and the offset change equally. We can use data_hard_start to compute the new headroom after the xdp prog (linearized / page start case, the virtnet buf case is similar just with bigger base offset): xdp.data_hard_start = page_address + vnet_hdr xdp.data = page_address + vnet_hdr + headroom new headroom after xdp prog = xdp.data - xdp.data_hard_start - metasize
An example reproducer xdp prog[3] is below.
[1] https://github.com/cilium/cilium/issues/19453
[2] Two of the many traces: [ 40.437400] BUG: Bad page state in process swapper/0 pfn:14940 [ 40.916726] BUG: Bad page state in process systemd-resolve pfn:053b7 [ 41.300891] kernel BUG at include/linux/mm.h:720! [ 41.301801] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 41.302784] CPU: 1 PID: 1181 Comm: kubelet Kdump: loaded Tainted: G B W 5.18.0-rc1+ #37 [ 41.304458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014 [ 41.306018] RIP: 0010:page_frag_free+0x79/0xe0 [ 41.306836] Code: 00 00 75 ea 48 8b 07 a9 00 00 01 00 74 e0 48 8b 47 48 48 8d 50 ff a8 01 48 0f 45 fa eb d0 48 c7 c6 18 b8 30 a6 e8 d7 f8 fc ff <0f> 0b 48 8d 78 ff eb bc 48 8b 07 a9 00 00 01 00 74 3a 66 90 0f b6 [ 41.310235] RSP: 0018:ffffac05c2a6bc78 EFLAGS: 00010292 [ 41.311201] RAX: 000000000000003e RBX: 0000000000000000 RCX: 0000000000000000 [ 41.312502] RDX: 0000000000000001 RSI: ffffffffa6423004 RDI: 00000000ffffffff [ 41.313794] RBP: ffff993c98823600 R08: 0000000000000000 R09: 00000000ffffdfff [ 41.315089] R10: ffffac05c2a6ba68 R11: ffffffffa698ca28 R12: ffff993c98823600 [ 41.316398] R13: ffff993c86311ebc R14: 0000000000000000 R15: 000000000000005c [ 41.317700] FS: 00007fe13fc56740(0000) GS:ffff993cdd900000(0000) knlGS:0000000000000000 [ 41.319150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 41.320152] CR2: 000000c00008a000 CR3: 0000000014908000 CR4: 0000000000350ee0 [ 41.321387] Call Trace: [ 41.321819] <TASK> [ 41.322193] skb_release_data+0x13f/0x1c0 [ 41.322902] __kfree_skb+0x20/0x30 [ 41.343870] tcp_recvmsg_locked+0x671/0x880 [ 41.363764] tcp_recvmsg+0x5e/0x1c0 [ 41.384102] inet_recvmsg+0x42/0x100 [ 41.406783] ? sock_recvmsg+0x1d/0x70 [ 41.428201] sock_read_iter+0x84/0xd0 [ 41.445592] ? 0xffffffffa3000000 [ 41.462442] new_sync_read+0x148/0x160 [ 41.479314] ? 0xffffffffa3000000 [ 41.496937] vfs_read+0x138/0x190 [ 41.517198] ksys_read+0x87/0xc0 [ 41.535336] do_syscall_64+0x3b/0x90 [ 41.551637] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 41.568050] RIP: 0033:0x48765b [ 41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30 [ 41.632818] RSP: 002b:000000c000a2f5b8 EFLAGS: 00000212 ORIG_RAX: 0000000000000000 [ 41.664588] RAX: ffffffffffffffda RBX: 000000c000062000 RCX: 000000000048765b [ 41.681205] RDX: 0000000000005e54 RSI: 000000c000e66000 RDI: 0000000000000016 [ 41.697164] RBP: 000000c000a2f608 R08: 0000000000000001 R09: 00000000000001b4 [ 41.713034] R10: 00000000000000b6 R11: 0000000000000212 R12: 00000000000000e9 [ 41.728755] R13: 0000000000000001 R14: 000000c000a92000 R15: ffffffffffffffff [ 41.744254] </TASK> [ 41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net
and
[ 33.524802] BUG: Bad page state in process systemd-network pfn:11e60 [ 33.528617] page ffffe05dc0147b00 ffffe05dc04e7a00 ffff8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data ffff8ae9851ec10c data_meta ffff8ae9851ec108 data_end ffff8ae9851ec14e [ 33.529764] page:000000003792b5ba refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x11e60 [ 33.532463] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff) [ 33.532468] raw: 000fffffc0000000 0000000000000000 dead000000000122 0000000000000000 [ 33.532470] raw: 0000000000000000 0000000000000000 00000000fffffdff 0000000000000000 [ 33.532471] page dumped because: nonzero mapcount [ 33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net [ 33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37 [ 33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014 [ 33.532484] Call Trace: [ 33.532496] <TASK> [ 33.532500] dump_stack_lvl+0x45/0x5a [ 33.532506] bad_page.cold+0x63/0x94 [ 33.532510] free_pcp_prepare+0x290/0x420 [ 33.532515] free_unref_page+0x1b/0x100 [ 33.532518] skb_release_data+0x13f/0x1c0 [ 33.532524] kfree_skb_reason+0x3e/0xc0 [ 33.532527] ip6_mc_input+0x23c/0x2b0 [ 33.532531] ip6_sublist_rcv_finish+0x83/0x90 [ 33.532534] ip6_sublist_rcv+0x22b/0x2b0
[3] XDP program to reproduce(xdp_pass.c): #include <linux/bpf.h> #include <bpf/bpf_helpers.h>
SEC("xdp_pass") int xdp_pkt_pass(struct xdp_md *ctx) { bpf_xdp_adjust_head(ctx, -(int)32); return XDP_PASS; }
char _license[] SEC("license") = "GPL";
compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass
CC: stable@vger.kernel.org CC: Jason Wang <jasowang@redhat.com> CC: Xuan Zhuo <xuanzhuo@linux.alibaba.com> CC: Daniel Borkmann <daniel@iogearbox.net> CC: "Michael S. Tsirkin" <mst@redhat.com> CC: virtualization@lists.linux-foundation.org Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr") Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20220425103703.3067292-1-razor@blackwall.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
show more ...
|
#
c1170820 |
| 28-Mar-2022 |
Andrew Melnychenko <andrew@daynix.com> |
drivers/net/virtio_net: Added RSS hash report control.
Now it's possible to control supported hashflows. Added hashflow set/get callbacks. Also, disabling RXH_IP_SRC/DST for TCP would disable then f
drivers/net/virtio_net: Added RSS hash report control.
Now it's possible to control supported hashflows. Added hashflow set/get callbacks. Also, disabling RXH_IP_SRC/DST for TCP would disable then for UDP. TCP and UDP supports only: ethtool -U eth0 rx-flow-hash tcp4 sd RXH_IP_SRC + RXH_IP_DST ethtool -U eth0 rx-flow-hash tcp4 sdfn RXH_IP_SRC + RXH_IP_DST + RXH_L4_B_0_1 + RXH_L4_B_2_3 Disabling happens because VirtioNET hashtype for IP doesn't check L4 proto, it works for all IP packets(TCP, UDP, ICMP, etc.). For TCP and UDP, it's possible to set IP+PORT hashes. But disabling IP hashes will disable them for TCP and UDP simultaneously. It's possible to set IP+PORT for TCP/UDP and disable/enable IP for everything else(UDP, ICMP, etc.).
Signed-off-by: Andrew Melnychenko <andrew@daynix.com> Link: https://lore.kernel.org/r/20220328175336.10802-5-andrew@daynix.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
91f41f01 |
| 28-Mar-2022 |
Andrew Melnychenko <andrew@daynix.com> |
drivers/net/virtio_net: Added RSS hash report.
Added features for RSS hash report. If hash is provided - it sets to skb. Added checks if rss and/or hash are enabled together.
Signed-off-by: Andrew
drivers/net/virtio_net: Added RSS hash report.
Added features for RSS hash report. If hash is provided - it sets to skb. Added checks if rss and/or hash are enabled together.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com> Link: https://lore.kernel.org/r/20220328175336.10802-4-andrew@daynix.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
c7114b12 |
| 28-Mar-2022 |
Andrew Melnychenko <andrew@daynix.com> |
drivers/net/virtio_net: Added basic RSS support.
Added features for RSS. Added initialization, RXHASH feature and ethtool ops. By default RSS/RXHASH is disabled. Virtio RSS "IPv6 extensions" hashes
drivers/net/virtio_net: Added basic RSS support.
Added features for RSS. Added initialization, RXHASH feature and ethtool ops. By default RSS/RXHASH is disabled. Virtio RSS "IPv6 extensions" hashes disabled. Added ethtools ops to set key and indirection table.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com> Link: https://lore.kernel.org/r/20220328175336.10802-3-andrew@daynix.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
c1ddc42d |
| 28-Mar-2022 |
Andrew Melnychenko <andrew@daynix.com> |
drivers/net/virtio_net: Fixed padded vheader to use v1 with hash.
The header v1 provides additional info about RSS. Added changes to computing proper header length. In the next patches, the header m
drivers/net/virtio_net: Fixed padded vheader to use v1 with hash.
The header v1 provides additional info about RSS. Added changes to computing proper header length. In the next patches, the header may contain RSS hash info for the hash population.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com> Link: https://lore.kernel.org/r/20220328175336.10802-2-andrew@daynix.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
4f50ef15 |
| 13-Feb-2022 |
Michael Catanzaro <mcatanzaro.kernel@gmail.com> |
virtio_net: Fix code indent error
This patch fixes the checkpatch.pl warning:
ERROR: code indent should use tabs where possible #3453: FILE: drivers/net/virtio_net.c:3453: ret = register_virtio_dri
virtio_net: Fix code indent error
This patch fixes the checkpatch.pl warning:
ERROR: code indent should use tabs where possible #3453: FILE: drivers/net/virtio_net.c:3453: ret = register_virtio_driver(&virtio_net_driver);$
Uneccessary newline was also removed making line 3453 now 3452.
Signed-off-by: Michael Catanzaro <mcatanzaro.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
9b51d9d8 |
| 14-Aug-2021 |
Yury Norov <yury.norov@gmail.com> |
cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
cpumask_first() is a more effective analogue of 'next' version if n == -1 (which means start == 0). This patch replaces 'next'
cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
cpumask_first() is a more effective analogue of 'next' version if n == -1 (which means start == 0). This patch replaces 'next' with 'first' where things look trivial.
There's no cpumask_first_zero() function, so create it.
Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
show more ...
|
#
d9679d00 |
| 13-Oct-2021 |
Michael S. Tsirkin <mst@redhat.com> |
virtio: wrap config->reset calls
This will enable cleanups down the road. The idea is to disable cbs, then add "flush_queued_cbs" callback as a parameter, this way drivers can flush any work queued
virtio: wrap config->reset calls
This will enable cleanups down the road. The idea is to disable cbs, then add "flush_queued_cbs" callback as a parameter, this way drivers can flush any work queued after callbacks have been disabled.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20211013105226.20225-1-mst@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
053c9e18 |
| 15-Dec-2021 |
Wenliang Wang <wangwenliang.1995@bytedance.com> |
virtio_net: fix rx_drops stat for small pkts
We found the stat of rx drops for small pkts does not increment when build_skb fail, it's not coherent with other mode's rx drops stat.
Signed-off-by: W
virtio_net: fix rx_drops stat for small pkts
We found the stat of rx drops for small pkts does not increment when build_skb fail, it's not coherent with other mode's rx drops stat.
Signed-off-by: Wenliang Wang <wangwenliang.1995@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|