History log of /openbmc/linux/drivers/net/virtio_net.c (Results 1 – 25 of 584)
Revision Date Author Comments
# f5e9a22d 12-Jul-2024 Breno Leitao <leitao@debian.org>

virtio_net: Fix napi_skb_cache_put warning

commit f8321fa75102246d7415a6af441872f6637c93ab upstream.

After the commit bdacf3e34945 ("net: Use nested-BH locking for
napi_alloc_cache.") was merged, t

virtio_net: Fix napi_skb_cache_put warning

commit f8321fa75102246d7415a6af441872f6637c93ab upstream.

After the commit bdacf3e34945 ("net: Use nested-BH locking for
napi_alloc_cache.") was merged, the following warning began to appear:

WARNING: CPU: 5 PID: 1 at net/core/skbuff.c:1451 napi_skb_cache_put+0x82/0x4b0

__warn+0x12f/0x340
napi_skb_cache_put+0x82/0x4b0
napi_skb_cache_put+0x82/0x4b0
report_bug+0x165/0x370
handle_bug+0x3d/0x80
exc_invalid_op+0x1a/0x50
asm_exc_invalid_op+0x1a/0x20
__free_old_xmit+0x1c8/0x510
napi_skb_cache_put+0x82/0x4b0
__free_old_xmit+0x1c8/0x510
__free_old_xmit+0x1c8/0x510
__pfx___free_old_xmit+0x10/0x10

The issue arises because virtio is assuming it's running in NAPI context
even when it's not, such as in the netpoll case.

To resolve this, modify virtnet_poll_tx() to only set NAPI when budget
is available. Same for virtnet_poll_cleantx(), which always assumed that
it was in a NAPI context.

Fixes: df133f3f9625 ("virtio_net: bulk free tx skbs")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Link: https://patch.msgid.link/20240712115325.54175-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[Shivani: Modified to apply on v6.6.y]
Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 3ef2a16a 17-Jun-2024 Heng Qi <hengqi@linux.alibaba.com>

virtio_net: fixing XDP for fully checksummed packets handling

[ Upstream commit 703eec1b242276f2d97d98f04790ddad319ddde4 ]

The XDP program can't correctly handle partially checksummed
packets, but

virtio_net: fixing XDP for fully checksummed packets handling

[ Upstream commit 703eec1b242276f2d97d98f04790ddad319ddde4 ]

The XDP program can't correctly handle partially checksummed
packets, but works fine with fully checksummed packets. If the
device has already validated fully checksummed packets, then
the driver doesn't need to re-validate them, saving CPU resources.

Additionally, the driver does not drop all partially checksummed
packets when VIRTIO_NET_F_GUEST_CSUM is not negotiated. This is
not a bug, as the driver has always done this.

Fixes: 436c9453a1ac ("virtio-net: keep vnet header zeroed after processing XDP")
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# ef609fd7 17-Jun-2024 Heng Qi <hengqi@linux.alibaba.com>

virtio_net: checksum offloading handling fix

[ Upstream commit 604141c036e1b636e2a71cf6e1aa09d1e45f40c2 ]

In virtio spec 0.95, VIRTIO_NET_F_GUEST_CSUM was designed to handle
partially checksummed p

virtio_net: checksum offloading handling fix

[ Upstream commit 604141c036e1b636e2a71cf6e1aa09d1e45f40c2 ]

In virtio spec 0.95, VIRTIO_NET_F_GUEST_CSUM was designed to handle
partially checksummed packets, and the validation of fully checksummed
packets by the device is independent of VIRTIO_NET_F_GUEST_CSUM
negotiation. However, the specification erroneously stated:

"If VIRTIO_NET_F_GUEST_CSUM is not negotiated, the device MUST set flags
to zero and SHOULD supply a fully checksummed packet to the driver."

This statement is inaccurate because even without VIRTIO_NET_F_GUEST_CSUM
negotiation, the device can still set the VIRTIO_NET_HDR_F_DATA_VALID flag.
Essentially, the device can facilitate the validation of these packets'
checksums - a process known as RX checksum offloading - removing the need
for the driver to do so.

This scenario is currently not implemented in the driver and requires
correction. The necessary specification correction[1] has been made and
approved in the virtio TC vote.
[1] https://lists.oasis-open.org/archives/virtio-comment/202401/msg00011.html

Fixes: 4f49129be6fa ("virtio-net: Set RXCSUM feature if GUEST_CSUM is available")
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 43a71c1b 03-Apr-2024 Breno Leitao <leitao@debian.org>

virtio_net: Do not send RSS key if it is not supported

commit 059a49aa2e25c58f90b50151f109dd3c4cdb3a47 upstream.

There is a bug when setting the RSS options in virtio_net that can break
the whole m

virtio_net: Do not send RSS key if it is not supported

commit 059a49aa2e25c58f90b50151f109dd3c4cdb3a47 upstream.

There is a bug when setting the RSS options in virtio_net that can break
the whole machine, getting the kernel into an infinite loop.

Running the following command in any QEMU virtual machine with virtionet
will reproduce this problem:

# ethtool -X eth0 hfunc toeplitz

This is how the problem happens:

1) ethtool_set_rxfh() calls virtnet_set_rxfh()

2) virtnet_set_rxfh() calls virtnet_commit_rss_command()

3) virtnet_commit_rss_command() populates 4 entries for the rss
scatter-gather

4) Since the command above does not have a key, then the last
scatter-gatter entry will be zeroed, since rss_key_size == 0.
sg_buf_size = vi->rss_key_size;

5) This buffer is passed to qemu, but qemu is not happy with a buffer
with zero length, and do the following in virtqueue_map_desc() (QEMU
function):

if (!sz) {
virtio_error(vdev, "virtio: zero sized buffers are not allowed");

6) virtio_error() (also QEMU function) set the device as broken

vdev->broken = true;

7) Qemu bails out, and do not repond this crazy kernel.

8) The kernel is waiting for the response to come back (function
virtnet_send_command())

9) The kernel is waiting doing the following :

while (!virtqueue_get_buf(vi->cvq, &tmp) &&
!virtqueue_is_broken(vi->cvq))
cpu_relax();

10) None of the following functions above is true, thus, the kernel
loops here forever. Keeping in mind that virtqueue_is_broken() does
not look at the qemu `vdev->broken`, so, it never realizes that the
vitio is broken at QEMU side.

Fix it by not sending RSS commands if the feature is not available in
the device.

Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Cc: stable@vger.kernel.org
Cc: qemu-devel@nongnu.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Vlad Poenaru <vlad.wing@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 20e81d2c 03-Jan-2024 Zhu Yanjun <yanjun.zhu@linux.dev>

virtio_net: Fix "‘%d’ directive writing between 1 and 11 bytes into a region of size 10" warnings

[ Upstream commit e3fe8d28c67bf6c291e920c6d04fa22afa14e6e4 ]

Fix the warnings when building virtio_

virtio_net: Fix "‘%d’ directive writing between 1 and 11 bytes into a region of size 10" warnings

[ Upstream commit e3fe8d28c67bf6c291e920c6d04fa22afa14e6e4 ]

Fix the warnings when building virtio_net driver.

"
drivers/net/virtio_net.c: In function ‘init_vqs’:
drivers/net/virtio_net.c:4551:48: warning: ‘%d’ directive writing between 1 and 11 bytes into a region of size 10 [-Wformat-overflow=]
4551 | sprintf(vi->rq[i].name, "input.%d", i);
| ^~
In function ‘virtnet_find_vqs’,
inlined from ‘init_vqs’ at drivers/net/virtio_net.c:4645:8:
drivers/net/virtio_net.c:4551:41: note: directive argument in the range [-2147483643, 65534]
4551 | sprintf(vi->rq[i].name, "input.%d", i);
| ^~~~~~~~~~
drivers/net/virtio_net.c:4551:17: note: ‘sprintf’ output between 8 and 18 bytes into a destination of size 16
4551 | sprintf(vi->rq[i].name, "input.%d", i);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/virtio_net.c: In function ‘init_vqs’:
drivers/net/virtio_net.c:4552:49: warning: ‘%d’ directive writing between 1 and 11 bytes into a region of size 9 [-Wformat-overflow=]
4552 | sprintf(vi->sq[i].name, "output.%d", i);
| ^~
In function ‘virtnet_find_vqs’,
inlined from ‘init_vqs’ at drivers/net/virtio_net.c:4645:8:
drivers/net/virtio_net.c:4552:41: note: directive argument in the range [-2147483643, 65534]
4552 | sprintf(vi->sq[i].name, "output.%d", i);
| ^~~~~~~~~~~
drivers/net/virtio_net.c:4552:17: note: ‘sprintf’ output between 9 and 19 bytes into a destination of size 16
4552 | sprintf(vi->sq[i].name, "output.%d", i);

"

Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20240104020902.2753599-1-yanjun.zhu@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 3ffd05c2 26-Dec-2023 Xuan Zhuo <xuanzhuo@linux.alibaba.com>

virtio_net: fix missing dma unmap for resize

[ Upstream commit 2311e06b9bf3d44e15f9175af177a782806f688f ]

For rq, we have three cases getting buffers from virtio core:

1. virtqueue_get_buf{,_ctx}

virtio_net: fix missing dma unmap for resize

[ Upstream commit 2311e06b9bf3d44e15f9175af177a782806f688f ]

For rq, we have three cases getting buffers from virtio core:

1. virtqueue_get_buf{,_ctx}
2. virtqueue_detach_unused_buf
3. callback for virtqueue_resize

But in commit 295525e29a5b("virtio_net: merge dma operations when
filling mergeable buffers"), I missed the dma unmap for the #3 case.

That will leak some memory, because I did not release the pages referred
by the unused buffers.

If we do such script, we will make the system OOM.

while true
do
ethtool -G ens4 rx 128
ethtool -G ens4 rx 256
free -m
done

Fixes: 295525e29a5b ("virtio_net: merge dma operations when filling mergeable buffers")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20231226094333.47740-1-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# e2e5c2a3 21-Sep-2023 Eric Dumazet <edumazet@google.com>

virtio_net: avoid data-races on dev->stats fields

[ Upstream commit d12a26b74fb77434b73fe39022266c4b00907219 ]

Use DEV_STATS_INC() and DEV_STATS_READ() which provide
atomicity on paths that can be

virtio_net: avoid data-races on dev->stats fields

[ Upstream commit d12a26b74fb77434b73fe39022266c4b00907219 ]

Use DEV_STATS_INC() and DEV_STATS_READ() which provide
atomicity on paths that can be used concurrently.

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 2311e06b9bf3 ("virtio_net: fix missing dma unmap for resize")
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 27debe3e 26-Oct-2023 Eric Dumazet <edumazet@google.com>

virtio_net: use u64_stats_t infra to avoid data-races

[ Upstream commit 61217d8f6360437329af1b16b8bbd9143167718d ]

syzbot reported a data-race in virtnet_poll / virtnet_stats [1]

u64_stats_t infra

virtio_net: use u64_stats_t infra to avoid data-races

[ Upstream commit 61217d8f6360437329af1b16b8bbd9143167718d ]

syzbot reported a data-race in virtnet_poll / virtnet_stats [1]

u64_stats_t infra has very nice accessors that must be used
to avoid potential load-store tearing.

[1]
BUG: KCSAN: data-race in virtnet_poll / virtnet_stats

read-write to 0xffff88810271b1a0 of 8 bytes by interrupt on cpu 0:
virtnet_receive drivers/net/virtio_net.c:2102 [inline]
virtnet_poll+0x6c8/0xb40 drivers/net/virtio_net.c:2148
__napi_poll+0x60/0x3b0 net/core/dev.c:6527
napi_poll net/core/dev.c:6594 [inline]
net_rx_action+0x32b/0x750 net/core/dev.c:6727
__do_softirq+0xc1/0x265 kernel/softirq.c:553
invoke_softirq kernel/softirq.c:427 [inline]
__irq_exit_rcu kernel/softirq.c:632 [inline]
irq_exit_rcu+0x3b/0x90 kernel/softirq.c:644
common_interrupt+0x7f/0x90 arch/x86/kernel/irq.c:247
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636
__sanitizer_cov_trace_const_cmp8+0x0/0x80 kernel/kcov.c:306
jbd2_write_access_granted fs/jbd2/transaction.c:1174 [inline]
jbd2_journal_get_write_access+0x94/0x1c0 fs/jbd2/transaction.c:1239
__ext4_journal_get_write_access+0x154/0x3f0 fs/ext4/ext4_jbd2.c:241
ext4_reserve_inode_write+0x14e/0x200 fs/ext4/inode.c:5745
__ext4_mark_inode_dirty+0x8e/0x440 fs/ext4/inode.c:5919
ext4_evict_inode+0xaf0/0xdc0 fs/ext4/inode.c:299
evict+0x1aa/0x410 fs/inode.c:664
iput_final fs/inode.c:1775 [inline]
iput+0x42c/0x5b0 fs/inode.c:1801
do_unlinkat+0x2b9/0x4f0 fs/namei.c:4405
__do_sys_unlink fs/namei.c:4446 [inline]
__se_sys_unlink fs/namei.c:4444 [inline]
__x64_sys_unlink+0x30/0x40 fs/namei.c:4444
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff88810271b1a0 of 8 bytes by task 2814 on cpu 1:
virtnet_stats+0x1b3/0x340 drivers/net/virtio_net.c:2564
dev_get_stats+0x6d/0x860 net/core/dev.c:10511
rtnl_fill_stats+0x45/0x320 net/core/rtnetlink.c:1261
rtnl_fill_ifinfo+0xd0e/0x1120 net/core/rtnetlink.c:1867
rtnl_dump_ifinfo+0x7f9/0xc20 net/core/rtnetlink.c:2240
netlink_dump+0x390/0x720 net/netlink/af_netlink.c:2266
netlink_recvmsg+0x425/0x780 net/netlink/af_netlink.c:1992
sock_recvmsg_nosec net/socket.c:1027 [inline]
sock_recvmsg net/socket.c:1049 [inline]
____sys_recvmsg+0x156/0x310 net/socket.c:2760
___sys_recvmsg net/socket.c:2802 [inline]
__sys_recvmsg+0x1ea/0x270 net/socket.c:2832
__do_sys_recvmsg net/socket.c:2842 [inline]
__se_sys_recvmsg net/socket.c:2839 [inline]
__x64_sys_recvmsg+0x46/0x50 net/socket.c:2839
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x000000000045c334 -> 0x000000000045c376

Fixes: 3fa2a1df9094 ("virtio-net: per cpu 64 bit stats (v2)")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 39d591b0 08-Oct-2023 Heng Qi <hengqi@linux.alibaba.com>

virtio-net: fix the vq coalescing setting for vq resize

[ Upstream commit f61fe5f081cf40de08d0a4c89659baf23c900f0c ]

According to the definition of virtqueue coalescing spec[1]:

Upon disabling a

virtio-net: fix the vq coalescing setting for vq resize

[ Upstream commit f61fe5f081cf40de08d0a4c89659baf23c900f0c ]

According to the definition of virtqueue coalescing spec[1]:

Upon disabling and re-enabling a transmit virtqueue, the device MUST set
the coalescing parameters of the virtqueue to those configured through the
VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not set
any TX coalescing parameters, to 0.

Upon disabling and re-enabling a receive virtqueue, the device MUST set
the coalescing parameters of the virtqueue to those configured through the
VIRTIO_NET_CTRL_NOTF_COAL_RX_SET command, or, if the driver did not set
any RX coalescing parameters, to 0.

We need to add this setting for vq resize (ethtool -G) where vq_reset happens.

[1] https://lists.oasis-open.org/archives/virtio-dev/202303/msg00415.html

Fixes: 394bd87764b6 ("virtio_net: support per queue interrupt coalesce command")
Cc: Gavin Li <gavinl@nvidia.com>
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 7c1453ab 08-Oct-2023 Heng Qi <hengqi@linux.alibaba.com>

virtio-net: fix per queue coalescing parameter setting

[ Upstream commit bfb2b3609162135625bf96acf5118051cd0d082e ]

When the user sets a non-zero coalescing parameter to 0 for a specific
virtqueue,

virtio-net: fix per queue coalescing parameter setting

[ Upstream commit bfb2b3609162135625bf96acf5118051cd0d082e ]

When the user sets a non-zero coalescing parameter to 0 for a specific
virtqueue, it does not work as expected, so let's fix this.

Fixes: 394bd87764b6 ("virtio_net: support per queue interrupt coalesce command")
Reported-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
Cc: Gavin Li <gavinl@nvidia.com>
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 829cce76 08-Oct-2023 Heng Qi <hengqi@linux.alibaba.com>

virtio-net: consistently save parameters for per-queue

[ Upstream commit e9420838ab4ffb82850095549e94dcee3f7fe0cb ]

When using .set_coalesce interface to set all queue coalescing
parameters, we nee

virtio-net: consistently save parameters for per-queue

[ Upstream commit e9420838ab4ffb82850095549e94dcee3f7fe0cb ]

When using .set_coalesce interface to set all queue coalescing
parameters, we need to update both per-queue and global save values.

Fixes: 394bd87764b6 ("virtio_net: support per queue interrupt coalesce command")
Cc: Gavin Li <gavinl@nvidia.com>
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 84a056f7 08-Oct-2023 Heng Qi <hengqi@linux.alibaba.com>

virtio-net: fix mismatch of getting tx-frames

[ Upstream commit 134674c1877be5e35e35802517c67a9ecce21153 ]

Since virtio-net allows switching napi_tx for per txq, we have to
get the specific txq's r

virtio-net: fix mismatch of getting tx-frames

[ Upstream commit 134674c1877be5e35e35802517c67a9ecce21153 ]

Since virtio-net allows switching napi_tx for per txq, we have to
get the specific txq's result now.

Fixes: 394bd87764b6 ("virtio_net: support per queue interrupt coalesce command")
Cc: Gavin Li <gavinl@nvidia.com>
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 5720c43d 27-Sep-2023 Xuan Zhuo <xuanzhuo@linux.alibaba.com>

virtio_net: fix the missing of the dma cpu sync

Commit 295525e29a5b ("virtio_net: merge dma operations when filling
mergeable buffers") unmaps the buffer with DMA_ATTR_SKIP_CPU_SYNC when
the dma->re

virtio_net: fix the missing of the dma cpu sync

Commit 295525e29a5b ("virtio_net: merge dma operations when filling
mergeable buffers") unmaps the buffer with DMA_ATTR_SKIP_CPU_SYNC when
the dma->ref is zero. We do that with DMA_ATTR_SKIP_CPU_SYNC, because we
do not want to do the sync for the entire page_frag. But that misses the
sync for the current area.

This patch does cpu sync regardless of whether the ref is zero or not.

Fixes: 295525e29a5b ("virtio_net: merge dma operations when filling mergeable buffers")
Reported-by: Michael Roth <michael.roth@amd.com>
Closes: http://lore.kernel.org/all/20230926130451.axgodaa6tvwqs3ut@amd.com
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# 295525e2 10-Aug-2023 Xuan Zhuo <xuanzhuo@linux.alibaba.com>

virtio_net: merge dma operations when filling mergeable buffers

Currently, the virtio core will perform a dma operation for each
buffer. Although, the same page may be operated multiple times.

This

virtio_net: merge dma operations when filling mergeable buffers

Currently, the virtio core will perform a dma operation for each
buffer. Although, the same page may be operated multiple times.

This patch, the driver does the dma operation and manages the dma
address based the feature premapped of virtio core.

This way, we can perform only one dma operation for the pages of the
alloc frag. This is beneficial for the iommu device.

kernel command line: intel_iommu=on iommu.passthrough=0

| strict=0 | strict=1
Before | 775496pps | 428614pps
After | 1109316pps | 742853pps

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-13-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


# dae64749 21-Aug-2023 Feng Liu <feliu@nvidia.com>

virtio_net: Introduce skb_vnet_common_hdr to avoid typecasting

The virtio_net driver currently deals with different versions and types
of virtio net headers, such as virtio_net_hdr_mrg_rxbuf,
virtio

virtio_net: Introduce skb_vnet_common_hdr to avoid typecasting

The virtio_net driver currently deals with different versions and types
of virtio net headers, such as virtio_net_hdr_mrg_rxbuf,
virtio_net_hdr_v1_hash, etc. Due to these variations, the code relies
on multiple type casts to convert memory between different structures,
potentially leading to bugs when there are changes in these structures.

Introduces the "struct skb_vnet_common_hdr" as a unifying header
structure using a union. With this approach, various virtio net header
structures can be converted by accessing different members of this
structure, thus eliminating the need for type casting and reducing the
risk of potential bugs.

For example following code:
static struct sk_buff *page_to_skb(struct virtnet_info *vi,
struct receive_queue *rq,
struct page *page, unsigned int offset,
unsigned int len, unsigned int truesize,
unsigned int headroom)
{
[...]
struct virtio_net_hdr_mrg_rxbuf *hdr;
[...]
hdr_len = vi->hdr_len;
[...]
ok:
hdr = skb_vnet_hdr(skb);
memcpy(hdr, hdr_p, hdr_len);
[...]
}

When VIRTIO_NET_F_HASH_REPORT feature is enabled, hdr_len = 20
But the sizeof(*hdr) is 12,
memcpy(hdr, hdr_p, hdr_len); will copy 20 bytes to the hdr,
which make a potential risk of bug. And this risk can be avoided by
introducing struct skb_vnet_common_hdr.

Change log
v1->v2
feedback from Willem de Bruijn <willemdebruijn.kernel@gmail.com>
feedback from Simon Horman <horms@kernel.org>
1. change to use net-next tree.
2. move skb_vnet_common_hdr inside kernel file instead of the UAPI header.

v2->v3
feedback from Willem de Bruijn <willemdebruijn.kernel@gmail.com>
1. fix typo in commit message.
2. add original struct virtio_net_hdr into union
3. remove virtio_net_hdr_mrg_rxbuf variable in receive_buf;

Signed-off-by: Feng Liu <feliu@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 51b81317 09-Aug-2023 Jason Wang <jasowang@redhat.com>

virtio-net: set queues after driver_ok

Commit 25266128fe16 ("virtio-net: fix race between set queues and
probe") tries to fix the race between set queues and probe by calling
_virtnet_set_queues() b

virtio-net: set queues after driver_ok

Commit 25266128fe16 ("virtio-net: fix race between set queues and
probe") tries to fix the race between set queues and probe by calling
_virtnet_set_queues() before DRIVER_OK is set. This violates virtio
spec. Fixing this by setting queues after virtio_device_ready().

Note that rtnl needs to be held for userspace requests to change the
number of queues. So we are serialized in this way.

Fixes: 25266128fe16 ("virtio-net: fix race between set queues and probe")
Reported-by: Dragos Tatulea <dtatulea@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 2c507ce9 10-Aug-2023 Hawkins Jiawei <yin31149@gmail.com>

virtio-net: Zero max_tx_vq field for VIRTIO_NET_CTRL_MQ_HASH_CONFIG case

Kernel uses `struct virtio_net_ctrl_rss` to save command-specific-data
for both the VIRTIO_NET_CTRL_MQ_HASH_CONFIG and
VIRTIO

virtio-net: Zero max_tx_vq field for VIRTIO_NET_CTRL_MQ_HASH_CONFIG case

Kernel uses `struct virtio_net_ctrl_rss` to save command-specific-data
for both the VIRTIO_NET_CTRL_MQ_HASH_CONFIG and
VIRTIO_NET_CTRL_MQ_RSS_CONFIG commands.

According to the VirtIO standard, "Field reserved MUST contain zeroes.
It is defined to make the structure to match the layout of
virtio_net_rss_config structure, defined in 5.1.6.5.7.".

Yet for the VIRTIO_NET_CTRL_MQ_HASH_CONFIG command case, the `max_tx_vq`
field in struct virtio_net_ctrl_rss, which corresponds to the
`reserved` field in struct virtio_net_hash_config, is not zeroed,
thereby violating the VirtIO standard.

This patch solves this problem by zeroing this field in
virtnet_init_default_rss().

Cc: Andrew Melnychenko <andrew@daynix.com>
Cc: stable@vger.kernel.org
Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230810110405.25558-1-yin31149@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>

show more ...


# 49e47a5b 02-Aug-2023 Jakub Kicinski <kuba@kernel.org>

net: move struct netdev_rx_queue out of netdevice.h

struct netdev_rx_queue is touched in only a few places
and having it defined in netdevice.h brings in the dependency
on xdp.h, because struct xdp_

net: move struct netdev_rx_queue out of netdevice.h

struct netdev_rx_queue is touched in only a few places
and having it defined in netdevice.h brings in the dependency
on xdp.h, because struct xdp_rxq_info gets embedded in
struct netdev_rx_queue.

In prep for removal of xdp.h from netdevice.h move all
the netdev_rx_queue stuff to a new header.

We could technically break the new header up to avoid
the sysfs.h include but it's so rarely included it
doesn't seem to be worth it at this point.

Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20230803010230.1755386-3-kuba@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

show more ...


# 8af3bf66 31-Jul-2023 Gavin Li <gavinl@nvidia.com>

virtio_net: enable per queue interrupt coalesce feature

Enable per queue interrupt coalesce feature bit in driver and validate its
dependency with control queue.

Signed-off-by: Gavin Li <gavinl@nvi

virtio_net: enable per queue interrupt coalesce feature

Enable per queue interrupt coalesce feature bit in driver and validate its
dependency with control queue.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20230731070656.96411-4-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# 394bd877 31-Jul-2023 Gavin Li <gavinl@nvidia.com>

virtio_net: support per queue interrupt coalesce command

Add interrupt_coalesce config in send_queue and receive_queue to cache user
config.

Send per virtqueue interrupt moderation config to underl

virtio_net: support per queue interrupt coalesce command

Add interrupt_coalesce config in send_queue and receive_queue to cache user
config.

Send per virtqueue interrupt moderation config to underlying device in
order to have more efficient interrupt moderation and cpu utilization of
guest VM.

Additionally, address all the VQs when updating the global configuration,
as now the individual VQs configuration can diverge from the global
configuration.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20230731070656.96411-3-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# 308d7982 31-Jul-2023 Gavin Li <gavinl@nvidia.com>

virtio_net: extract interrupt coalescing settings to a structure

Extract interrupt coalescing settings to a structure so that it could be
reused in other data structures.

Signed-off-by: Gavin Li <g

virtio_net: extract interrupt coalescing settings to a structure

Extract interrupt coalescing settings to a structure so that it could be
reused in other data structures.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230731070656.96411-2-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# 25266128 25-Jul-2023 Jason Wang <jasowang@redhat.com>

virtio-net: fix race between set queues and probe

A race were found where set_channels could be called after registering
but before virtnet_set_queues() in virtnet_probe(). Fixing this by
moving the

virtio-net: fix race between set queues and probe

A race were found where set_channels could be called after registering
but before virtnet_set_queues() in virtnet_probe(). Fixing this by
moving the virtnet_set_queues() before netdevice registering. While at
it, use _virtnet_set_queues() to avoid holding rtnl as the device is
not even registered at that time.

Cc: stable@vger.kernel.org
Fixes: a220871be66f ("virtio-net: correctly enable multiqueue")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230725072049.617289-1-jasowang@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# accc1bf2 05-Jun-2023 Brett Creeley <brett.creeley@amd.com>

virtio_net: use control_buf for coalesce params

Commit 699b045a8e43 ("net: virtio_net: notifications coalescing
support") added coalescing command support for virtio_net. However,
the coalesce comma

virtio_net: use control_buf for coalesce params

Commit 699b045a8e43 ("net: virtio_net: notifications coalescing
support") added coalescing command support for virtio_net. However,
the coalesce commands are using buffers on the stack, which is causing
the device to see DMA errors. There should also be a complaint from
check_for_stack() in debug_dma_map_xyz(). Fix this by adding and using
coalesce params from the control_buf struct, which aligns with other
commands.

Cc: stable@vger.kernel.org
Fixes: 699b045a8e43 ("net: virtio_net: notifications coalescing support")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20230605195925.51625-1-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# 5306623a 12-May-2023 Feng Liu <feliu@nvidia.com>

virtio_net: Fix error unwinding of XDP initialization

When initializing XDP in virtnet_open(), some rq xdp initialization
may hit an error causing net device open failed. However, previous
rqs have

virtio_net: Fix error unwinding of XDP initialization

When initializing XDP in virtnet_open(), some rq xdp initialization
may hit an error causing net device open failed. However, previous
rqs have already initialized XDP and enabled NAPI, which is not the
expected behavior. Need to roll back the previous rq initialization
to avoid leaks in error unwinding of init code.

Also extract helper functions of disable and enable queue pairs.
Use newly introduced disable helper function in error unwinding and
virtnet_close. Use enable helper function in virtnet_open.

Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info")
Signed-off-by: Feng Liu <feliu@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: William Tu <witu@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# b51f4113 10-May-2023 Yunsheng Lin <linyunsheng@huawei.com>

net: introduce and use skb_frag_fill_page_desc()

Most users use __skb_frag_set_page()/skb_frag_off_set()/
skb_frag_size_set() to fill the page desc for a skb frag.

Introduce skb_frag_fill_page_desc

net: introduce and use skb_frag_fill_page_desc()

Most users use __skb_frag_set_page()/skb_frag_off_set()/
skb_frag_size_set() to fill the page desc for a skb frag.

Introduce skb_frag_fill_page_desc() to do that.

net/bpf/test_run.c does not call skb_frag_off_set() to
set the offset, "copy_from_user(page_address(page), ...)"
and 'shinfo' being part of the 'data' kzalloced in
bpf_test_init() suggest that it is assuming offset to be
initialized as zero, so call skb_frag_fill_page_desc()
with offset being zero for this case.

Also, skb_frag_set_page() is not used anymore, so remove
it.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


12345678910>>...24