abb0dcf9 | 31-Aug-2023 |
Dragos Tatulea <dtatulea@nvidia.com> |
vdpa/mlx5: Fix firmware error on creation of 1k VQs
A firmware error is triggered when configuring a 9k MTU on the PF after switching to switchdev mode and then using a vdpa device with larger (1k)
vdpa/mlx5: Fix firmware error on creation of 1k VQs
A firmware error is triggered when configuring a 9k MTU on the PF after switching to switchdev mode and then using a vdpa device with larger (1k) rings: mlx5_cmd_out_err: CREATE_GENERAL_OBJECT(0xa00) op_mod(0xd) failed, status bad resource(0x5), syndrome (0xf6db90), err(-22)
This is due to the fact that the hw VQ size parameters are computed based on the umem_1/2/3_buffer_param_a/b capabilities and all device capabilities are read only when the driver is moved to switchdev mode.
The problematic configuration flow looks like this: 1) Create VF 2) Unbind VF 3) Switch PF to switchdev mode. 4) Bind VF 5) Set PF MTU to 9k 6) create vDPA device 7) Start VM with vDPA device and 1K queue size
Note that setting the MTU before step 3) doesn't trigger this issue.
This patch reads the forementioned umem parameters at the latest point possible before the VQs of the device are created.
v2: - Allocate output with kmalloc to reduce stack frame size. - Removed stable from cc.
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20230831155702.1080754-1-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
show more ...
|
810b0cc1 | 03-Aug-2023 |
Dragos Tatulea <dtatulea@nvidia.com> |
vdpa/mlx5: Fix crash on shutdown for when no ndev exists
The ndev was accessed on shutdown without a check if it actually exists. This triggered the crash pasted below.
Instead of doing the ndev ch
vdpa/mlx5: Fix crash on shutdown for when no ndev exists
The ndev was accessed on shutdown without a check if it actually exists. This triggered the crash pasted below.
Instead of doing the ndev check, delete the shutdown handler altogether. The irqs will be released at the parent VF level (mlx5_core).
BUG: kernel NULL pointer dereference, address: 0000000000000300 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 6.5.0-rc2_for_upstream_min_debug_2023_07_17_15_05 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx5v_shutdown+0xe/0x50 [mlx5_vdpa] RSP: 0018:ffff8881003bfdc0 EFLAGS: 00010286 RAX: ffff888103befba0 RBX: ffff888109d28008 RCX: 0000000000000017 RDX: 0000000000000001 RSI: 0000000000000212 RDI: ffff888109d28000 RBP: 0000000000000000 R08: 0000000d3a3a3882 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: ffff888109d28000 R13: ffff888109d28080 R14: 00000000fee1dead R15: 0000000000000000 FS: 00007f4969e0be40(0000) GS:ffff88852c800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000300 CR3: 00000001051cd006 CR4: 0000000000370eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __die+0x20/0x60 ? page_fault_oops+0x14c/0x3c0 ? exc_page_fault+0x75/0x140 ? asm_exc_page_fault+0x22/0x30 ? mlx5v_shutdown+0xe/0x50 [mlx5_vdpa] device_shutdown+0x13e/0x1e0 kernel_restart+0x36/0x90 __do_sys_reboot+0x141/0x210 ? vfs_writev+0xcd/0x140 ? handle_mm_fault+0x161/0x260 ? do_writev+0x6b/0x110 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f496990fb56 RSP: 002b:00007fffc7bdde88 EFLAGS: 00000206 ORIG_RAX: 00000000000000a9 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f496990fb56 RDX: 0000000001234567 RSI: 0000000028121969 RDI: fffffffffee1dead RBP: 00007fffc7bde1d0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000 R13: 00007fffc7bddf10 R14: 0000000000000000 R15: 00007fffc7bde2b8 </TASK> CR2: 0000000000000300 ---[ end trace 0000000000000000 ]---
Fixes: bc9a2b3e686e ("vdpa/mlx5: Support interrupt bypassing") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20230803152648.199297-1-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
ad03a0f4 | 02-Aug-2023 |
Eugenio Pérez <eperezma@redhat.com> |
vdpa/mlx5: Delete control vq iotlb in destroy_mr only when necessary
mlx5_vdpa_destroy_mr can be called from .set_map with data ASID after the control virtqueue ASID iotlb has been populated. The co
vdpa/mlx5: Delete control vq iotlb in destroy_mr only when necessary
mlx5_vdpa_destroy_mr can be called from .set_map with data ASID after the control virtqueue ASID iotlb has been populated. The control vq iotlb must not be cleared, since it will not be populated again.
So call the ASID aware destroy function which makes sure that the right vq resource is destroyed.
Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data") Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Message-Id: <20230802171231.11001-5-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
show more ...
|
e9d67e59 | 21-Mar-2023 |
Eli Cohen <elic@nvidia.com> |
vdpa/mlx5: Extend driver support for new features
Extend the possible list for features that can be supported by firmware. Note that different versions of firmware may or may not support these featu
vdpa/mlx5: Extend driver support for new features
Extend the possible list for features that can be supported by firmware. Note that different versions of firmware may or may not support these features. The driver is made aware of them by querying the firmware.
While doing this, improve the code so we use enum names instead of hard coded numerical values.
The new features supported by the driver are the following:
VIRTIO_NET_F_MRG_RXBUF VIRTIO_NET_F_HOST_ECN VIRTIO_NET_F_GUEST_ECN VIRTIO_NET_F_GUEST_TSO6 VIRTIO_NET_F_GUEST_TSO4
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20230321112809.221432-3-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez Martin <eperezma@redhat.com>
show more ...
|
deeacf35 | 06-Feb-2023 |
Si-Wei Liu <si-wei.liu@oracle.com> |
vdpa/mlx5: support device features provisioning
This patch implements features provisioning for mlx5_vdpa.
1) Validate the provisioned features are a subset of the parent features. 2) Clearing
vdpa/mlx5: support device features provisioning
This patch implements features provisioning for mlx5_vdpa.
1) Validate the provisioned features are a subset of the parent features. 2) Clearing features that are not wanted by userspace.
For example:
# vdpa mgmtdev show pci/0000:41:04.2: supported_classes net max_supported_vqs 65 dev_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ CTRL_VLAN MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM
1) Provision vDPA device with all features derived from the parent
# vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 # vdpa dev config show vdpa1: mac e4:11:c6:d3:45:f0 link up link_announce false max_vq_pairs 1 mtu 1500 negotiated_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ CTRL_VLAN MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM
2) Provision vDPA device with a subset of parent features
# vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 device_features 0x300020000 # vdpa dev config show vdpa1: negotiated_features CTRL_VQ VERSION_1 ACCESS_PLATFORM
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Message-Id: <1675725124-7375-7-git-send-email-si-wei.liu@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
0a599750 | 14-Nov-2022 |
Eli Cohen <elic@nvidia.com> |
vdpa/mlx5: Add RX counters to debugfs
For each interface, either VLAN tagged or untagged, add two hardware counters: one for unicast and another for multicast. The counters count RX packets and byte
vdpa/mlx5: Add RX counters to debugfs
For each interface, either VLAN tagged or untagged, add two hardware counters: one for unicast and another for multicast. The counters count RX packets and bytes and can be read through debugfs:
$ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/untagged/mcast/packets $ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/untagged/ucast/bytes
This feature is controlled via the config option MLX5_VDPA_STEERING_DEBUG. It is off by default as it may have some impact on performance.
includes a fixup By Yang Yingliang <yangyingliang@huawei.com>:
vdpa/mlx5: fix check wrong pointer in mlx5_vdpa_add_mac_vlan_rules()
The local variable 'rule' is not used anymore, fix return value check after calling mlx5_add_flow_rules().
Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20221114131759.57883-9-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Message-Id: <20230104074418.1737510-1-yangyingliang@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com>
show more ...
|
29422100 | 14-Nov-2022 |
Eli Cohen <elic@nvidia.com> |
vdpa/mlx5: Add debugfs subtree
Add debugfs subtree and expose flow table ID and TIR number. This information can be used by external tools to do extended troubleshooting.
The information can be ret
vdpa/mlx5: Add debugfs subtree
Add debugfs subtree and expose flow table ID and TIR number. This information can be used by external tools to do extended troubleshooting.
The information can be retrieved like so: $ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/table_id $ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/tirn
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20221114131759.57883-8-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
38fc462f | 14-Nov-2022 |
Eli Cohen <elic@nvidia.com> |
vdpa/mlx5: Avoid overwriting CVQ iotlb
When qemu uses different address spaces for data and control virtqueues, the current code would overwrite the control virtqueue iotlb through the dup_iotlb cal
vdpa/mlx5: Avoid overwriting CVQ iotlb
When qemu uses different address spaces for data and control virtqueues, the current code would overwrite the control virtqueue iotlb through the dup_iotlb call. Fix this by referring to the address space identifier and the group to asid mapping to determine which mapping needs to be updated. We also move the address space logic from mlx5 net to core directory.
Reported-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20221114131759.57883-6-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>
show more ...
|
0dbc1b4a | 14-Nov-2022 |
Eli Cohen <elic@nvidia.com> |
vdpa/mlx5: Avoid using reslock in event_handler
event_handler runs under atomic context and may not acquire reslock. We can still guarantee that the handler won't be called after suspend by clearing
vdpa/mlx5: Avoid using reslock in event_handler
event_handler runs under atomic context and may not acquire reslock. We can still guarantee that the handler won't be called after suspend by clearing nb_registered, unregistering the handler and flushing the workqueue.
Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20221114131759.57883-5-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
1ab53760 | 14-Nov-2022 |
Eli Cohen <elic@nvidia.com> |
vdpa/mlx5: Fix wrong mac address deletion
Delete the old MAC from the table and not the new one which is not there yet.
Fixes: baf2ad3f6a98 ("vdpa/mlx5: Add RX MAC VLAN filter support") Acked-by: J
vdpa/mlx5: Fix wrong mac address deletion
Delete the old MAC from the table and not the new one which is not there yet.
Fixes: baf2ad3f6a98 ("vdpa/mlx5: Add RX MAC VLAN filter support") Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20221114131759.57883-4-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|