History log of /openbmc/qemu/hw/virtio/vhost-user.c (Results 1 – 25 of 273)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v9.2.0, v9.1.2
# 9eb9350c 05-Nov-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, fixes, cleanups

CXL now can use Generic Port Affinity Structures.
CXL now allows c

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, fixes, cleanups

CXL now can use Generic Port Affinity Structures.
CXL now allows control of link speed and width
vhost-user-blk now supports live resize, by means of
a new device-sync-config command
amd iommu now supports interrupt remapping
pcie devices now report extended tag field support
intel_iommu dropped support for Transient Mapping, to match VTD spec
arch agnostic ACPI infrastructure for vCPU Hotplug

Fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmcpNqUPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp/2oH/0qO33prhDa48J5mqT9NuJzzYwp5QHKF9Zjv
# fDAplMUEmfxZIEgJchcyDWPYTGX2geT4pCFhRWioZMIR/0JyzrFgSwsk1kL88cMh
# 46gzhNVD6ybyPJ7O0Zq3GLy5jo7rlw/n+fFxKAuRCzcbK/fmH8gNC+RwW1IP64Na
# HDczYilHUhnO7yKZFQzQNQVbK4BckrG1bu0Fcx0EMUQBf4V6x7GLOrT+3hkKYcr6
# +DG5DmUmv20or/FXnu2Ye+MzR8Ebx6JVK3A3sXEE4Ns2CCzK9QLzeeyc2aU13jWN
# OpZ6WcKF8HqYprIwnSsMTxhPcq0/c7TvrGrazVwna5RUBMyjjvc=
# =zSX4
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Nov 2024 21:03:33 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (65 commits)
intel_iommu: Add missed reserved bit check for IEC descriptor
intel_iommu: Add missed sanity check for 256-bit invalidation queue
intel_iommu: Send IQE event when setting reserved bit in IQT_TAIL
hw/acpi: Update GED with vCPU Hotplug VMSD for migration
tests/qtest/bios-tables-test: Update DSDT golden masters for x86/{pc,q35}
hw/acpi: Update ACPI `_STA` method with QOM vCPU ACPI Hotplug states
qtest: allow ACPI DSDT Table changes
hw/acpi: Make CPUs ACPI `presence` conditional during vCPU hot-unplug
hw/pci: Add parenthesis to PCI_BUILD_BDF macro
hw/cxl: Ensure there is enough data to read the input header in cmd_get_physical_port_state()
hw/cxl: Ensure there is enough data for the header in cmd_ccls_set_lsa()
hw/cxl: Check that writes do not go beyond end of target attributes
hw/cxl: Ensuring enough data to read parameters in cmd_tunnel_management_cmd()
hw/cxl: Avoid accesses beyond the end of cel_log.
hw/cxl: Check the length of data requested fits in get_log()
hw/cxl: Check enough data in cmd_firmware_update_transfer()
hw/cxl: Check input length is large enough in cmd_events_clear_records()
hw/cxl: Check input includes at least the header in cmd_features_set_feature()
hw/cxl: Check size of input data to dynamic capacity mailbox commands
hw/cxl/cxl-mailbox-util: Fix output buffer index update when retrieving DC extents
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# eea5aeef 22-Oct-2024 Albert Esteve <aesteve@redhat.com>

vhost-user: fix shared object return values

VHOST_USER_BACKEND_SHARED_OBJECT_ADD and
VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE state
in the spec that they return 0 for successful
operations, non-zero

vhost-user: fix shared object return values

VHOST_USER_BACKEND_SHARED_OBJECT_ADD and
VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE state
in the spec that they return 0 for successful
operations, non-zero otherwise. However,
implementation relies on the return types
of the virtio-dmabuf library, with opposite
semantics (true if everything is correct,
false otherwise). Therefore, current
implementation violates the specification.

Revert the logic so that the implementation
of the vhost-user handling methods matches
the specification.

Fixes: 043e127a126bb3ceb5fc753deee27d261fd0c5ce
Fixes: 160947666276c5b7f6bca4d746bcac2966635d79
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20241022124615.585596-1-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


Revision tags: v9.1.1
# 963b0276 11-Oct-2024 yaozhenguo <yaozhenguo1@gmail.com>

virtio/vhost-user: fix qemu abort when hotunplug vhost-user-net device

During the hot-unplugging of vhost-user-net type network cards,
the vhost_user_cleanup function may add the same rcu node to
th

virtio/vhost-user: fix qemu abort when hotunplug vhost-user-net device

During the hot-unplugging of vhost-user-net type network cards,
the vhost_user_cleanup function may add the same rcu node to
the rcu linked list. The function call in this case is as follows:

vhost_user_cleanup
->vhost_user_host_notifier_remove
->call_rcu(n, vhost_user_host_notifier_free, rcu);
->g_free_rcu(n, rcu);

When this happens, QEMU will abort in try_dequeue:

if (head == &dummy && qatomic_mb_read(&tail) == &dummy.next) {
abort();
}

backtrace is as follows:
0 __pthread_kill_implementation () at /usr/lib64/libc.so.6
1 raise () at /usr/lib64/libc.so.6
2 abort () at /usr/lib64/libc.so.6
3 try_dequeue () at ../util/rcu.c:235
4 call_rcu_thread (0) at ../util/rcu.c:288
5 qemu_thread_start (0) at ../util/qemu-thread-posix.c:541
6 start_thread () at /usr/lib64/libc.so.6
7 clone3 () at /usr/lib64/libc.so.6

The reason for the abort is that adding two identical nodes to
the rcu linked list will cause the rcu linked list to become a ring,
but when the dummy node is added after the two identical nodes,
the ring is opened. But only one node is added to list with
rcu_call_count added twice. This will cause rcu try_dequeue abort.

This happens when n->addr != 0. In some scenarios, this does happen.
For example, this situation will occur when using a 32-queue DPU
vhost-user-net type network card for hot-unplug testing, because
VhostUserHostNotifier->addr will be cleared during the processing of
VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG. However,it is asynchronous,
so we cannot guarantee that VhostUserHostNotifier->addr is zero in
vhost_user_cleanup. Therefore, it is necessary to merge g_free_rcu
and vhost_user_host_notifier_free into one rcu node.

Fixes: 503e355465 ("virtio/vhost-user: dynamically assign VhostUserHostNotifiers")
Signed-off-by: yaozhenguo <yaozhenguo@jd.com>
Message-Id: <20241011102913.45582-1-yaozhenguo@jd.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


Revision tags: v9.1.0
# 1406b7fc 03-Jul-2024 Richard Henderson <richard.henderson@linaro.org>

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio: features,fixes

A bunch of improvements:
- vhost dirty log is now only scanned once, not once per de

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio: features,fixes

A bunch of improvements:
- vhost dirty log is now only scanned once, not once per device
- virtio and vhost now support VIRTIO_F_NOTIFICATION_DATA
- cxl gained DCD emulation support
- pvpanic gained shutdown support
- beginning of patchset for Generic Port Affinity Structure
- s3 support
- friendlier error messages when boot fails on some illegal configs
- for vhost-user, VHOST_USER_SET_LOG_BASE is now only sent once
- part of vhost-user support for any POSIX system -
not yet enabled due to qtest failures
- sr-iov VF setup code has been reworked significantly
- new tests, particularly for risc-v ACPI
- bugfixes

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmaF068PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp+DMIAMC//mBXIZlPprfhb5cuZklxYi31Acgu5TUr
# njqjCkN+mFhXXZuc3B67xmrQ066IEPtsbzCjSnzuU41YK4tjvO1g+LgYJBv41G16
# va2k8vFM5pdvRA+UC9li1CCIPxiEcszxOdzZemj3szWLVLLUmwsc5OZLWWeFA5m8
# vXrrT9miODUz3z8/Xn/TVpxnmD6glKYIRK/IJRzzC4Qqqwb5H3ji/BJV27cDUtdC
# w6ns5RYIj5j4uAiG8wQNDggA1bMsTxFxThRDUwxlxaIwAcexrf1oRnxGRePA7PVG
# BXrt5yodrZYR2sR6svmOOIF3wPMUDKdlAItTcEgYyxaVo5rAdpc=
# =p9h4
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 03 Jul 2024 03:41:51 PM PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (85 commits)
hw/pci: Replace -1 with UINT32_MAX for romsize
pcie_sriov: Register VFs after migration
pcie_sriov: Remove num_vfs from PCIESriovPF
pcie_sriov: Release VFs failed to realize
pcie_sriov: Reuse SR-IOV VF device instances
pcie_sriov: Ensure VF function number does not overflow
pcie_sriov: Do not manually unrealize
hw/ppc/spapr_pci: Do not reject VFs created after a PF
hw/ppc/spapr_pci: Do not create DT for disabled PCI device
hw/pci: Rename has_power to enabled
virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged
virtio: remove virtio_tswap16s() call in vring_packed_event_read()
hw/cxl/events: Mark cxl-add-dynamic-capacity and cxl-release-dynamic-capcity unstable
hw/cxl/events: Improve QMP interfaces and documentation for add/release dynamic capacity.
tests/data/acpi/rebuild-expected-aml.sh: Add RISC-V
pc-bios/meson.build: Add support for RISC-V in unpack_edk2_blobs
meson.build: Add RISC-V to the edk2-target list
tests/data/acpi/virt: Move ARM64 ACPI tables under aarch64/${machine} path
tests/data/acpi: Move x86 ACPI tables under x86/${machine} path
tests/qtest/bios-tables-test.c: Set "arch" for x86 tests
...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

show more ...


# 7c211eb0 13-Jun-2024 BillXiang <xiangwencheng@dayudpu.com>

vhost-user: Skip unnecessary duplicated VHOST_USER_SET_LOG_BASE requests

The VHOST_USER_SET_LOG_BASE requests should be categorized into
non-vring specific messages, and should be sent only once.
If

vhost-user: Skip unnecessary duplicated VHOST_USER_SET_LOG_BASE requests

The VHOST_USER_SET_LOG_BASE requests should be categorized into
non-vring specific messages, and should be sent only once.
If send more than once, dpdk will munmap old log_addr which may has been used and cause segmentation fault.

Signed-off-by: BillXiang <xiangwencheng@dayudpu.com>
Message-Id: <20240613065150.3100-1-xiangwencheng@dayudpu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


# 6eaf0e61 15-May-2024 Li Feng <fengli@smartx.com>

vhost-user: fix lost reconnect again

When the vhost-user is reconnecting to the backend, and if the vhost-user fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will

vhost-user: fix lost reconnect again

When the vhost-user is reconnecting to the backend, and if the vhost-user fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will not be retriggered forever.

The reason is:
When the vhost-user fail at get_features, the vhost_dev_cleanup will be called
immediately.

vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.

The reconnect path is:
vhost_user_blk_event
vhost_user_async_close(.. vhost_user_blk_disconnect ..)
qemu_chr_fe_set_handlers <----- clear the notifier callback
schedule vhost_user_async_close_bh

The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
called, then the event fd callback will not be reinstalled.

We need to ensure that even if vhost_dev_init initialization fails, the event
handler still needs to be reinstalled when s->connected is false.

All vhost-user devices have this issue, including vhost-user-blk/scsi.

Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")

Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20240516025753.130171-3-fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


# 9569fe0a 15-May-2024 Li Feng <fengli@smartx.com>

Revert "vhost-user: fix lost reconnect"

This reverts commit f02a4b8e6431598612466f76aac64ab492849abf.

Since the current patch cannot completely fix the lost reconnect
problem, there is a scenario t

Revert "vhost-user: fix lost reconnect"

This reverts commit f02a4b8e6431598612466f76aac64ab492849abf.

Since the current patch cannot completely fix the lost reconnect
problem, there is a scenario that is not considered:
- When the virtio-blk driver is removed from the guest os,
s->connected has no chance to be set to false, resulting in
subsequent reconnection not being executed.

The next patch will completely fix this issue with a better approach.

Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20240516025753.130171-2-fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


# 6fc69312 13-Mar-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, cleanups, fixes

more memslots support in libvhost-user
support PCIe Gen5/Gen6 link

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, cleanups, fixes

more memslots support in libvhost-user
support PCIe Gen5/Gen6 link speeds in pcie
more traces in vdpa
network simulation devices support in vdpa
SMBIOS type 9 descriptor implementation
Bump max_cpus to 4096 vcpus in q35
aw-bits and granule options in VIRTIO-IOMMU
Support report NUMA nodes for device memory using GI in acpi
Beginning of shutdown event support in pvpanic

fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmXw0TMPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8x4H+gLMoGwaGAX7gDGPgn2Ix4j/3kO77ZJ9X9k/
# 1KqZu/9eMS1j2Ei+vZqf05w7qRjxxhwDq3ilEXF/+UFqgAehLqpRRB8j5inqvzYt
# +jv0DbL11PBp/oFjWcytm5CbiVsvq8KlqCF29VNzc162XdtcduUOWagL96y8lJfZ
# uPrOoyeR7SMH9lp3LLLHWgu+9W4nOS03RroZ6Umj40y5B7yR0Rrppz8lMw5AoQtr
# 0gMRnFhYXeiW6CXdz+Tzcr7XfvkkYDi/j7ibiNSURLBfOpZa6Y8+kJGKxz5H1K1G
# 6ZY4PBcOpQzl+NMrktPHogczgJgOK10t+1i/R3bGZYw2Qn/93Eg=
# =C0UU
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 22:03:31 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (68 commits)
docs/specs/pvpanic: document shutdown event
hw/cxl: Fix missing reserved data in CXL Device DVSEC
hmat acpi: Fix out of bounds access due to missing use of indirection
hmat acpi: Do not add Memory Proximity Domain Attributes Structure targetting non existent memory.
qemu-options.hx: Document the virtio-iommu-pci aw-bits option
hw/arm/virt: Set virtio-iommu aw-bits default value to 48
hw/i386/q35: Set virtio-iommu aw-bits default value to 39
virtio-iommu: Add an option to define the input range width
virtio-iommu: Trace domain range limits as unsigned int
qemu-options.hx: Document the virtio-iommu-pci granule option
virtio-iommu: Change the default granule to the host page size
virtio-iommu: Add a granule property
hw/i386/acpi-build: Add support for SRAT Generic Initiator structures
hw/acpi: Implement the SRAT GI affinity structure
qom: new object to associate device to NUMA node
hw/i386/pc: Inline pc_cmos_init() into pc_cmos_init_late() and remove it
hw/i386/pc: Set "normal" boot device order in pc_basic_device_init()
hw/i386/pc: Avoid one use of the current_machine global
hw/i386/pc: Remove "rtc_state" link again
Revert "hw/i386/pc: Confine system flash handling to pc_sysfw"
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
# hw/core/machine.c

show more ...


# 043e127a 19-Feb-2024 Albert Esteve <aesteve@redhat.com>

hw/virtio: check owner for removing objects

Shared objects lack spoofing protection.
For VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE messages
received by the vhost-user interface, any backend was
allowe

hw/virtio: check owner for removing objects

Shared objects lack spoofing protection.
For VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE messages
received by the vhost-user interface, any backend was
allowed to remove entries from the shared table just
by knowing the UUID. Only the owner of the entry
shall be allowed to removed their resources
from the table.

To fix that, add a check for all
*SHARED_OBJECT_REMOVE messages received.
A vhost device can only remove TYPE_VHOST_DEV
entries that are owned by them, otherwise skip
the removal, and inform the device that the entry
has not been removed in the answer.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20240219143423.272012-2-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


# 8f3f329f 12-Mar-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request

- Avihai's fix to allow vmstate iterators to not starve for VFIO
- Maksim's fix on

Merge tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request

- Avihai's fix to allow vmstate iterators to not starve for VFIO
- Maksim's fix on additional check on precopy load error
- Fabiano's fix on fdatasync() hang in mapped-ram
- Jonathan's fix on vring cached access over MMIO regions
- Cedric's cleanup patches 1-4 out of his error report series
- Yu's fix for RDMA migration (which used to be broken even for 8.2)
- Anthony's small cleanup/fix on err message
- Steve's patches on privatize migration.h
- Xiang's patchset to enable zero page detections in multifd threads

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZe9+uBIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wamaQD/SvmpMEcuRndT9LPSxzXowAGDZTBpYUfv
# 5XAbx80dS9IBAO8PJJgQJIBHBeacyLBjHP9CsdVtgw5/VW+wCsbfV4AB
# =xavb
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 11 Mar 2024 21:59:20 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706

* tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu: (34 commits)
migration/multifd: Add new migration test cases for legacy zero page checking.
migration/multifd: Enable multifd zero page checking by default.
migration/multifd: Implement ram_save_target_page_multifd to handle multifd version of MigrationOps::ram_save_target_page.
migration/multifd: Implement zero page transmission on the multifd thread.
migration/multifd: Add new migration option zero-page-detection.
migration/multifd: Allow clearing of the file_bmap from multifd
migration/multifd: Allow zero pages in file migration
migration: purge MigrationState from public interface
migration: delete unused accessors
migration: privatize colo interfaces
migration: migration_file_set_error
migration: migration_is_device
migration: migration_thread_is_self
migration: export vcpu_dirty_limit_period
migration: export migration_is_running
migration: export migration_is_active
migration: export migration_is_setup_or_active
migration: remove migration.h references
migration: export fewer options
migration: Fix format in error message
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# f853fa07 11-Mar-2024 Steve Sistare <steven.sistare@oracle.com>

migration: remove migration.h references

Remove migration.h from files that no longer need it due to
previous commits.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.ke

migration: remove migration.h references

Remove migration.h from files that no longer need it due to
previous commits.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-2-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# c0c6a0e3 28-Feb-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request

- Fabiano's fixed-ram patches (1-5 only)
- Peter's cleanups on multifd tls IOC referenc

Merge tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request

- Fabiano's fixed-ram patches (1-5 only)
- Peter's cleanups on multifd tls IOC referencing
- Steve's cpr patches for vfio (migration patches only)
- Fabiano's fix on mbps stats racing with COMPLETE state
- Fabiano's fix on return path thread hang

# -----BEGIN PGP SIGNATURE-----
#
# iIcEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZd7AbhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbg0gDyA3Vg3pIqCJ+u+hLZ+QKxY/pnu8Y5kF+E
# HK2IdslQUQD+OX4ATUnl+CGMiVX9fjs1fKx0Z0Qetq8gC1YJF13yuA0=
# =P2QF
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 28 Feb 2024 05:11:10 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706

* tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu: (25 commits)
migration: Use migrate_has_error() in close_return_path_on_source()
migration: Join the return path thread before releasing to_dst_file
migration: Fix qmp_query_migrate mbps value
migration: options incompatible with cpr
migration: update cpr-reboot description
migration: stop vm for cpr
migration: notifier error checking
migration: refactor migrate_fd_connect failures
migration: per-mode notifiers
migration: MigrationNotifyFunc
migration: remove postcopy_after_devices
migration: MigrationEvent for notifiers
migration: convert to NotifierWithReturn
migration: remove error from notifier data
notify: pass error to notifier with return
migration/multifd: Drop unnecessary helper to destroy IOC
migration/multifd: Cleanup outgoing_args in state destroy
migration/multifd: Make multifd_channel_connect() return void
migration/multifd: Drop registered_yank
migration/multifd: Cleanup TLS iochannel referencing
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# d91f33c7 22-Feb-2024 Steve Sistare <steven.sistare@oracle.com>

migration: remove error from notifier data

Remove the error object from opaque data passed to notifiers.
Use the new error parameter passed to the notifier instead.

Signed-off-by: Steve Sistare <st

migration: remove error from notifier data

Remove the error object from opaque data passed to notifiers.
Use the new error parameter passed to the notifier instead.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# be19d836 22-Feb-2024 Steve Sistare <steven.sistare@oracle.com>

notify: pass error to notifier with return

Pass an error object as the third parameter to "notifier with return"
notifiers, so clients no longer need to bundle an error object in the
opaque data. T

notify: pass error to notifier with return

Pass an error object as the third parameter to "notifier with return"
notifiers, so clients no longer need to bundle an error object in the
opaque data. The new parameter is used in a later patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-2-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# f6b615b5 07-Nov-2023 Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, fixes

virtio sound card support

vhost-user: back-end state migration

cxl:
l

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, fixes

virtio sound card support

vhost-user: back-end state migration

cxl:
line length reduction
enabling fabric management

vhost-vdpa:
shadow virtqueue hash calculation Support
shadow virtqueue RSS Support

tests:
CPU topology related smbios test cases

Fixes, cleanups all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmVKDDoPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpF08H/0Zts8uvkHbgiOEJw4JMHU6/VaCipfIYsp01
# GSfwYOyEsXJ7GIxKWaCiMnWXEm7tebNCPKf3DoUtcAojQj3vuF9XbWBKw/bfRn83
# nGO/iiwbYViSKxkwqUI+Up5YiN9o0M8gBFrY0kScPezbnYmo5u2bcADdEEq6gH68
# D0Ea8i+WmszL891ypvgCDBL2ObDk3qX3vA5Q6J2I+HKX2ofJM59BwaKwS5ghw+IG
# BmbKXUZJNjUQfN9dQ7vJuiuqdknJ2xUzwW2Vn612ffarbOZB1DZ6ruWlrHty5TjX
# 0w4IXEJPBgZYbX9oc6zvTQnbLDBJbDU89mnme0TcmNMKWmQKTtc=
# =vEv+
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 18:06:50 HKT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (63 commits)
acpi/tests/avocado/bits: enable console logging from bits VM
acpi/tests/avocado/bits: enforce 32-bit SMBIOS entry point
hw/cxl: Add tunneled command support to mailbox for switch cci.
hw/cxl: Add dummy security state get
hw/cxl/type3: Cleanup multiple CXL_TYPE3() calls in read/write functions
hw/cxl/mbox: Add Get Background Operation Status Command
hw/cxl: Add support for device sanitation
hw/cxl/mbox: Wire up interrupts for background completion
hw/cxl/mbox: Add support for background operations
hw/cxl: Implement Physical Ports status retrieval
hw/pci-bridge/cxl_downstream: Set default link width and link speed
hw/cxl/mbox: Add Physical Switch Identify command.
hw/cxl/mbox: Add Information and Status / Identify command
hw/cxl: Add a switch mailbox CCI function
hw/pci-bridge/cxl_upstream: Move defintion of device to header.
hw/cxl/mbox: Generalize the CCI command processing
hw/cxl/mbox: Pull the CCI definition out of the CXLDeviceState
hw/cxl/mbox: Split mailbox command payload into separate input and output
hw/cxl/mbox: Pull the payload out of struct cxl_cmd and make instances constant
hw/cxl: Fix a QEMU_BUILD_BUG_ON() in switch statement scope issue.
...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

show more ...


# cda83adc 16-Oct-2023 Hanna Czenczek <hreitz@redhat.com>

vhost-user: Interface for migration state transfer

Add the interface for transferring the back-end's state during migration
as defined previously in vhost-user.rst.

Reviewed-by: Stefan Hajnoczi <st

vhost-user: Interface for migration state transfer

Add the interface for transferring the back-end's state during migration
as defined previously in vhost-user.rst.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20231016134243.68248-6-hreitz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


# 3c95fd4e 26-Oct-2023 Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* target/i386: implement SHA instructions
* target/i386: check CPUID_PAE to determine 36 bit processor address space
* target

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* target/i386: implement SHA instructions
* target/i386: check CPUID_PAE to determine 36 bit processor address space
* target/i386: improve validation of AVX instructions
* require Linux 4.4 for KVM

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmU5Vi4UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNVbwf9HCx+C0MITWjQ+rEkmtiy/Cn+ZsF1
# gbaL31ahymEU3vUcKZX8Z4ycmBFw9b3yvotTVR38lE9p+sKtSaGKUGV0btpS7oBB
# y8IfnVmg5X1j4PtyDxFlLD48qg//2kVgJ6wtaDTSAkgQMOPM9UgHgQD+Ks7kOo8v
# rReL46XVPEZTWt3syX0y87mFinjK2hXGqIdsnJ1uT614BAVVIrmO6aFNNN1FlsRb
# NGRZevJTfEWjWVfWOhUiZdUGDz74sOXdshZX/teadeDJLtWaw0uytMN9qoTN33h/
# OsdR2fO7h8ZknGEc2F1fJEVh4sOfO4fGYAAJGzHP9AjUDV1IVVYELb79dg==
# =WYTo
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 26 Oct 2023 02:53:50 JST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (24 commits)
kvm: i8254: require KVM_CAP_PIT2 and KVM_CAP_PIT_STATE2
kvm: i386: require KVM_CAP_SET_IDENTITY_MAP_ADDR
kvm: i386: require KVM_CAP_ADJUST_CLOCK
kvm: i386: require KVM_CAP_MCE
kvm: i386: require KVM_CAP_SET_VCPU_EVENTS and KVM_CAP_X86_ROBUST_SINGLESTEP
kvm: i386: require KVM_CAP_XSAVE
kvm: i386: require KVM_CAP_DEBUGREGS
kvm: i386: move KVM_CAP_IRQ_ROUTING detection to kvm_arch_required_capabilities
kvm: unify listeners for PIO address space
kvm: require KVM_CAP_IOEVENTFD and KVM_CAP_IOEVENTFD_ANY_LENGTH
kvm: assume that many ioeventfds can be created
kvm: drop reference to KVM_CAP_PCI_2_3
kvm: require KVM_IRQFD for kernel irqchip
kvm: require KVM_IRQFD for kernel irqchip
kvm: require KVM_CAP_SIGNAL_MSI
kvm: require KVM_CAP_INTERNAL_ERROR_DATA
kvm: remove unnecessary stub
target/i386: check CPUID_PAE to determine 36 bit processor address space
target/i386: validate VEX.W for AVX instructions
target/i386: group common checks in the decoding phase
...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

show more ...


# 126e7f78 17-Oct-2023 Paolo Bonzini <pbonzini@redhat.com>

kvm: require KVM_CAP_IOEVENTFD and KVM_CAP_IOEVENTFD_ANY_LENGTH

KVM_CAP_IOEVENTFD_ANY_LENGTH was added in Linux 4.4, released in 2016.
Assume that it is present.

Signed-off-by: Paolo Bonzini <pbonz

kvm: require KVM_CAP_IOEVENTFD and KVM_CAP_IOEVENTFD_ANY_LENGTH

KVM_CAP_IOEVENTFD_ANY_LENGTH was added in Linux 4.4, released in 2016.
Assume that it is present.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# 1b4a5a20 23-Oct-2023 Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, cleanups

infrastructure for vhost-vdpa shadow work
piix south bridge rework
reconn

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, cleanups

infrastructure for vhost-vdpa shadow work
piix south bridge rework
reconnect for vhost-user-scsi
dummy ACPI QTG DSM for cxl

tests, cleanups, fixes all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmU06PMPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpNIsH/0DlKti86VZLJ6PbNqsnKxoK2gg05TbEhPZU
# pQ+RPDaCHpFBsLC5qsoMJwvaEQFe0e49ZFemw7bXRzBxgmbbNnZ9ArCIPqT+rvQd
# 7UBmyC+kacVyybZatq69aK2BHKFtiIRlT78d9Izgtjmp8V7oyKoz14Esh8wkE+FT
# ypHUa70Addi6alNm6BVkm7bxZxi0Wrmf3THqF8ViYvufzHKl7JR5e17fKWEG0BqV
# 9W7AeHMnzJ7jkTvBGUw7g5EbzFn7hPLTbO4G/VW97k0puS4WRX5aIMkVhUazsRIa
# zDOuXCCskUWuRapiCwY0E4g7cCaT8/JR6JjjBaTgkjJgvo5Y8Eg=
# =ILek
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 22 Oct 2023 02:18:43 PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (62 commits)
intel-iommu: Report interrupt remapping faults, fix return value
MAINTAINERS: Add include/hw/intc/i8259.h to the PC chip section
vhost-user: Fix protocol feature bit conflict
tests/acpi: Update DSDT.cxl with QTG DSM
hw/cxl: Add QTG _DSM support for ACPI0017 device
tests/acpi: Allow update of DSDT.cxl
hw/i386/cxl: ensure maxram is greater than ram size for calculating cxl range
vhost-user: fix lost reconnect
vhost-user-scsi: start vhost when guest kicks
vhost-user-scsi: support reconnect to backend
vhost: move and rename the conn retry times
vhost-user-common: send get_inflight_fd once
hw/i386/pc_piix: Make PIIX4 south bridge usable in PC machine
hw/isa/piix: Implement multi-process QEMU support also for PIIX4
hw/isa/piix: Resolve duplicate code regarding PCI interrupt wiring
hw/isa/piix: Reuse PIIX3's PCI interrupt triggering in PIIX4
hw/isa/piix: Rename functions to be shared for PCI interrupt triggering
hw/isa/piix: Reuse PIIX3 base class' realize method in PIIX4
hw/isa/piix: Share PIIX3's base class with PIIX4
hw/isa/piix: Harmonize names of reset control memory regions
...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

show more ...


# f02a4b8e 08-Oct-2023 Li Feng <fengli@smartx.com>

vhost-user: fix lost reconnect

When the vhost-user is reconnecting to the backend, and if the vhost-user fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will not b

vhost-user: fix lost reconnect

When the vhost-user is reconnecting to the backend, and if the vhost-user fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will not be retriggered forever.

The reason is:
When the vhost-user fails at get_features, the vhost_dev_cleanup will be called
immediately.

vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.

The reconnect path is:
vhost_user_blk_event
vhost_user_async_close(.. vhost_user_blk_disconnect ..)
qemu_chr_fe_set_handlers <----- clear the notifier callback
schedule vhost_user_async_close_bh

The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
called, then the event fd callback will not be reinstalled.

All vhost-user devices have this issue, including vhost-user-blk/scsi.

With this patch, if the vdev->vdev is null, the fd callback will still
be reinstalled.

Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")

Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20231009044735.941655-6-fengli@smartx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


# 22d2464f 03-Oct-2023 Stefan Hajnoczi <stefanha@redhat.com>

vhost-user: do not send RESET_OWNER on device reset

The VHOST_USER_RESET_OWNER message is deprecated in the spec:

This is no longer used. Used to be sent to request disabling all
rings, but s

vhost-user: do not send RESET_OWNER on device reset

The VHOST_USER_RESET_OWNER message is deprecated in the spec:

This is no longer used. Used to be sent to request disabling all
rings, but some back-ends interpreted it to also discard connection
state (this interpretation would lead to bugs). It is recommended
that back-ends either ignore this message, or use it to disable all
rings.

The only caller of vhost_user_reset_device() is vhost_user_scsi_reset().
It checks that F_RESET_DEVICE was negotiated before calling it:

static void vhost_user_scsi_reset(VirtIODevice *vdev)
{
VHostSCSICommon *vsc = VHOST_SCSI_COMMON(vdev);
struct vhost_dev *dev = &vsc->dev;

/*
* Historically, reset was not implemented so only reset devices
* that are expecting it.
*/
if (!virtio_has_feature(dev->protocol_features,
VHOST_USER_PROTOCOL_F_RESET_DEVICE)) {
return;
}

if (dev->vhost_ops->vhost_reset_device) {
dev->vhost_ops->vhost_reset_device(dev);
}
}

Therefore VHOST_USER_RESET_OWNER is actually never sent by
vhost_user_reset_device(). Remove the dead code. This effectively moves
the vhost-user protocol specific code from vhost-user-scsi.c into
vhost-user.c where it belongs.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20231004014532.1228637-2-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>

show more ...


# d7dc0682 02-Oct-2023 Laszlo Ersek <lersek@redhat.com>

vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously

(1) The virtio-1.2 specification
<http://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html> writes:

> 3 General Initialization A

vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously

(1) The virtio-1.2 specification
<http://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html> writes:

> 3 General Initialization And Device Operation
> 3.1 Device Initialization
> 3.1.1 Driver Requirements: Device Initialization
>
> [...]
>
> 7. Perform device-specific setup, including discovery of virtqueues for
> the device, optional per-bus setup, reading and possibly writing the
> device’s virtio configuration space, and population of virtqueues.
>
> 8. Set the DRIVER_OK status bit. At this point the device is “live”.

and

> 4 Virtio Transport Options
> 4.1 Virtio Over PCI Bus
> 4.1.4 Virtio Structure PCI Capabilities
> 4.1.4.3 Common configuration structure layout
> 4.1.4.3.2 Driver Requirements: Common configuration structure layout
>
> [...]
>
> The driver MUST configure the other virtqueue fields before enabling the
> virtqueue with queue_enable.
>
> [...]

(The same statements are present in virtio-1.0 identically, at
<http://docs.oasis-open.org/virtio/virtio/v1.0/virtio-v1.0.html>.)

These together mean that the following sub-sequence of steps is valid for
a virtio-1.0 guest driver:

(1.1) set "queue_enable" for the needed queues as the final part of device
initialization step (7),

(1.2) set DRIVER_OK in step (8),

(1.3) immediately start sending virtio requests to the device.

(2) When vhost-user is enabled, and the VHOST_USER_F_PROTOCOL_FEATURES
special virtio feature is negotiated, then virtio rings start in disabled
state, according to
<https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#ring-states>.
In this case, explicit VHOST_USER_SET_VRING_ENABLE messages are needed for
enabling vrings.

Therefore setting "queue_enable" from the guest (1.1) -- which is
technically "buffered" on the QEMU side until the guest sets DRIVER_OK
(1.2) -- is a *control plane* operation, which -- after (1.2) -- travels
from the guest through QEMU to the vhost-user backend, using a unix domain
socket.

Whereas sending a virtio request (1.3) is a *data plane* operation, which
evades QEMU -- it travels from guest to the vhost-user backend via
eventfd.

This means that operations ((1.1) + (1.2)) and (1.3) travel through
different channels, and their relative order can be reversed, as perceived
by the vhost-user backend.

That's exactly what happens when OVMF's virtiofs driver (VirtioFsDxe) runs
against the Rust-language virtiofsd version 1.7.2. (Which uses version
0.10.1 of the vhost-user-backend crate, and version 0.8.1 of the vhost
crate.)

Namely, when VirtioFsDxe binds a virtiofs device, it goes through the
device initialization steps (i.e., control plane operations), and
immediately sends a FUSE_INIT request too (i.e., performs a data plane
operation). In the Rust-language virtiofsd, this creates a race between
two components that run *concurrently*, i.e., in different threads or
processes:

- Control plane, handling vhost-user protocol messages:

The "VhostUserSlaveReqHandlerMut::set_vring_enable" method
[crates/vhost-user-backend/src/handler.rs] handles
VHOST_USER_SET_VRING_ENABLE messages, and updates each vring's "enabled"
flag according to the message processed.

- Data plane, handling virtio / FUSE requests:

The "VringEpollHandler::handle_event" method
[crates/vhost-user-backend/src/event_loop.rs] handles the incoming
virtio / FUSE request, consuming the virtio kick at the same time. If
the vring's "enabled" flag is set, the virtio / FUSE request is
processed genuinely. If the vring's "enabled" flag is clear, then the
virtio / FUSE request is discarded.

Note that OVMF enables the queue *first*, and sends FUSE_INIT *second*.
However, if the data plane processor in virtiofsd wins the race, then it
sees the FUSE_INIT *before* the control plane processor took notice of
VHOST_USER_SET_VRING_ENABLE and green-lit the queue for the data plane
processor. Therefore the latter drops FUSE_INIT on the floor, and goes
back to waiting for further virtio / FUSE requests with epoll_wait.
Meanwhile OVMF is stuck waiting for the FUSET_INIT response -- a deadlock.

The deadlock is not deterministic. OVMF hangs infrequently during first
boot. However, OVMF hangs almost certainly during reboots from the UEFI
shell.

The race can be "reliably masked" by inserting a very small delay -- a
single debug message -- at the top of "VringEpollHandler::handle_event",
i.e., just before the data plane processor checks the "enabled" field of
the vring. That delay suffices for the control plane processor to act upon
VHOST_USER_SET_VRING_ENABLE.

We can deterministically prevent the race in QEMU, by blocking OVMF inside
step (1.2) -- i.e., in the write to the device status register that
"unleashes" queue enablement -- until VHOST_USER_SET_VRING_ENABLE actually
*completes*. That way OVMF's VCPU cannot advance to the FUSE_INIT
submission before virtiofsd's control plane processor takes notice of the
queue being enabled.

Wait for VHOST_USER_SET_VRING_ENABLE completion by:

- setting the NEED_REPLY flag on VHOST_USER_SET_VRING_ENABLE, and waiting
for the reply, if the VHOST_USER_PROTOCOL_F_REPLY_ACK vhost-user feature
has been negotiated, or

- performing a separate VHOST_USER_GET_FEATURES *exchange*, which requires
a backend response regardless of VHOST_USER_PROTOCOL_F_REPLY_ACK.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Albert Esteve <aesteve@redhat.com>
[lersek@redhat.com: work Eugenio's explanation into the commit message,
about QEMU containing step (1.1) until step (1.2)]
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20231002203221.17241-8-lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


# 75b6b6da 02-Oct-2023 Laszlo Ersek <lersek@redhat.com>

vhost-user: allow "vhost_set_vring" to wait for a reply

The "vhost_set_vring" function already centralizes the common parts of
"vhost_user_set_vring_num", "vhost_user_set_vring_base" and
"vhost_user

vhost-user: allow "vhost_set_vring" to wait for a reply

The "vhost_set_vring" function already centralizes the common parts of
"vhost_user_set_vring_num", "vhost_user_set_vring_base" and
"vhost_user_set_vring_enable". We'll want to allow some of those callers
to wait for a reply.

Therefore, rebase "vhost_set_vring" from just "vhost_user_write" to
"vhost_user_write_sync", exposing the "wait_for_reply" parameter.

This is purely refactoring -- there is no observable change. That's
because:

- all three callers pass in "false" for "wait_for_reply", which disables
all logic in "vhost_user_write_sync" except the call to
"vhost_user_write";

- the fds=NULL and fd_num=0 arguments of the original "vhost_user_write"
call inside "vhost_set_vring" are hard-coded within
"vhost_user_write_sync".

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Albert Esteve <aesteve@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20231002203221.17241-7-lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


# df3b2abc 02-Oct-2023 Laszlo Ersek <lersek@redhat.com>

vhost-user: hoist "write_sync", "get_features", "get_u64"

In order to avoid a forward-declaration for "vhost_user_write_sync" in a
subsequent patch, hoist "vhost_user_write_sync" ->
"vhost_user_get_

vhost-user: hoist "write_sync", "get_features", "get_u64"

In order to avoid a forward-declaration for "vhost_user_write_sync" in a
subsequent patch, hoist "vhost_user_write_sync" ->
"vhost_user_get_features" -> "vhost_user_get_u64" just above
"vhost_set_vring".

This is purely code movement -- no observable change.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Albert Esteve <aesteve@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20231002203221.17241-6-lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


# 99ad9ec8 02-Oct-2023 Laszlo Ersek <lersek@redhat.com>

vhost-user: flatten "enforce_reply" into "vhost_user_write_sync"

At this point, only "vhost_user_write_sync" calls "enforce_reply"; embed
the latter into the former.

This is purely refactoring -- n

vhost-user: flatten "enforce_reply" into "vhost_user_write_sync"

At this point, only "vhost_user_write_sync" calls "enforce_reply"; embed
the latter into the former.

This is purely refactoring -- no observable change.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Albert Esteve <aesteve@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20231002203221.17241-5-lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

show more ...


1234567891011