9cd1fd4b | 28-Aug-2024 | Fabiano Rosas <farosas@suse.de>
migration/multifd: Fix p->iov leak in multifd-uadk.c
The send_cleanup() hook should free the p->iov that was allocated at send_setup(). This was missed because the UADK code is conditional on the presence of the accelerator, so it's not tested by default.
Fixes: 819dd20636 ("migration/multifd: Add UADK initialization")
Reported-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
(cherry picked from commit 405e352d28c24991cacfdebccf67d56c4795cf6e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
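As an illustration, a minimal sketch of the shape of this fix, assuming the multifd send-hook signature; the compress_data type and the session-teardown helper named here are assumptions, not the verbatim patch:

    static void multifd_uadk_send_cleanup(MultiFDSendParams *p, Error **errp)
    {
        struct wd_data *wd = p->compress_data;   /* type name assumed */

        multifd_uadk_uninit_sess(wd);            /* assumed teardown helper */
        p->compress_data = NULL;
        /* The fix: free the iov that send_setup() allocated. */
        g_free(p->iov);
        p->iov = NULL;
    }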
9b42e33b | 17-Sep-2024 | Fabiano Rosas <farosas@suse.de>
migration/multifd: Fix rb->receivedmap cleanup race
Fix a segmentation fault in multifd when rb->receivedmap is cleared too early.
After commit 5ef7e26bdb ("migration/multifd: solve zero page causing multiple page faults"), multifd started using the rb->receivedmap bitmap, which belongs to ram.c and is initialized and *freed* from the ram SaveVMHandlers.
Multifd threads are live until migration_incoming_state_destroy(), which is called after qemu_loadvm_state_cleanup(), leading to a crash when accessing rb->receivedmap.
    process_incoming_migration_co()        ...
     qemu_loadvm_state()                   multifd_nocomp_recv()
      qemu_loadvm_state_cleanup()           ramblock_recv_bitmap_set_offset()
       rb->receivedmap = NULL                 set_bit_atomic(..., rb->receivedmap)
     ...
     migration_incoming_state_destroy()
      multifd_recv_cleanup()
       multifd_recv_terminate_threads(NULL)
Move the loadvm cleanup into migration_incoming_state_destroy(), after multifd_recv_cleanup() to ensure multifd threads have already exited when rb->receivedmap is cleared.
Adjust the postcopy listen thread comment to indicate that we still want to skip the cpu synchronization.
CC: qemu-stable@nongnu.org
Fixes: 5ef7e26bdb ("migration/multifd: solve zero page causing multiple page faults")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240917185802.15619-3-farosas@suse.de
[peterx: added comment in migration_incoming_state_destroy()]
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 4ce56229087860805877075ddb29dd44578365a9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
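A minimal sketch of the reordering described above, using only the function names from this message; the rest of the incoming-state teardown is elided:

    void migration_incoming_state_destroy(void)
    {
        /* Make sure all multifd recv threads have exited first ... */
        multifd_recv_cleanup();
        /*
         * ... and only then run the loadvm cleanup, which clears
         * rb->receivedmap via the ram SaveVMHandlers.
         */
        qemu_loadvm_state_cleanup();
        /* remainder of the teardown elided */
    }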
0bd5b928 | 01-Aug-2024 | Fabiano Rosas <farosas@suse.de>
migration/multifd: Fix multifd_send_setup cleanup when channel creation fails
When a channel fails to create, the code currently just returns. This is wrong for two reasons:
1) Channel n+1 will not get to initialize its semaphores, leading to an assert when terminate_threads tries to post to it:
qemu-system-x86_64: ../util/qemu-thread-posix.c:92: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed.
2) (theoretical) If channel n-1 has already started creation, it will defeat the purpose of the channels_created logic, which is in place to prevent migrate_fd_cleanup() from running while channels are still being created.
This cannot really happen today because the current failure cases for multifd_new_send_channel_create() are all synchronous, resulting from qio_channel_file_new_path() getting a bad filename. This would hit all channels equally.
But I don't want to set a trap for future people, so have all channels attempt creation (even if failing), and only fail after the channels_created semaphore has been posted.
While here, remove the error_report_err call. There's already one in migrate_fd_cleanup later on.
Cc: qemu-stable@nongnu.org
Reported-by: Jim Fehlig <jfehlig@suse.com>
Fixes: b7b03eb614 ("migration/multifd: Add outgoing QIOChannelFile support")
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
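A sketch of the strategy, not the verbatim patch; the names follow this message and QEMU's multifd code, but the loop structure is illustrative:

    /* Attempt creation for every channel so each one posts
     * channels_created, even after an earlier failure. */
    for (i = 0; i < thread_count; i++) {
        MultiFDSendParams *p = &multifd_send_state->params[i];

        if (!multifd_new_send_channel_create(p, &local_err)) {
            ret = false;   /* remember the failure, but keep going */
        }
    }

    /* Wait for all posts before acting on any error, so
     * migrate_fd_cleanup() cannot run mid-creation. */
    for (i = 0; i < thread_count; i++) {
        qemu_sem_wait(&multifd_send_state->channels_created);
    }

    if (!ret) {
        /* No error_report_err() here; migrate_fd_cleanup() reports it. */
        migrate_set_error(migrate_get_current(), local_err);
    }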
84ac6fa1 | 01-Aug-2024 | Fabiano Rosas <farosas@suse.de>
migration: Fix cleanup of iochannel in file migration
The QIOChannelFile object already has its reference decremented by g_autoptr. Trying to unref it an extra time causes:
ERROR:../qom/object.c:1241:object_unref: assertion failed: (obj->ref > 0)
Fixes: a701c03dec ("migration: Drop reference to QIOChannel if file seeking fails")
Fixes: 6d3279655a ("migration: Fix file migration with fdset")
Reported-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
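An illustrative sketch of the bug class being fixed, assuming a QIOChannelFile opened on a path where seeking can fail; filename and offset are placeholders:

    g_autoptr(QIOChannelFile) fioc =
        qio_channel_file_new_path(filename, O_RDONLY, 0, errp);

    if (!fioc) {
        return NULL;
    }
    if (lseek(fioc->fd, offset, SEEK_SET) < 0) {
        /* BUG (fixed above): an explicit object_unref(OBJECT(fioc)) here
         * would drop the reference twice, because g_autoptr already
         * unrefs when fioc goes out of scope. */
        return NULL;
    }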
4146b77e | 19-Jun-2024 | Peter Xu <peterx@redhat.com>
migration/postcopy: Add postcopy-recover-setup phase
This patch adds a migration state on src called "postcopy-recover-setup". The new state will describe the intermediate step starting from when the src QEMU received a postcopy recovery request, until the migration channels are properly established, but before the recovery process takes place.
The request came from Libvirt, which currently relies on the migration state events to detect migration state changes. That works for most of the migration process, except for postcopy recovery failures at the beginning.
Currently postcopy recovery only has two major states:
- postcopy-paused: this is the state that both sides of QEMU will be in for as long as the migration channel is interrupted.
- postcopy-recover: this is the state where both sides of QEMU handshake with each other, preparing to continue the postcopy migration that was interrupted.
The issue here is that when the recovery port is invalid, the src QEMU will take the URI/channels, notice the ports are not valid, and silently remain in the postcopy-paused state, with no event sent to Libvirt. In this case, the only thing Libvirt can do is poll the migration status at a proper interval, which is less than optimal.
Considering that this is the only case where Libvirt won't get a notification from QEMU on such events, let's add a postcopy-recover-setup state to mimic what we have with the "setup" state of a newly initialized migration, describing the phase of connection establishment.
With that, postcopy recovery now has two paths to go, and either path guarantees that an event is generated. Now the events will look like this during a recovery process on the src QEMU:
- Initially, when the recovery is initiated on src, QEMU will go from "postcopy-paused" -> "postcopy-recover-setup". Old QEMUs don't have this event.
- Depending on whether the channel re-establishment succeeds:
- On success, src QEMU will move from "postcopy-recover-setup" to "postcopy-recover". Old QEMUs also have this event.
- On failure, src QEMU will move from "postcopy-recover-setup" back to "postcopy-paused". Old QEMUs don't have this event.
This guarantees that Libvirt will always receive a proper notification for the recovery process.
One thing to mention: the new state is only needed on the src QEMU, not both. On the dest QEMU, the state machine doesn't change, hence the events don't change either. This is because the dest QEMU may not have an explicit point where setup starts. E.g., the dest QEMU may not use the migrate-recover command to set a new URI/channel at all; the old URI/channels can be reused in recovery, in which case the old ports simply work again once the network routes are fixed up.
Add a new helper, postcopy_is_paused(), detecting whether postcopy is still paused, taking RECOVER_SETUP into account too. When using it on both src/dst, a slight change is made to always wait for the semaphore before checking the status, because on both sides a sem_post() will be required for a recovery.
Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Prasad Pandit <ppandit@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Buglink: https://issues.redhat.com/browse/RHEL-38485
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
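A minimal sketch of the new helper, assuming the MigrationStatus enum values that correspond to the states named above:

    static inline bool postcopy_is_paused(MigrationStatus status)
    {
        /* RECOVER_SETUP counts as "still paused": channels are being
         * re-established but recovery has not kicked off yet. */
        return status == MIGRATION_STATUS_POSTCOPY_PAUSED ||
               status == MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP;
    }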
4dd5f7b8 | 19-Jun-2024 | Peter Xu <peterx@redhat.com>
migration: Cleanup incoming migration setup state change
Destination QEMU can set up incoming ports for two purposes: either a fresh new incoming migration, in which QEMU will switch to SETUP for channel establishment, or a paused postcopy migration, in which QEMU will stay in POSTCOPY_PAUSED until kicking off the RECOVER phase.
Currently the state machine on the dest node works for the latter only because migrate_set_state() implicitly becomes a no-op when the current-state check fails. That wasn't clear at all.
Clean it up by providing a helper, migration_incoming_state_setup(), that does proper checks on the current status. Postcopy-paused is now checked explicitly, and we bail out on unknown states.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
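A sketch of the helper's likely shape under the states described above; the exact error handling may differ from the committed code:

    static bool migration_incoming_state_setup(MigrationIncomingState *mis,
                                               Error **errp)
    {
        switch (mis->state) {
        case MIGRATION_STATUS_POSTCOPY_PAUSED:
            /* Paused postcopy: stay put until the RECOVER kick-off. */
            return true;
        case MIGRATION_STATUS_NONE:
            /* Fresh incoming migration: switch to SETUP explicitly. */
            migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
                              MIGRATION_STATUS_SETUP);
            return true;
        default:
            /* Bail out on unknown states instead of a silent no-op. */
            error_setg(errp, "Illegal migration incoming state: %s",
                       MigrationStatus_str(mis->state));
            return false;
        }
    }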
a5c24e13 | 19-Jun-2024 | Peter Xu <peterx@redhat.com>
migration: Use MigrationStatus instead of int
QEMU uses "int" in most cases even if it stores MigrationStatus. I don't know why, so let's try to do that right and see what blows up..
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
60ce4767 | 19-Jun-2024 | Peter Xu <peterx@redhat.com>
migration: Rename thread debug names
The postcopy thread names on dest QEMU are slightly confusing, and I partly need to blame myself for 36f62f11e4 ("migration: Postcopy preemption preparation on channel creation"). E.g., "fault-fast" reads like a fast version of "fault-default", but it's actually the fast version of "postcopy/listen".
Taking this chance, rename all the migration threads with proper rules. Considering we only have 15 chars usable, prefix all threads with "mig/", meanwhile identify src/dst threads properly this time. So now most thread names will look like "mig/DIR/xxx", where DIR will be "src"/"dst", except the bg-snapshot thread which doesn't have a direction.
For multifd threads, make them "mig/{src|dst}/{send|recv}_%d".
We used to have the "live_migration" thread for a very long time; now it's called "mig/src/main". We may hope to have "mig/dst/main" soon, but not yet.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Zhijian Li (Fujitsu) <lizhijian@fujitsu.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
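As an illustration of the convention (not the committed code), a thread-creation call that stays under the 15-character pthread name limit; channel_id and the surrounding variables are placeholders:

    char name[16];   /* pthread names are limited to 15 chars + NUL */

    snprintf(name, sizeof(name), "mig/src/send_%d", channel_id);
    qemu_thread_create(&p->thread, name, multifd_send_thread, p,
                       QEMU_THREAD_JOINABLE);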
637280ae | 19-Jun-2024 | Peter Xu <peterx@redhat.com>
migration/multifd: Avoid the final FLUSH in complete()
We always do the flush when finishing one round of scanning, and during the complete() phase we scan one more round to make sure no dirty page remains. In that case we shouldn't need an explicit FLUSH at the end of complete(), since by the time we reach it all pages have already been flushed.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Tested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
9d70239e | 17-Jun-2024 | Fabiano Rosas <farosas@suse.de>
migration/multifd: Add direct-io support
When multifd is used along with mapped-ram, we can take advantage of a filesystem that supports the O_DIRECT flag and perform direct I/O in the multifd threads. This brings a significant performance improvement because direct-io writes bypass the page cache, which would otherwise be thrashed by the multifd data that is unlikely to be needed again in a short period of time.
To be able to use a multifd channel opened with O_DIRECT, we must ensure that a certain alignment is used. Filesystems usually require a block-size alignment for direct I/O. The way to achieve this is by enabling the mapped-ram feature, which already aligns its I/O properly (see MAPPED_RAM_FILE_OFFSET_ALIGNMENT in ram.c).
By setting O_DIRECT on the multifd channels, all writes to the same file descriptor need to be aligned as well, even the ones that come from outside multifd, such as the QEMUFile I/O from the main migration code. This makes it impossible to use the same file descriptor for the QEMUFile and for the multifd channels. The various flags and metadata written by the main migration code will always be unaligned by virtue of their small size. To work around this issue, we'll require a second file descriptor to be used exclusively for direct I/O.
The second file descriptor can be obtained by QEMU by re-opening the migration file (already possible), or by being provided by the user or management application (support to be added in future patches).
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
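A sketch of the idea under stated assumptions: migrate_direct_io() stands in for a capability check, and the resulting fd is the second descriptor handed to the multifd threads while the main QEMUFile keeps a buffered fd for its small, unaligned metadata writes:

    int flags = O_WRONLY;

    if (migrate_direct_io()) {   /* assumed capability check */
        flags |= O_DIRECT;       /* multifd-only: writes must be aligned */
    }
    int fd = qemu_create(filename, flags, 0660, errp);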
6d327965 | 17-Jun-2024 | Fabiano Rosas <farosas@suse.de>
migration: Fix file migration with fdset
When the "file:" migration support was added we missed the special case in the qemu_open_old implementation that allows for a particular file name format to be used to refer to a set of file descriptors that have been previously provided to QEMU via the add-fd QMP command.
When using this fdset feature, we should not truncate the migration file because being given an fd means that the management layer is in control of the file and will likely already have some data written to it. This is further indicated by the presence of the 'offset' argument, which indicates the start of the region where QEMU is allowed to write.
Fix the issue by replacing the O_TRUNC flag at open time with an ftruncate call, which will take the offset into consideration.
Fixes: 385f510df5 ("migration: file URI offset")
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
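A sketch of the fix's shape; flag handling is simplified, and offset is the start of QEMU's write region described above:

    /* Open without O_TRUNC so fdset-provided fds keep the management
     * layer's data; then truncate only from QEMU's region onward. */
    int fd = qemu_open_old(filename, O_CREAT | O_WRONLY, 0600);

    if (fd < 0) {
        return -errno;
    }
    if (ftruncate(fd, offset) < 0) {
        close(fd);
        return -errno;
    }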
c1dfd121 | 07-Jun-2024 | Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
migration/multifd: Switch to no compression when no hardware support
Send raw packets instead if UADK hardware support is not available. This is to satisfy the QEMU qtest CI, which may run on platforms that don't have UADK hardware support. A subsequent patch will add support for a UADK migration qtest.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3c49191a | 07-Jun-2024 | Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
migration/multifd: Add UADK based compression and decompression
Use the UADK wd_do_comp_sync() API to (de)compress a normal page using the hardware accelerator.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
819dd206 | 07-Jun-2024 | Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
migration/multifd: Add UADK initialization
Initialize the UADK session and allocate the required buffers. The actual compression/decompression will only be added in a subsequent patch.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
f6fe9fea | 10-Jun-2024 | Yuan Liu <yuan1.liu@intel.com>
migration/multifd: implement qpl compression and decompression
QPL compression and decompression will use IAA hardware path if the IAA hardware is available. Otherwise the QPL library software path is used.
The hardware path will automatically fall back to the QPL software path if the IAA queues are busy. In some scenarios this may happen frequently, such as when 4 channels are configured but only one IAA device is available. In the case of insufficient IAA hardware resources, retry and fallback can help optimize performance:
1. Retry + SW fallback:
   total time: 14649 ms
   downtime: 25 ms
   throughput: 17666.57 mbps
   pages-per-second: 1509647
2. No fallback, always wait for work queues to become available:
   total time: 18381 ms
   downtime: 25 ms
   throughput: 13698.65 mbps
   pages-per-second: 859607
If both the hardware and software paths fail, the uncompressed page is sent directly.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
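A pseudocode-level sketch of the policy: qpl_submit_job()/qpl_execute_job() and the busy-queue status are real QPL names, but the control flow and the uncompressed-send helper are illustrative only:

    qpl_status status;
    int retries = 0;

    /* Hardware path: retry while the IAA work queues are busy. */
    do {
        status = qpl_submit_job(hw_job);
    } while (status == QPL_STS_QUEUES_ARE_BUSY_ERR &&
             ++retries < MAX_SUBMIT_RETRIES);

    if (status != QPL_STS_OK) {
        /* Software path fallback. */
        status = qpl_execute_job(sw_job);
    }
    if (status != QPL_STS_OK) {
        /* Last resort: send the page uncompressed (assumed helper). */
        send_page_uncompressed(p);
    }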
34e104b8 | 10-Jun-2024 | Yuan Liu <yuan1.liu@intel.com>
migration/multifd: implement initialization of qpl compression
During initialization, a software job is allocated to each channel for software-path fallback when the IAA hardware is unavailable or the hardware job submission fails. If the IAA hardware is available, multiple hardware jobs are allocated for batch processing.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
354cac28 | 10-Jun-2024 | Yuan Liu <yuan1.liu@intel.com>
migration/multifd: add qpl compression method
Add the Query Processing Library (QPL) compression method.
Introduce qpl as a new multifd migration compression method. It can use the In-Memory Analytics Accelerator (IAA) to accelerate compression and decompression, which not only reduces the network bandwidth requirement but also reduces host compression and decompression CPU overhead.
How to enable qpl compression during migration:

    migrate_set_parameter multifd-compression qpl

No qpl compression level parameter is added, since qpl only supports level one; users do not need to specify a compression level.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
[fixed docs spacing in migration.json]
Signed-off-by: Fabiano Rosas <farosas@suse.de>