History log of /openbmc/qemu/migration/migration.c (Results 1 – 25 of 983)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v9.2.0, v9.1.2
# 43f2def6 15-Nov-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'migration-20241113-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request for 9.2-rc1

- Dmitry's small patch to quiesce a warning in possible uint32_t overfl

Merge tag 'migration-20241113-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request for 9.2-rc1

- Dmitry's small patch to quiesce a warning in possible uint32_t overflow
- Peter's fix on a recent regression to iotests (and potentially elsewhere)

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZzUJDxIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbuEQEAoRiMlHo4cfRTuViRoGHwFkqboLdyqsC9
# ZtDcZ2efgMQBAMR65KSgEAakbm3h4fuj0lUTkg564Wi9XkBleo/w3tAD
# =uIeB
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 13 Nov 2024 20:16:15 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706

* tag 'migration-20241113-pull-request' of https://gitlab.com/peterx/qemu:
migration: fix-possible-int-overflow
migration: Check current_migration in migration_is_running()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# 4daff81e 05-Nov-2024 Peter Xu <peterx@redhat.com>

migration: Check current_migration in migration_is_running()

Report shows that commit 34a8892dec broke iotest 055:

https://lore.kernel.org/r/b8806360-a2b6-4608-83a3-db67e264c733@linaro.org

Denis R

migration: Check current_migration in migration_is_running()

Report shows that commit 34a8892dec broke iotest 055:

https://lore.kernel.org/r/b8806360-a2b6-4608-83a3-db67e264c733@linaro.org

Denis Rastyogin reported more such issue:

https://lore.kernel.org/r/20241107114256.106831-1-gerben@altlinux.org

In this merge, the migration_is_idle() function was replaced with
migrate_is_running(). However, the null pointer check for `s` was
removed, leading to a dereference of `s` when using qemu-system-x86_64
-hda *.vdi.

When replacing migration_is_idle() with "!migration_is_running()", it was
overlooked that the idle helper also checks for current_migration being
available first. Sample stack dump:

migration_is_running
is_busy
migrate_add_blocker_modes
migrate_add_blocker_normal
vmdk_open
bdrv_open_driver
bdrv_open_common
bdrv_open_inherit
bdrv_open
blk_new_open
blockdev_init
drive_new
drive_init_func
qemu_opts_foreach
configure_blockdev
qemu_create_early_backends
qemu_init
main

The check would be there if the whole series was applied, but since the
last patches in the previous series rely on some other patches to land
first, we need to recover the behavior of migration_is_idle() first before
that whole set will be merged.

I left migration_is_active / migration_is_device alone, as I don't think
it's possible for them to hit uninitialized current_migration. Also they're
prone to removal soon from VFIO side.

Cc: Peter Maydell <peter.maydell@linaro.org>
Fixes: 34a8892dec ("migration: Drop migration_is_idle()")
Reported-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reported-by: Denis Rastyogin <gerben@altlinux.org>
Tested-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241105182725.2393425-1-peterx@redhat.com
[peterx: enhance commit msg]
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# cbad4551 04-Nov-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'migration-20241030-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request for softfreeze

v2:
- Patch "migration: Move cpu-throttle.c from system to migration

Merge tag 'migration-20241030-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request for softfreeze

v2:
- Patch "migration: Move cpu-throttle.c from system to migration",
fix build on MacOS, and subject spelling

NOTE: checkpatch.pl could report a false positive on this branch:

WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#21:
{include/sysemu => migration}/cpu-throttle.h | 0

That's covered by "F: migration/" entry.

Changelog:

- Peter's cleanup patch on migrate_fd_cleanup()
- Peter's cleanup patch to introduce thread name macros
- Hanna's error path fix for vmstate subsection save()s
- Hyman's auto converge enhancement on background dirty sync
- Peter's additional tracepoints for save state entries
- Thomas's build fix for OpenBSD in dirtyrate.c
- Peter's deprecation of query-migrationthreads command
- Peter's cleanup/fixes from the "export misc.h" series
- Maciej's two small patches from multifd+vfio series

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZyTbVRIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wan3wD+L4TVNDc34Hy4mvWu7u1lCOePX0GBdUEc
# oEeBGblwbrcBAIR8d+5z9O5YcWH1coozG1aUC4qCtSHHk5TGbJk4/UUD
# =XB5Q
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 01 Nov 2024 13:44:53 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706

* tag 'migration-20241030-pull-request' of https://gitlab.com/peterx/qemu:
migration/multifd: Zero p->flags before starting filling a packet
migration/ram: Add load start trace event
migration: Drop migration_is_idle()
migration: Drop migration_is_setup_or_active()
migration: Unexport ram_mig_init()
migration: Unexport dirty_bitmap_mig_init()
migration: Take migration object refcount earlier for threads
migration: Deprecate query-migrationthreads command
migration/dirtyrate: Silence warning about strcpy() on OpenBSD
tests/migration: Add case for periodic ramblock dirty sync
migration: Support periodic RAMBlock dirty bitmap sync
migration: Remove "rs" parameter in migration_bitmap_sync_precopy
migration: Move cpu-throttle.c from system to migration
migration: Stop CPU throttling conditionally
accel/tcg/icount-common: Remove the reference to the unused header file
migration: Ensure vmstate_save() sets errp
migration: Put thread names together with macros
migration: Cleanup migrate_fd_cleanup() on accessing to_dst_file

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# 34a8892d 24-Oct-2024 Peter Xu <peterx@redhat.com>

migration: Drop migration_is_idle()

Now with the current migration_is_running(), it will report exactly the
opposite of what will be reported by migration_is_idle().

Drop migration_is_idle(), inste

migration: Drop migration_is_idle()

Now with the current migration_is_running(), it will report exactly the
opposite of what will be reported by migration_is_idle().

Drop migration_is_idle(), instead use "!migration_is_running()" which
should be identical on functionality.

In reality, most of the idle check is inverted, so it's even easier to
write with "migrate_is_running()" check.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241024213056.1395400-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# f018eb62 24-Oct-2024 Peter Xu <peterx@redhat.com>

migration: Drop migration_is_setup_or_active()

This helper is mostly the same as migration_is_running(), except that one
has COLO reported as true, the other has CANCELLING reported as true.

Per my

migration: Drop migration_is_setup_or_active()

This helper is mostly the same as migration_is_running(), except that one
has COLO reported as true, the other has CANCELLING reported as true.

Per my past years experience on the state changes, none of them should
matter.

To make it slightly safer, report both COLO || CANCELLING to be true in
migration_is_running(), then drop the other one. We kept the 1st only
because the name is simpler, and clear enough.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241024213056.1395400-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# 7fc8beb1 24-Oct-2024 Peter Xu <peterx@redhat.com>

migration: Take migration object refcount earlier for threads

Both migration thread or background snapshot thread will take a refcount of
the migration object at the entrace of the thread function.

migration: Take migration object refcount earlier for threads

Both migration thread or background snapshot thread will take a refcount of
the migration object at the entrace of the thread function.

That makes sense, because it protects the object from being freed by the
main thread in migration_shutdown() later, but it might still race with it
if the thread is scheduled too late. Consider the case right after
pthread_create() happened, VM shuts down with the object released, but
right after that the migration thread finally got created, referencing
MigrationState* in the opaque pointer which is already freed.

The only 100% safe way to make sure it won't get freed is taking the
refcount right before the thread is created, meanwhile when BQL is held.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241024213056.1395400-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


Revision tags: v9.1.1
# 52ac968a 17-Oct-2024 Hyman Huang <yong.huang@smartx.com>

migration: Support periodic RAMBlock dirty bitmap sync

When VM is configured with huge memory, the current throttle logic
doesn't look like to scale, because migration_trigger_throttle()
is only cal

migration: Support periodic RAMBlock dirty bitmap sync

When VM is configured with huge memory, the current throttle logic
doesn't look like to scale, because migration_trigger_throttle()
is only called for each iteration, so it won't be invoked for a long
time if one iteration can take a long time.

The periodic dirty sync aims to fix the above issue by synchronizing
the ramblock from remote dirty bitmap and, when necessary, triggering
the CPU throttle multiple times during a long iteration.

This is a trade-off between synchronization overhead and CPU throttle
impact.

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/f61f1b3653f2acf026901103e1c73d157d38b08f.1729146786.git.yong.huang@smartx.com
[peterx: make prev_cnt global, and reset for each migration]
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# d481cec7 17-Oct-2024 Hyman Huang <yong.huang@smartx.com>

migration: Move cpu-throttle.c from system to migration

Move cpu-throttle.c from system to migration since it's
only used for migration; this makes us avoid exporting the
util functions and variable

migration: Move cpu-throttle.c from system to migration

Move cpu-throttle.c from system to migration since it's
only used for migration; this makes us avoid exporting the
util functions and variables in misc.h but export them in
migration.h when implementing the periodic ramblock dirty
sync feature in the upcoming commits.

Since CPU throttle timers are only used in migration, move
their registry to migration_object_init.

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/c1b3efaa0cb49e03d422e9da97bdb65cc3d234d1.1729146786.git.yong.huang@smartx.com
[peterx: Fix build on MacOS on cocoa.m, not move cpu-throttle.h yet]
[peterx: Fix subject spelling, per pm215]
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# 054e5d66 17-Oct-2024 Hyman Huang <yong.huang@smartx.com>

migration: Stop CPU throttling conditionally

Since CPU throttling only occurs when auto-converge
is on, stop it conditionally.

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabian

migration: Stop CPU throttling conditionally

Since CPU throttling only occurs when auto-converge
is on, stop it conditionally.

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/f0c787080bb9ab0c37952f0ca5bfaa525d5ddd14.1729146786.git.yong.huang@smartx.com
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# e620b1e4 11-Oct-2024 Peter Xu <peterx@redhat.com>

migration: Put thread names together with macros

Keep migration thread names together, so it's easier to see a list of all
possible migration threads.

Still two functional changes below besides the

migration: Put thread names together with macros

Keep migration thread names together, so it's easier to see a list of all
possible migration threads.

Still two functional changes below besides the macro defintions:

- There's one dirty rate thread that we overlooked before, now we add
that too and name it as "mig/dirtyrate" following the old rules.

- The old name "mig/src/rp-thr" has "-thr" but it may not be useful if
it's a thread name anyway, while "rp" can be slightly hard to read.
Taking this chance to rename it to "mig/src/return", hopefully a better
name.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Link: https://lore.kernel.org/r/20241011153652.517440-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# 6dd4f44c 19-Sep-2024 Peter Xu <peterx@redhat.com>

migration: Cleanup migrate_fd_cleanup() on accessing to_dst_file

The cleanup function can in many cases needs cleanup on its own.

The major thing we want to do here is not referencing to_dst_file w

migration: Cleanup migrate_fd_cleanup() on accessing to_dst_file

The cleanup function can in many cases needs cleanup on its own.

The major thing we want to do here is not referencing to_dst_file when
without the file mutex. When at it, touch things elsewhere too to make it
look slightly better in general.

One thing to mention is, migration_thread has its own "running" boolean, so
it doesn't need to rely on to_dst_file being non-NULL. Multifd has a
dependency so it needs to be skipped if to_dst_file is not yet set; add a
richer comment for such reason.

Resolves: Coverity CID 1527402
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240919163042.116767-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# 423be09a 03-Oct-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'warn-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging

-Werror=maybe-uninitialized fixes

# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWW

Merge tag 'warn-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging

-Werror=maybe-uninitialized fixes

# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmb9PWwcHG1hcmNhbmRy
# ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5ebjD/43hDmLUGtq3WvEzG7T
# A9LjvKJ4bZrJkjwuogMUAAq89L65OcHdnXARgueSwt6Mi0qoakj7F2Ymv2oSw6Gq
# 33uspja5PySCGkAs3qQ9lb/zsPFNmXZkhR/XaDGqAFjI24w/QTMq+wwiEuqjVC6P
# 2C4VEXxz2Qn+WTQQjzpQ7E7CAUE/grHqflm+5JFICHywjj+oyoa5EnqEXHNlMb2J
# b8YVJ3z4SPNkq3VkQMHT0+aVO3+uS0NGxfXxGkVsSTdG1NlvwUr7bSomwZfXiUvP
# C0meotfsZTHZCJRtskCvn3kAd3E5EmIjMyDsbhqB0doCLCElV4AlzWSscy1y/GO+
# xm49Kutc+GRx/pztVMGzmC7aJ3Gwa8gKIrY1C/OvO8G2dZrJmTs2ydD4J9mIGxvC
# 1p1XeHZi8UOVshBDyAKRovKGzGiRtoC05SvjPOgB58RYGbUfFYHUOah3qYfRRZSw
# nPOXiwcrqiIlzkPRXB1ACVLfoZAHWzEFhrGxIKVdHABfwg8Rt9SmJq3JX8ysbKUJ
# FUI0/ZExrzGTURWWCm48k2rXZGKG/YSgvdmsSB5QmPEdrrb2jKqp5dmAbg3o/04b
# z4A7AatVNfK3tG69/hD1PwAy50q/sbbRUL9ZbBnc4Fnx1xyAOL4LgZ2tMov/jQWE
# 1SXLu8GKi4Yt76hUXFn1anqR0A==
# =zBkM
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 02 Oct 2024 13:32:44 BST
# gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5

* tag 'warn-pull-request' of https://gitlab.com/marcandre.lureau/qemu: (22 commits)
qom/object: fix -Werror=maybe-uninitialized
fsdep/9p: fix -Werror=maybe-uninitialized false-positive
block: fix -Werror=maybe-uninitialized false-positive
hw/virtio: freeing leaked memory from vhost_svq_get_buf in vhost_svq_poll
hw/virtio: fix -Werror=maybe-uninitialized
tests: fix -Werror=maybe-uninitialized false-positive
target/loongarch: fix -Werror=maybe-uninitialized false-positive
linux-user/hppa: fix -Werror=maybe-uninitialized false-positive
migration: fix -Werror=maybe-uninitialized false-positive
hw/virtio-blk: fix -Werror=maybe-uninitialized false-positive
migration: fix -Werror=maybe-uninitialized false-positives
block/block-copy: fix -Werror=maybe-uninitialized false-positive
hw/sdhci: fix -Werror=maybe-uninitialized false-positive
hw/vhost-scsi: fix -Werror=maybe-uninitialized
hw/ahci: fix -Werror=maybe-uninitialized false-positive
block/stream: fix -Werror=maybe-uninitialized false-positives
block/mirror: fix -Werror=maybe-uninitialized false-positive
block/mirror: fix -Werror=maybe-uninitialized false-positive
nbd: fix -Werror=maybe-uninitialized false-positive
hw/qxl: fix -Werror=maybe-uninitialized false-positives
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


Revision tags: v9.1.0
# 7cea8637 28-Mar-2024 Marc-André Lureau <marcandre.lureau@redhat.com>

migration: fix -Werror=maybe-uninitialized false-positives

../migration/dirtyrate.c:186:5: error: ‘records’ may be used uninitialized [-Werror=maybe-uninitialized]
../migration/dirtyrate.c:168:12: e

migration: fix -Werror=maybe-uninitialized false-positives

../migration/dirtyrate.c:186:5: error: ‘records’ may be used uninitialized [-Werror=maybe-uninitialized]
../migration/dirtyrate.c:168:12: error: ‘gen_id’ may be used uninitialized [-Werror=maybe-uninitialized]
../migration/migration.c:2273:5: error: ‘file’ may be used uninitialized [-Werror=maybe-uninitialized]

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Hyman Huang <yong.huang@smartx.com>

show more ...


# 9b42e33b 17-Sep-2024 Fabiano Rosas <farosas@suse.de>

migration/multifd: Fix rb->receivedmap cleanup race

Fix a segmentation fault in multifd when rb->receivedmap is cleared
too early.

After commit 5ef7e26bdb ("migration/multifd: solve zero page causi

migration/multifd: Fix rb->receivedmap cleanup race

Fix a segmentation fault in multifd when rb->receivedmap is cleared
too early.

After commit 5ef7e26bdb ("migration/multifd: solve zero page causing
multiple page faults"), multifd started using the rb->receivedmap
bitmap, which belongs to ram.c and is initialized and *freed* from the
ram SaveVMHandlers.

Multifd threads are live until migration_incoming_state_destroy(),
which is called after qemu_loadvm_state_cleanup(), leading to a crash
when accessing rb->receivedmap.

process_incoming_migration_co() ...
qemu_loadvm_state() multifd_nocomp_recv()
qemu_loadvm_state_cleanup() ramblock_recv_bitmap_set_offset()
rb->receivedmap = NULL set_bit_atomic(..., rb->receivedmap)
...
migration_incoming_state_destroy()
multifd_recv_cleanup()
multifd_recv_terminate_threads(NULL)

Move the loadvm cleanup into migration_incoming_state_destroy(), after
multifd_recv_cleanup() to ensure multifd threads have already exited
when rb->receivedmap is cleared.

Adjust the postcopy listen thread comment to indicate that we still
want to skip the cpu synchronization.

CC: qemu-stable@nongnu.org
Fixes: 5ef7e26bdb ("migration/multifd: solve zero page causing multiple page faults")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240917185802.15619-3-farosas@suse.de
[peterx: added comment in migration_incoming_state_destroy()]
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 4ce56229087860805877075ddb29dd44578365a9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

show more ...


# 5eff4231 18-Sep-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'migration-20240917-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request for 9.2

- Fabiano's patch to move two tests to slow tests.
- Peter's patch to fix q

Merge tag 'migration-20240917-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request for 9.2

- Fabiano's patch to move two tests to slow tests.
- Peter's patch to fix qatzip builds
- Stefan's multifd-zstd fix on unsigned diff comparisons
- Fea's bug fix to consistently use memattrs when map() address space
- Fabiano's bug fix on multifd race condition against receivedmap

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZuscdRIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wblcQD/amw/nw5LiL8/+QzJtjFTckouyHtCn++I
# x5yVJndm4M4BAOD+b6Pd5byAX1bH7eZ85368ivKLGIZep6qEvICQmw0G
# =wxVI
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 18 Sep 2024 19:31:17 BST
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706

* tag 'migration-20240917-pull-request' of https://gitlab.com/peterx/qemu:
migration/multifd: Fix rb->receivedmap cleanup race
migration/savevm: Remove extra load cleanup calls
softmmu/physmem.c: Keep transaction attribute in address_space_map()
migration/multifd: Fix loop conditions in multifd_zstd_send_prepare and multifd_zstd_recv
migration/multifd: Fix build for qatzip
tests/qtest/migration: Move a couple of slow tests under g_test_slow

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# 4ce56229 17-Sep-2024 Fabiano Rosas <farosas@suse.de>

migration/multifd: Fix rb->receivedmap cleanup race

Fix a segmentation fault in multifd when rb->receivedmap is cleared
too early.

After commit 5ef7e26bdb ("migration/multifd: solve zero page causi

migration/multifd: Fix rb->receivedmap cleanup race

Fix a segmentation fault in multifd when rb->receivedmap is cleared
too early.

After commit 5ef7e26bdb ("migration/multifd: solve zero page causing
multiple page faults"), multifd started using the rb->receivedmap
bitmap, which belongs to ram.c and is initialized and *freed* from the
ram SaveVMHandlers.

Multifd threads are live until migration_incoming_state_destroy(),
which is called after qemu_loadvm_state_cleanup(), leading to a crash
when accessing rb->receivedmap.

process_incoming_migration_co() ...
qemu_loadvm_state() multifd_nocomp_recv()
qemu_loadvm_state_cleanup() ramblock_recv_bitmap_set_offset()
rb->receivedmap = NULL set_bit_atomic(..., rb->receivedmap)
...
migration_incoming_state_destroy()
multifd_recv_cleanup()
multifd_recv_terminate_threads(NULL)

Move the loadvm cleanup into migration_incoming_state_destroy(), after
multifd_recv_cleanup() to ensure multifd threads have already exited
when rb->receivedmap is cleared.

Adjust the postcopy listen thread comment to indicate that we still
want to skip the cpu synchronization.

CC: qemu-stable@nongnu.org
Fixes: 5ef7e26bdb ("migration/multifd: solve zero page causing multiple page faults")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240917185802.15619-3-farosas@suse.de
[peterx: added comment in migration_incoming_state_destroy()]
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# ffeddb97 21-Jun-2024 Richard Henderson <richard.henderson@linaro.org>

Merge tag 'migration-20240621-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Fabiano's fix for fdset + file migration truncating the migration
file

- Fabi

Merge tag 'migration-20240621-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Fabiano's fix for fdset + file migration truncating the migration
file

- Fabiano's fdset + direct-io support for mapped-ram

- Peter's various cleanups (multifd sync, thread names, migration
states, tests)

- Peter's new migration state postcopy-recover-setup

- Philippe's unused vmstate macro cleanup

# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmZ1vIsQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnVZTEACdFIsQ/PJw2C9eeLNor5B5MNSEqUjxX0KN
# 6s/uTkJ/dcv+2PI92SzRCZ1dpR5e9AyjTFYbLc9tPRBIROEhlUaoc84iyEy0jCFU
# eJ65/RQbH5QHRpOZwbN5RmGwnapfOWHGTn3bpdrmSQTOAy8R2TPGY4SVYR+gamTn
# bAv1cAsrOOBUfCi8aqvSlmvuliOW0lzJdF4XHa3mAaigLoF14JdwUZdyIMP1mLDp
# /fllbHCKCvJ1vprE9hQmptBR9PzveJZOZamIVt96djJr5+C869+9PMCn3a5vxqNW
# b+/LhOZjac37Ecg5kgbq+cO1E4EXKC3zWOmDTw8kHUwp9oYNi1upwLdpHbAAZaQD
# /JmHKsExx9QuV8mrVyGBXMI92E6RrT54b1Bjcuo63gAP8p9JRRxGT22U3LghNbTm
# 1XcGPR3rswjT1yTgE6qAqAIMR+7X5MrJVWop9ub/lF5DQ1VYIwmlKSNdwDHFDhRq
# 0F1k2+EksNpcZ0BH2+3iFml7qKHLVupLQKTWcLdrlnQnTfSG3+yW7eyA5Mte79Qp
# nJPcHt8qBqUVQ9Uf/4490TM4Lrp+T+m16exIi0tISLaDXSVkFJnlowipSm+tQ7U3
# Sm68JWdWWEsXZVaMqJeBE8nA/hCoQDpo4hVdwftStI+NayXbRX/EgvPqrNAvwh+c
# i4AdHdn6hQ==
# =ZX0p
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 21 Jun 2024 10:46:51 AM PDT
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D

* tag 'migration-20240621-pull-request' of https://gitlab.com/farosas/qemu: (28 commits)
migration: Remove unused VMSTATE_ARRAY_TEST() macro
tests/migration-tests: Cover postcopy failure on reconnect
tests/migration-tests: Verify postcopy-recover-setup status
tests/migration-tests: migration_event_wait()
tests/migration-tests: Always enable migration events
tests/migration-tests: Drop most WIN32 ifdefs for postcopy failure tests
migration/docs: Update postcopy recover session for SETUP phase
migration/postcopy: Add postcopy-recover-setup phase
migration: Cleanup incoming migration setup state change
migration: Use MigrationStatus instead of int
migration: Rename thread debug names
migration/multifd: Avoid the final FLUSH in complete()
tests/qtest/migration: Add a test for mapped-ram with passing of fds
migration: Add documentation for fdset with multifd + file
monitor: fdset: Match against O_DIRECT
tests/qtest/migration: Add tests for file migration with direct-io
migration/multifd: Add direct-io support
migration: Add direct-io parameter
io: Stop using qemu_open_old in channel-file
monitor: Report errors from monitor_fdset_dup_fd_add
...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

show more ...


# 4146b77e 19-Jun-2024 Peter Xu <peterx@redhat.com>

migration/postcopy: Add postcopy-recover-setup phase

This patch adds a migration state on src called "postcopy-recover-setup".
The new state will describe the intermediate step starting from when th

migration/postcopy: Add postcopy-recover-setup phase

This patch adds a migration state on src called "postcopy-recover-setup".
The new state will describe the intermediate step starting from when the
src QEMU received a postcopy recovery request, until the migration channels
are properly established, but before the recovery process take place.

The request came from Libvirt where Libvirt currently rely on the migration
state events to detect migration state changes. That works for most of the
migration process but except postcopy recovery failures at the beginning.

Currently postcopy recovery only has two major states:

- postcopy-paused: this is the state that both sides of QEMU will be in
for a long time as long as the migration channel was interrupted.

- postcopy-recover: this is the state where both sides of QEMU handshake
with each other, preparing for a continuation of postcopy which used to
be interrupted.

The issue here is when the recovery port is invalid, the src QEMU will take
the URI/channels, noticing the ports are not valid, and it'll silently keep
in the postcopy-paused state, with no event sent to Libvirt. In this case,
the only thing Libvirt can do is to poll the migration status with a proper
interval, however that's less optimal.

Considering that this is the only case where Libvirt won't get a
notification from QEMU on such events, let's add postcopy-recover-setup
state to mimic what we have with the "setup" state of a newly initialized
migration, describing the phase of connection establishment.

With that, postcopy recovery will have two paths to go now, and either path
will guarantee an event generated. Now the events will look like this
during a recovery process on src QEMU:

- Initially when the recovery is initiated on src, QEMU will go from
"postcopy-paused" -> "postcopy-recover-setup". Old QEMUs don't have
this event.

- Depending on whether the channel re-establishment is succeeded:

- In succeeded case, src QEMU will move from "postcopy-recover-setup"
to "postcopy-recover". Old QEMUs also have this event.

- In failure case, src QEMU will move from "postcopy-recover-setup" to
"postcopy-paused" again. Old QEMUs don't have this event.

This guarantees that Libvirt will always receive a notification for
recovery process properly.

One thing to mention is, such new status is only needed on src QEMU not
both. On dest QEMU, the state machine doesn't change. Hence the events
don't change either. It's done like so because dest QEMU may not have an
explicit point of setup start. E.g., it can happen that when dest QEMUs
doesn't use migrate-recover command to use a new URI/channel, but the old
URI/channels can be reused in recovery, in which case the old ports simply
can work again after the network routes are fixed up.

Add a new helper postcopy_is_paused() detecting whether postcopy is still
paused, taking RECOVER_SETUP into account too. When using it on both
src/dst, a slight change is done altogether to always wait for the
semaphore before checking the status, because for both sides a sem_post()
will be required for a recovery.

Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Prasad Pandit <ppandit@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Buglink: https://issues.redhat.com/browse/RHEL-38485
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

show more ...


# 4dd5f7b8 19-Jun-2024 Peter Xu <peterx@redhat.com>

migration: Cleanup incoming migration setup state change

Destination QEMU can setup incoming ports for two purposes: either a fresh
new incoming migration, in which QEMU will switch to SETUP for cha

migration: Cleanup incoming migration setup state change

Destination QEMU can setup incoming ports for two purposes: either a fresh
new incoming migration, in which QEMU will switch to SETUP for channel
establishment, or a paused postcopy migration, in which QEMU will stay in
POSTCOPY_PAUSED until kicking off the RECOVER phase.

Now the state machine worked on dest node for the latter, only because
migrate_set_state() implicitly will become a noop if the current state
check failed. It wasn't clear at all.

Clean it up by providing a helper migration_incoming_state_setup() doing
proper checks over current status. Postcopy-paused will be explicitly
checked now, and then we can bail out for unknown states.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

show more ...


# a5c24e13 19-Jun-2024 Peter Xu <peterx@redhat.com>

migration: Use MigrationStatus instead of int

QEMU uses "int" in most cases even if it stores MigrationStatus. I don't
know why, so let's try to do that right and see what blows up..

Reviewed-by:

migration: Use MigrationStatus instead of int

QEMU uses "int" in most cases even if it stores MigrationStatus. I don't
know why, so let's try to do that right and see what blows up..

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

show more ...


# 60ce4767 19-Jun-2024 Peter Xu <peterx@redhat.com>

migration: Rename thread debug names

The postcopy thread names on dest QEMU are slightly confusing, partly I'll
need to blame myself on 36f62f11e4 ("migration: Postcopy preemption
preparation on cha

migration: Rename thread debug names

The postcopy thread names on dest QEMU are slightly confusing, partly I'll
need to blame myself on 36f62f11e4 ("migration: Postcopy preemption
preparation on channel creation"). E.g., "fault-fast" reads like a fast
version of "fault-default", but it's actually the fast version of
"postcopy/listen".

Taking this chance, rename all the migration threads with proper rules.
Considering we only have 15 chars usable, prefix all threads with "mig/",
meanwhile identify src/dst threads properly this time. So now most thread
names will look like "mig/DIR/xxx", where DIR will be "src"/"dst", except
the bg-snapshot thread which doesn't have a direction.

For multifd threads, making them "mig/{src|dst}/{send|recv}_%d".

We used to have "live_migration" thread for a very long time, now it's
called "mig/src/main". We may hope to have "mig/dst/main" soon but not
yet.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Zhijian Li (Fujitsu) <lizhijian@fujitsu.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

show more ...


# 9d70239e 17-Jun-2024 Fabiano Rosas <farosas@suse.de>

migration/multifd: Add direct-io support

When multifd is used along with mapped-ram, we can take benefit of a
filesystem that supports the O_DIRECT flag and perform direct I/O in
the multifd threads

migration/multifd: Add direct-io support

When multifd is used along with mapped-ram, we can take benefit of a
filesystem that supports the O_DIRECT flag and perform direct I/O in
the multifd threads. This brings a significant performance improvement
because direct-io writes bypass the page cache which would otherwise
be thrashed by the multifd data which is unlikely to be needed again
in a short period of time.

To be able to use a multifd channel opened with O_DIRECT, we must
ensure that a certain aligment is used. Filesystems usually require a
block-size alignment for direct I/O. The way to achieve this is by
enabling the mapped-ram feature, which already aligns its I/O properly
(see MAPPED_RAM_FILE_OFFSET_ALIGNMENT at ram.c).

By setting O_DIRECT on the multifd channels, all writes to the same
file descriptor need to be aligned as well, even the ones that come
from outside multifd, such as the QEMUFile I/O from the main migration
code. This makes it impossible to use the same file descriptor for the
QEMUFile and for the multifd channels. The various flags and metadata
written by the main migration code will always be unaligned by virtue
of their small size. To workaround this issue, we'll require a second
file descriptor to be used exclusively for direct I/O.

The second file descriptor can be obtained by QEMU by re-opening the
migration file (already possible), or by being provided by the user or
management application (support to be added in future patches).

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

show more ...


# 7e1c0047 22-May-2024 Richard Henderson <richard.henderson@linaro.org>

Merge tag 'migration-20240522-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Li Zhijian's COLO minor fixes
- Marc-André's virtio-gpu fix
- Fiona's virtio-net

Merge tag 'migration-20240522-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Li Zhijian's COLO minor fixes
- Marc-André's virtio-gpu fix
- Fiona's virtio-net USO fix
- A couple of migration-test fixes from Thomas

# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmZObggQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnWE8D/49RGE+g29qyk9aKx3lU8mSq+ZzmX5GncBt
# 5+Mx5qoHDsBCQTE+dQpEVIoeMJ2HIbgbOML4qsnp6Hw/4/TWkfwC/R6+ZmHBevRk
# fVLkVh2JMHVg8Tq+0FO1X1QnMU03uJ7EAuWdDa8HqlJ5dQY/K3gDaku8oQBXk96X
# 13pChSbMob76tdb+wiwbdEakabigH7XfrPdI6lzI8MCGTIcPKc/UKTFYuoj/OsNx
# raqy+uBtvKtfHxiaYnIgHIPNAF/1f4tP3iAOcPoZWIMXWxFkE8+ANDJAbWo6xIcL
# DGg/wEzZO/OnXLjOhjvLBUHK/fx4wQ5bsqA09BVxoRyBGblkXr+bcwBLYjgiEqzT
# aniPiAx5W/Db+T7HqZPIWesFYj3cmcwvYUTrx/RPMdC0epG+ZczDMtescHdZbxvt
# Pjs3nFeCLhyYcVhlTI72eXRCxdd/26+r6/OmrBC2+GaZrybM61TvNo+3XvO0Pfhi
# UmwF2EN27XmSMelLvH/MnflUVgBHKDs3CCQzDlxreHq2jMVR0SL7LU5wMJJ58Iok
# M3u74izQM25bwYxiASH+4iRn0puH1mOwgOx28W0uiQfZY/678/lCnwa1Tul15BRE
# fIQZJhyIGzhSpwLqEXmdXdlLQs1isqIgpd/mzKgZ285nLr7kz+4gxCUqiXgVbrl7
# P45Dym1u4g==
# =DDrh
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 03:13:28 PM PDT
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D

* tag 'migration-20240522-pull-request' of https://gitlab.com/farosas/qemu:
tests/qtest/migration-test: Fix the check for a successful run of analyze-migration.py
tests/qtest/migration-test: Run some basic tests on s390x and ppc64 with TCG, too
hw/core/machine: move compatibility flags for VirtIO-net USO to machine 8.1
virtio-gpu: fix v2 migration
migration: fix a typo
migration: add "exists" info to load-state-field trace
migration/colo: Tidy up bql_unlock() around bdrv_activate_all()
migration/colo: make colo_incoming_co() return void
migration/colo: Minor fix for colo error message

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

show more ...


# 787ea49e 15-May-2024 Li Zhijian <lizhijian@fujitsu.com>

migration/colo: make colo_incoming_co() return void

Currently, it always returns 0, no need to check the return value at all.
In addition, enter colo coroutine only if migration_incoming_colo_enable

migration/colo: make colo_incoming_co() return void

Currently, it always returns 0, no need to check the return value at all.
In addition, enter colo coroutine only if migration_incoming_colo_enabled()
is true.
Once the destination side enters the COLO* state, the COLO process will
take over the remaining processes until COLO exits.

Cc: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
[fixed mangled author email address]
Signed-off-by: Fabiano Rosas <farosas@suse.de>

show more ...


# 55a33165 15-May-2024 Li Zhijian <lizhijian@fujitsu.com>

migration/colo: Minor fix for colo error message

- Explicitly show the missing module name: replication
- Fix capability name to x-colo

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Zhang

migration/colo: Minor fix for colo error message

- Explicitly show the missing module name: replication
- Fix capability name to x-colo

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Suggested-by: Michael Tokarev <mjt@tls.msk.ru>
[fixed mangled author email address]
Signed-off-by: Fabiano Rosas <farosas@suse.de>

show more ...


12345678910>>...40