History log of /openbmc/qemu/migration/trace-events (Results 1 – 25 of 182)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v9.2.0, v9.1.2
# cbad4551 04-Nov-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'migration-20241030-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request for softfreeze

v2:
- Patch "migration: Move cpu-throttle.c from system to migration

Merge tag 'migration-20241030-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request for softfreeze

v2:
- Patch "migration: Move cpu-throttle.c from system to migration",
fix build on MacOS, and subject spelling

NOTE: checkpatch.pl could report a false positive on this branch:

WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#21:
{include/sysemu => migration}/cpu-throttle.h | 0

That's covered by "F: migration/" entry.

Changelog:

- Peter's cleanup patch on migrate_fd_cleanup()
- Peter's cleanup patch to introduce thread name macros
- Hanna's error path fix for vmstate subsection save()s
- Hyman's auto converge enhancement on background dirty sync
- Peter's additional tracepoints for save state entries
- Thomas's build fix for OpenBSD in dirtyrate.c
- Peter's deprecation of query-migrationthreads command
- Peter's cleanup/fixes from the "export misc.h" series
- Maciej's two small patches from multifd+vfio series

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZyTbVRIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wan3wD+L4TVNDc34Hy4mvWu7u1lCOePX0GBdUEc
# oEeBGblwbrcBAIR8d+5z9O5YcWH1coozG1aUC4qCtSHHk5TGbJk4/UUD
# =XB5Q
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 01 Nov 2024 13:44:53 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706

* tag 'migration-20241030-pull-request' of https://gitlab.com/peterx/qemu:
migration/multifd: Zero p->flags before starting filling a packet
migration/ram: Add load start trace event
migration: Drop migration_is_idle()
migration: Drop migration_is_setup_or_active()
migration: Unexport ram_mig_init()
migration: Unexport dirty_bitmap_mig_init()
migration: Take migration object refcount earlier for threads
migration: Deprecate query-migrationthreads command
migration/dirtyrate: Silence warning about strcpy() on OpenBSD
tests/migration: Add case for periodic ramblock dirty sync
migration: Support periodic RAMBlock dirty bitmap sync
migration: Remove "rs" parameter in migration_bitmap_sync_precopy
migration: Move cpu-throttle.c from system to migration
migration: Stop CPU throttling conditionally
accel/tcg/icount-common: Remove the reference to the unused header file
migration: Ensure vmstate_save() sets errp
migration: Put thread names together with macros
migration: Cleanup migrate_fd_cleanup() on accessing to_dst_file

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# b0350c51 29-Oct-2024 Maciej S. Szmigiero <maciej.szmigiero@oracle.com>

migration/ram: Add load start trace event

There's a RAM load complete trace event but there wasn't its start equivalent.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by

migration/ram: Add load start trace event

There's a RAM load complete trace event but there wasn't its start equivalent.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/94ddfa7ecb83a78f73b82867dd30c8767592d257.1730203967.git.maciej.szmigiero@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


Revision tags: v9.1.1
# 52ac968a 17-Oct-2024 Hyman Huang <yong.huang@smartx.com>

migration: Support periodic RAMBlock dirty bitmap sync

When VM is configured with huge memory, the current throttle logic
doesn't look like to scale, because migration_trigger_throttle()
is only cal

migration: Support periodic RAMBlock dirty bitmap sync

When VM is configured with huge memory, the current throttle logic
doesn't look like to scale, because migration_trigger_throttle()
is only called for each iteration, so it won't be invoked for a long
time if one iteration can take a long time.

The periodic dirty sync aims to fix the above issue by synchronizing
the ramblock from remote dirty bitmap and, when necessary, triggering
the CPU throttle multiple times during a long iteration.

This is a trade-off between synchronization overhead and CPU throttle
impact.

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/f61f1b3653f2acf026901103e1c73d157d38b08f.1729146786.git.yong.huang@smartx.com
[peterx: make prev_cnt global, and reset for each migration]
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# d481cec7 17-Oct-2024 Hyman Huang <yong.huang@smartx.com>

migration: Move cpu-throttle.c from system to migration

Move cpu-throttle.c from system to migration since it's
only used for migration; this makes us avoid exporting the
util functions and variable

migration: Move cpu-throttle.c from system to migration

Move cpu-throttle.c from system to migration since it's
only used for migration; this makes us avoid exporting the
util functions and variables in misc.h but export them in
migration.h when implementing the periodic ramblock dirty
sync feature in the upcoming commits.

Since CPU throttle timers are only used in migration, move
their registry to migration_object_init.

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/c1b3efaa0cb49e03d422e9da97bdb65cc3d234d1.1729146786.git.yong.huang@smartx.com
[peterx: Fix build on MacOS on cocoa.m, not move cpu-throttle.h yet]
[peterx: Fix subject spelling, per pm215]
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# becd6944 06-Sep-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'migration-20240904-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Steve's cleanup of unused variable
- Peter Maydell's fixes for several leaks in

Merge tag 'migration-20240904-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Steve's cleanup of unused variable
- Peter Maydell's fixes for several leaks in migration-test
- Fabiano's flexibilization of multifd data structures for device
state migration
- Arman Nabiev's fix for ppc e500 migration
- Thomas' fix for migration-test vs. --without-default-devices

# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmbYVXwQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnRucEAC1vo046UGdUmbb4PaF5vKAg97io6RB2nrH
# HMz56Yc0AcAKRUGwe2Z80e2jY8B6zi8Ha8b9l7cVsej095eGCF+tINIL4wRX4lHm
# alDY/LkhuqjE5g5c/DaeTztyBOFLvdWHPU5eJyDOC9r7kSlnUcL1gAslH23b8uL0
# xvhPVKaTWjGIzNL1q/XfBr1WgRGqfD6dYb32HJDTq85yOnUT5sEr55aoEEu0euKh
# MYbXPmi5AMbrp8nP21kzUopX8iYERRdoKwhF0ZssciGi/qJVevH70tNdbDEQSxyp
# +vtP54TnL3LrzD4uY5Snng9zT9h0QrZujY79OEcxu20U0s29OQaudWkIjp7yLLUv
# UnPZHS+bIyaS53DdpV94GKGGBX1wrjGC/sn8eGYzmb2yMlMjLTBoE8L5r9cadshX
# XTeF4MtKGqaS3xDM2fIgACHHFl6qr/l0nENspv0raFzpf9Jx/WbpekghvTuWN6/B
# pZHnoOTNiAqXS/Rnyy829vsQ0Pw4hi6wx79Z73RP+35ubZTgTmOsQx9f2FjuEh6k
# JS+q9k4VJ+nntUWsYn4GS1Jlt+FXJ2hfzNj1NNFN4xLT1oioc6pCHsQyV7SBArB1
# ml2zYyfKCTC3riIRhcv/ew6OcKbhHcPFOpd/v0y40LO3mx8S0LZnUWXkcrl3XIZS
# Mj5CBdlFgA==
# =SRN4
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 04 Sep 2024 13:41:32 BST
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D

* tag 'migration-20240904-pull-request' of https://gitlab.com/farosas/qemu: (34 commits)
tests/qtest/migration: Add a check for the availability of the "pc" machine
target/ppc: Fix migration of CPUs with TLB_EMB TLB type
migration/multifd: Add documentation for multifd methods
migration/multifd: Add a couple of asserts for p->iov
migration/multifd: Fix p->iov leak in multifd-uadk.c
migration/multifd: Stop changing the packet on recv side
migration/multifd: Make MultiFDMethods const
migration/multifd: Move nocomp code into multifd-nocomp.c
migration/multifd: Register nocomp ops dynamically
migration/multifd: Standardize on multifd ops names
migration/multifd: Allow multifd sync without flush
migration/multifd: Replace multifd_send_state->pages with client data
migration/multifd: Don't send ram data during SYNC
migration/multifd: Isolate ram pages packet data
migration/multifd: Remove total pages tracing
migration/multifd: Move pages accounting into multifd_send_zero_page_detect()
migration/multifd: Replace p->pages with an union pointer
migration/multifd: Make MultiFDPages_t:offset a flexible array member
migration/multifd: Introduce MultiFDSendData
migration/multifd: Pass in MultiFDPages_t to file_write_ramblock_iov
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


Revision tags: v9.1.0
# 87bb9e95 27-Aug-2024 Fabiano Rosas <farosas@suse.de>

migration/multifd: Isolate ram pages packet data

While we cannot yet disentangle the multifd packet from page data, we
can make the code a bit cleaner by setting the page-related fields in
a separat

migration/multifd: Isolate ram pages packet data

While we cannot yet disentangle the multifd packet from page data, we
can make the code a bit cleaner by setting the page-related fields in
a separate function.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

show more ...


# 96d396bf 27-Aug-2024 Fabiano Rosas <farosas@suse.de>

migration/multifd: Remove total pages tracing

The total_normal_pages and total_zero_pages elements are used only for
the end tracepoints of the multifd threads. These are not super useful
since they

migration/multifd: Remove total pages tracing

The total_normal_pages and total_zero_pages elements are used only for
the end tracepoints of the multifd threads. These are not super useful
since they record per-channel numbers and are just the sum of all the
pages that are transmitted per-packet, for which we already have
tracepoints. Remove the totals from the tracing.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

show more ...


# 7e1c0047 22-May-2024 Richard Henderson <richard.henderson@linaro.org>

Merge tag 'migration-20240522-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Li Zhijian's COLO minor fixes
- Marc-André's virtio-gpu fix
- Fiona's virtio-net

Merge tag 'migration-20240522-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Li Zhijian's COLO minor fixes
- Marc-André's virtio-gpu fix
- Fiona's virtio-net USO fix
- A couple of migration-test fixes from Thomas

# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmZObggQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnWE8D/49RGE+g29qyk9aKx3lU8mSq+ZzmX5GncBt
# 5+Mx5qoHDsBCQTE+dQpEVIoeMJ2HIbgbOML4qsnp6Hw/4/TWkfwC/R6+ZmHBevRk
# fVLkVh2JMHVg8Tq+0FO1X1QnMU03uJ7EAuWdDa8HqlJ5dQY/K3gDaku8oQBXk96X
# 13pChSbMob76tdb+wiwbdEakabigH7XfrPdI6lzI8MCGTIcPKc/UKTFYuoj/OsNx
# raqy+uBtvKtfHxiaYnIgHIPNAF/1f4tP3iAOcPoZWIMXWxFkE8+ANDJAbWo6xIcL
# DGg/wEzZO/OnXLjOhjvLBUHK/fx4wQ5bsqA09BVxoRyBGblkXr+bcwBLYjgiEqzT
# aniPiAx5W/Db+T7HqZPIWesFYj3cmcwvYUTrx/RPMdC0epG+ZczDMtescHdZbxvt
# Pjs3nFeCLhyYcVhlTI72eXRCxdd/26+r6/OmrBC2+GaZrybM61TvNo+3XvO0Pfhi
# UmwF2EN27XmSMelLvH/MnflUVgBHKDs3CCQzDlxreHq2jMVR0SL7LU5wMJJ58Iok
# M3u74izQM25bwYxiASH+4iRn0puH1mOwgOx28W0uiQfZY/678/lCnwa1Tul15BRE
# fIQZJhyIGzhSpwLqEXmdXdlLQs1isqIgpd/mzKgZ285nLr7kz+4gxCUqiXgVbrl7
# P45Dym1u4g==
# =DDrh
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 03:13:28 PM PDT
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D

* tag 'migration-20240522-pull-request' of https://gitlab.com/farosas/qemu:
tests/qtest/migration-test: Fix the check for a successful run of analyze-migration.py
tests/qtest/migration-test: Run some basic tests on s390x and ppc64 with TCG, too
hw/core/machine: move compatibility flags for VirtIO-net USO to machine 8.1
virtio-gpu: fix v2 migration
migration: fix a typo
migration: add "exists" info to load-state-field trace
migration/colo: Tidy up bql_unlock() around bdrv_activate_all()
migration/colo: make colo_incoming_co() return void
migration/colo: Minor fix for colo error message

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

show more ...


# 3f879f2f 16-May-2024 Marc-André Lureau <marcandre.lureau@redhat.com>

migration: add "exists" info to load-state-field trace

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fiona Ebner <f.ebner@prox

migration: add "exists" info to load-state-field trace

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

show more ...


# a016dd50 09-May-2024 Richard Henderson <richard.henderson@linaro.org>

Merge tag 'migration-20240508-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Will's WITH_QEMU_LOCK_GUARD cleanup
- Vladimir's new exit-on-error parameter
- F

Merge tag 'migration-20240508-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Will's WITH_QEMU_LOCK_GUARD cleanup
- Vladimir's new exit-on-error parameter
- Fabiano's removals and deprecations series
(block migration and non-multifd compression removed)
- Peter's documentation fix for HMP migrate command

v2:
- updated Peter's documentation fix.

# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmY7934QHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnXynEADHjRa7HqwuYPhft3wGgLiFbCyQNFpNrjM9
# prQSiLlYt9gRlE4c9ZavCxR28xtOrK2oFhCnLMXaIEkct6JuylfiwCPwPuxNQP9+
# EZirECf1yKkyt+RV/LfIx3R/prJgoH5XWhpna+WIBFFo2qSorHTAzjb5dKYZDjkB
# EjfN8R9goVH6aCPd4SyiCUUNxuR6/0si9AxfhUgUvUXyLZmE1ztZEoWI02FCYzVj
# kKDdVK2+Z1Rlv88tyY4/E6z4pwYLWx5EiXSFv0NXIpTdyO3dM+jeAHxcN7KmQ1+5
# GvX0n+mFYOzRIbRfAnhSZbkez/nuPcbJ76phzSYDs8f/7YtOpuOFKFw7yuGrl5N5
# ZqXo5MOOGliF2wozTjacsOrUhB+MbSb0iA71T7aAdBC2s4H9+XIWfoN/OZfsBhAW
# r2i1gSytVLQqsip7A0CFF+DqeSse9QHHlH8vfb8NUn1Tp0o2QfsX+/7LHlvl/2eJ
# EP/zmjD6c/8vjB3fTKZr52h2lEO/36xmX+OtZpep3EBvvl1BY1LP4nBNOW1vQM/b
# fzcq+agaikwS5gI2QSOC9HJ3aX6q416+wZEm3rQ8XRGSPDFfLPKM/GPPfWdj6ngb
# +e3EZPrs+3dOeH1kly5xVMGXGUof+VVBmVwdv4C+XNMM8fRZOxoqd0SD8dz/vOC7
# nSGztXUPqw==
# =5T+K
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 09 May 2024 12:06:54 AM CEST
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D

* tag 'migration-20240508-pull-request' of https://gitlab.com/farosas/qemu:
hmp/migration: Fix "migrate" command's documentation
migration: Deprecate fd: for file migration
migration: Remove non-multifd compression
migration: Remove block migration
migration: Remove 'blk/-b' option from migrate commands
migration: Remove 'inc' option from migrate command
migration: Remove 'skipped' field from MigrationStats
qapi: introduce exit-on-error parameter for migrate-incoming
migration: process_incoming_migration_co(): rework error reporting
migration: process_incoming_migration_co(): fix reporting s->error
migration: process_incoming_migration_co(): complete cleanup on failure
migration: move trace-point from migrate_fd_error to migrate_set_error
migration/ram.c: API Conversion qemu_mutex_lock(), and qemu_mutex_unlock() to WITH_QEMU_LOCK_GUARD macro

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

show more ...


# d4a17b8f 30-Apr-2024 Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

migration: move trace-point from migrate_fd_error to migrate_set_error

Cover more cases by trace-point.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fabiano

migration: move trace-point from migrate_fd_error to migrate_set_error

Cover more cases by trace-point.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

show more ...


# 8f3f329f 12-Mar-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request

- Avihai's fix to allow vmstate iterators to not starve for VFIO
- Maksim's fix on

Merge tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request

- Avihai's fix to allow vmstate iterators to not starve for VFIO
- Maksim's fix on additional check on precopy load error
- Fabiano's fix on fdatasync() hang in mapped-ram
- Jonathan's fix on vring cached access over MMIO regions
- Cedric's cleanup patches 1-4 out of his error report series
- Yu's fix for RDMA migration (which used to be broken even for 8.2)
- Anthony's small cleanup/fix on err message
- Steve's patches on privatize migration.h
- Xiang's patchset to enable zero page detections in multifd threads

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZe9+uBIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wamaQD/SvmpMEcuRndT9LPSxzXowAGDZTBpYUfv
# 5XAbx80dS9IBAO8PJJgQJIBHBeacyLBjHP9CsdVtgw5/VW+wCsbfV4AB
# =xavb
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 11 Mar 2024 21:59:20 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706

* tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu: (34 commits)
migration/multifd: Add new migration test cases for legacy zero page checking.
migration/multifd: Enable multifd zero page checking by default.
migration/multifd: Implement ram_save_target_page_multifd to handle multifd version of MigrationOps::ram_save_target_page.
migration/multifd: Implement zero page transmission on the multifd thread.
migration/multifd: Add new migration option zero-page-detection.
migration/multifd: Allow clearing of the file_bmap from multifd
migration/multifd: Allow zero pages in file migration
migration: purge MigrationState from public interface
migration: delete unused accessors
migration: privatize colo interfaces
migration: migration_file_set_error
migration: migration_is_device
migration: migration_thread_is_self
migration: export vcpu_dirty_limit_period
migration: export migration_is_running
migration: export migration_is_active
migration: export migration_is_setup_or_active
migration: remove migration.h references
migration: export fewer options
migration: Fix format in error message
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# 303e6f54 11-Mar-2024 Hao Xiang <hao.xiang@bytedance.com>

migration/multifd: Implement zero page transmission on the multifd thread.

1. Add zero_pages field in MultiFDPacket_t.
2. Implements the zero page detection and handling on the multifd
threads for n

migration/multifd: Implement zero page transmission on the multifd thread.

1. Add zero_pages field in MultiFDPacket_t.
2. Implements the zero page detection and handling on the multifd
threads for non-compression, zlib and zstd compression backends.
3. Added a new value 'multifd' in ZeroPageDetection enumeration.
4. Adds zero page counters and updates multifd send/receive tracing
format to track the newly added counters.

Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240311180015.3359271-5-hao.xiang@linux.dev
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# c90cfb52 05-Mar-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu into staging

Migartion pull request for 20240304

- Bryan's fix on multifd compression level API
- Fabiano's mapped-ram seri

Merge tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu into staging

Migartion pull request for 20240304

- Bryan's fix on multifd compression level API
- Fabiano's mapped-ram series (base + multifd only)
- Steve's amend on cpr document in qapi/

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZeUjKhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbv5QD/ZexBUsmZA5qyxgGvZ2yvlUBEGNOvtmKY
# kRdiYPU7khMA/0N43rn4LcqKCoq4+T+EAnYizGjIyhH/7BRUyn4DUxgO
# =AeEn
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Mar 2024 01:26:02 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706

* tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu: (27 commits)
migration/multifd: Document two places for mapped-ram
tests/qtest/migration: Add a multifd + mapped-ram migration test
migration/multifd: Add mapped-ram support to fd: URI
migration/multifd: Support incoming mapped-ram stream format
migration/multifd: Support outgoing mapped-ram stream format
migration/multifd: Prepare multifd sync for mapped-ram migration
migration/multifd: Add incoming QIOChannelFile support
migration/multifd: Add outgoing QIOChannelFile support
migration/multifd: Add a wrapper for channels_created
migration/multifd: Allow receiving pages without packets
migration/multifd: Allow multifd without packets
migration/multifd: Decouple recv method from pages
migration/multifd: Rename MultiFDSend|RecvParams::data to compress_data
tests/qtest/migration: Add tests for mapped-ram file-based migration
migration/ram: Add incoming 'mapped-ram' migration
migration/ram: Add outgoing 'mapped-ram' migration
migration: Add mapped-ram URI compatibility check
migration/ram: Introduce 'mapped-ram' migration capability
migration/qemu-file: add utility methods for working with seekable channels
io: fsync before closing a file channel
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
# migration/ram.c

show more ...


# 4aac6b1e 29-Feb-2024 Fabiano Rosas <farosas@suse.de>

migration/multifd: Cleanup multifd_recv_sync_main

Some minor cleanups and documentation for multifd_recv_sync_main.

Use thread_count as done in other parts of the code. Remove p->id from
the multif

migration/multifd: Cleanup multifd_recv_sync_main

Some minor cleanups and documentation for multifd_recv_sync_main.

Use thread_count as done in other parts of the code. Remove p->id from
the multifd_recv_state sync, since that is global and not tied to a
channel. Add documentation for the sync steps.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# 5d1fc614 09-Feb-2024 Peter Maydell <peter.maydell@linaro.org>

Merge tag 'migration-staging-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull

- William's fix on hwpoison migration which used to crash QEMU
- Peter's multifd cleanup + b

Merge tag 'migration-staging-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull

- William's fix on hwpoison migration which used to crash QEMU
- Peter's multifd cleanup + bugfix + optimizations
- Avihai's fix on multifd crash over non-socket channels
- Fabiano's multifd thread-race fix
- Peter's CI fix series

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZcREtRIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wacrwEAl2aeQkh51h/e+OKX7MG4/4Y6Edf6Oz7o
# IJLk/cyrUFQA/2exo2lOdv5zHNOJKwAYj8HYDraezrC/MK1eED4Wji0M
# =k53l
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 08 Feb 2024 03:04:21 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706

* tag 'migration-staging-pull-request' of https://gitlab.com/peterx/qemu: (34 commits)
ci: Update comment for migration-compat-aarch64
ci: Remove tag dependency for build-previous-qemu
tests/migration-test: Stick with gicv3 in aarch64 test
migration/multifd: Add a synchronization point for channel creation
migration/multifd: Unify multifd and TLS connection paths
migration/multifd: Move multifd_send_setup into migration thread
migration/multifd: Move multifd_send_setup error handling in to the function
migration/multifd: Remove p->running
migration/multifd: Join the TLS thread
migration: Fix logic of channels and transport compatibility check
migration/multifd: Optimize sender side to be lockless
migration/multifd: Fix MultiFDSendParams.packet_num race
migration/multifd: Stick with send/recv on function names
migration/multifd: Cleanup multifd_load_cleanup()
migration/multifd: Cleanup multifd_save_cleanup()
migration/multifd: Rewrite multifd_queue_page()
migration/multifd: Change retval of multifd_send_pages()
migration/multifd: Change retval of multifd_queue_page()
migration/multifd: Split multifd_send_terminate_threads()
migration/multifd: Forbid spurious wakeups
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# 3ab4441d 02-Feb-2024 Peter Xu <peterx@redhat.com>

migration/multifd: Split multifd_send_terminate_threads()

Split multifd_send_terminate_threads() into two functions:

- multifd_send_set_error(): used when an error happened on the sender
side

migration/multifd: Split multifd_send_terminate_threads()

Split multifd_send_terminate_threads() into two functions:

- multifd_send_set_error(): used when an error happened on the sender
side, set error and quit state only

- multifd_send_terminate_threads(): used only by the main thread to kick
all multifd send threads out of sleep, for the last recycling.

Use multifd_send_set_error() in the three old call sites where only the
error will be set.

Use multifd_send_terminate_threads() in the last one where the main thread
will kick the multifd threads at last in multifd_save_cleanup().

Both helpers will need to set quitting=1.

Suggested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-16-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

show more ...


# 75b7b25d 02-Nov-2023 Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'migration-20231102-pull-request' of https://gitlab.com/juan.quintela/qemu into staging

Migration Pull request (20231102)

Hi

In this pull request:

- migration reboot mode (steve)
* I

Merge tag 'migration-20231102-pull-request' of https://gitlab.com/juan.quintela/qemu into staging

Migration Pull request (20231102)

Hi

In this pull request:

- migration reboot mode (steve)
* I disabled the test because our CI don't like programs using so
much shared memory. Searching for a fix.
- test for postcopy recover (fabiano)
- MigrateAddress QAPI (het)
- better return path error handling (peter)
- traces for downtime (peter)
- vmstate_register() check for duplicates (juan)
thomas find better solutions for s390x and ipmi.
now also works on s390x

Please, apply.

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmVDipMACgkQ9IfvGFhy
# 1yNYnQ/9E5Cywsoqljqa/9FiKBSII2qMrmkfu6JLKqePnsh5pFZiukbudYRuJCCe
# ZTDEmD0NmKRJbDx2xRU1qx/e6gKJy+gz37KP89Buuh/WwZHPboPYtxQpGvCSiH26
# J3i+1+TgaqmkLzcO35wa8tp6gneQclWeAwKgMvdb4cm2pJEhgWRKI62ccyLzxeve
# UCzFQn60t55ETyVZGnRD4YwdTQvGKH+DPlyTuJOLR3DePuvZd8EdH+ypvB4RLAy7
# 3+CuQOxmF5LRXPbpJuAeOsudbmhhHzrO/yL7ZmsiKQTthsJv+SzC1bO94jhQrawZ
# Q7GCii5KpGq0KnRTRKZRGk6XKwxcYRduXMX3R5tXuVmDmCZsjhXzziU8yEdftph8
# 5TJdk1o0Gb043EFu81mrsQYS+9yJqe6sy6m3PTJaec54cAty5ln+c17WOvpAOaSV
# +1phe05ftuVPmQ3KWhbIR/tCmavNLwEZxpVIfyaKJx04bFbtQ9gRpRyURORX4KXc
# s4WXvNirQEohxYBnP4TPvA09xBTW3V08pk/wRDwt0YDXnLiqCltOuxD8r05K8K4B
# MkCLcWj0g7he2tBkF60oz1KSIE0oTB81um9AzLIv5F2YSYLaJM5BIcoC437MR2f4
# MOR7drR1fP5GsRu/SeU5BWvhVq3IvdOxR7G2MLNRJJvl7ZtGXDc=
# =uaqL
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 02 Nov 2023 19:40:03 HKT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723

* tag 'migration-20231102-pull-request' of https://gitlab.com/juan.quintela/qemu: (40 commits)
migration: modify test_multifd_tcp_none() to use new QAPI syntax.
migration: Implement MigrateChannelList to hmp migration flow.
migration: Implement MigrateChannelList to qmp migration flow.
migration: modify migration_channels_and_uri_compatible() for new QAPI syntax
migration: New migrate and migrate-incoming argument 'channels'
migration: Convert the file backend to the new QAPI syntax
migration: convert exec backend to accept MigrateAddress.
migration: convert rdma backend to accept MigrateAddress
migration: convert socket backend to accept MigrateAddress
migration: convert migration 'uri' into 'MigrateAddress'
migration: New QAPI type 'MigrateAddress'
migration: Change ram_dirty_bitmap_reload() retval to bool
tests/migration-test: Add a test for postcopy hangs during RECOVER
migration: Allow network to fail even during recovery
migration: Refactor error handling in source return path
tests/qtest: migration: add reboot mode test
cpr: reboot mode
cpr: relax vhost migration blockers
cpr: relax blockdev migration blockers
migration: per-mode blockers
...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

show more ...


# 7aa6070d 17-Oct-2023 Peter Xu <peterx@redhat.com>

migration: Refactor error handling in source return path

rp_state.error was a boolean used to show error happened in return path
thread. That's not only duplicating error reporting (migrate_set_err

migration: Refactor error handling in source return path

rp_state.error was a boolean used to show error happened in return path
thread. That's not only duplicating error reporting (migrate_set_error),
but also not good enough in that we only do error_report() and set it to
true, we never can keep a history of the exact error and show it in
query-migrate.

To make this better, a few things done:

- Use error_setg() rather than error_report() across the whole lifecycle
of return path thread, keeping the error in an Error*.

- With above, no need to have mark_source_rp_bad(), remove it, alongside
with rp_state.error itself.

- Use migrate_set_error() to apply that captured error to the global
migration object when error occured in this thread.

- Do the same when detected qemufile error in source return path

We need to re-export qemu_file_get_error_obj() to do the last one.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231017202633.296756-2-peterx@redhat.com>

show more ...


# 3e5f3bcd 30-Oct-2023 Peter Xu <peterx@redhat.com>

migration: Add tracepoints for downtime checkpoints

This patch is inspired by Joao Martin's patch here:

https://lore.kernel.org/r/20230926161841.98464-1-joao.m.martins@oracle.com

Add tracepoints f

migration: Add tracepoints for downtime checkpoints

This patch is inspired by Joao Martin's patch here:

https://lore.kernel.org/r/20230926161841.98464-1-joao.m.martins@oracle.com

Add tracepoints for major downtime checkpoints on both src and dst. They
share the same tracepoint with a string showing its stage.

Besides the checkpoints in the previous patch, this patch also added
destination checkpoints.

On src, we have these checkpoints added:

- src-downtime-start: right before vm stops on src
- src-vm-stopped: after vm is fully stopped
- src-iterable-saved: after all iterables saved (END sections)
- src-non-iterable-saved: after all non-iterable saved (FULL sections)
- src-downtime-stop: migration fully completed

On dst, we have these checkpoints added:

- dst-precopy-loadvm-completes: after loadvm all done for precopy
- dst-precopy-bh-*: record BH steps to resume VM for precopy
- dst-postcopy-bh-*: record BH steps to resume VM for postcopy

On dst side, we don't have a good way to trace total time consumed by
iterable or non-iterable for now. We can mark it by 1st time receiving a
FULL / END section, but rather than that let's just rely on the other
tracepoints added for vmstates to back up the information.

With this patch, one can enable "vmstate_downtime*" tracepoints and it'll
enable all tracepoints for downtime measurements necessary.

Drop loadvm_postcopy_handle_run_bh() tracepoint alongside, because they
service the same purpose, which was only for postcopy. We then have
unified prefix for all downtime relevant tracepoints.

Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231030163346.765724-6-peterx@redhat.com>

show more ...


# 3c80f142 30-Oct-2023 Peter Xu <peterx@redhat.com>

migration: Add per vmstate downtime tracepoints

We have a bunch of savevm_section* tracepoints, they're good to analyze
migration stream, but not always suitable if someone would like to analyze
the

migration: Add per vmstate downtime tracepoints

We have a bunch of savevm_section* tracepoints, they're good to analyze
migration stream, but not always suitable if someone would like to analyze
the migration downtime. Two major problems:

- savevm_section* tracepoints are dumping all sections, we only care
about the sections that contribute to the downtime

- They don't have an identifier to show the type of sections, so no way
to filter downtime information either easily.

We can add type into the tracepoints, but instead of doing so, this patch
kept them untouched, instead of adding a bunch of downtime specific
tracepoints, so one can enable "vmstate_downtime*" tracepoints and get a
full picture of how the downtime is distributed across iterative and
non-iterative vmstate save/load.

Note that here both save() and load() need to be traced, because both of
them may contribute to the downtime. The contribution is not a simple "add
them together", though: consider when the src is doing a save() of device1
while the dest can be load()ing for device2, so they can happen
concurrently.

Tracking both sides make sense because device load() and save() can be
imbalanced, one device can save() super fast, but load() super slow, vice
versa. We can't figure that out without tracing both.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231030163346.765724-4-peterx@redhat.com>

show more ...


# ec6f9f13 17-Oct-2023 Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'migration-20231017-pull-request' of https://gitlab.com/juan.quintela/qemu into staging

Migration Pull request (20231017)

Hi

Same that yesterday one, except:
- rebased to latest (clean r

Merge tag 'migration-20231017-pull-request' of https://gitlab.com/juan.quintela/qemu into staging

Migration Pull request (20231017)

Hi

Same that yesterday one, except:
- rebased to latest (clean rebase)
- fixed 64 bits read on big endian host

CI: https://gitlab.com/juan.quintela/qemu/-/pipelines/1039214198

Please, apply.

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmUuReUACgkQ9IfvGFhy
# 1yO+FQ/+Nx2botbrUVJb3vLeG6f+x5xeWJjB0boOqhk7227cKmAA33Oqwx5l4UtL
# oLOHA6P4ThqacpaluGOMMp44BSr/jOMDC/HUDVJtSplTD+droPiklIIGUfYScLbA
# oYx6lXfSB2jMpSuSU19STbjwBRvd4bjJix3zDGwEIgXYqYt0tY0FY/nnGTmImnM1
# KDjRerf1lg4Rt0vvwg7I0onIDvh3CKX26Sj5a3wSRaLoocUe3jpsuBNH7MMqroHs
# WpocBIsLiBAf/CbeLZsQlhbVeOi1R+kSAR5hDPvvJCPWHIrd2wf8+3NXjcFepb7d
# M4wE2jLjCvHhzwYwSc0ir4n74jwD22IirEPQs8ONHrjLCb5VoBKYV5bqsFUHF55N
# SbFvcZIzJFiOm2anEWiiqiNTLtYAdQCKtUvbyJ7Mq4ck6icIInLdX9zrm4voofYJ
# 02lX/IIGlT3C3dGSz09LBoJ6E82zmQWNHmov8A90+3RYvMF9uSpxi0z40lhj6jWC
# 6Q2AHxrJJ040ZboeOfJQG78BtvZ/9PQ2ORhJ3ceRDND4kSTDtfe/TSNAZ3thM33y
# Sv99o+F/HaqrKnxK8eTJrvIEWxojDu3lnqJERWAm2AOxTnQ+6mgGtsCfLEdrv5D1
# xVsY2QczB1quRjaU2ml/7Cxe4Q1urTtfl82IEXGded6UL+cmF/I=
# =br93
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 17 Oct 2023 04:29:25 EDT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723

* tag 'migration-20231017-pull-request' of https://gitlab.com/juan.quintela/qemu: (38 commits)
migration/multifd: Clarify Error usage in multifd_channel_connect
migration/multifd: Unify multifd_send_thread error paths
migration/multifd: Remove direct "socket" references
migration/ram: Merge save_zero_page functions
migration/ram: Move xbzrle zero page handling into save_zero_page
migration/ram: Stop passing QEMUFile around in save_zero_page
migration/ram: Remove RAMState from xbzrle_cache_zero_page
migration/ram: Refactor precopy ram loading code
multifd: reset next_packet_len after sending pages
multifd: fix counters in multifd_send_thread
migration: check for rate_limit_max for RATE_LIMIT_DISABLED
migration: Improve json and formatting
migration/rdma: Remove all "ret" variables that are used only once
migration/rdma: Declare for index variables local
migration/rdma: Use i as for index instead of idx
migration/rdma: Check sooner if we are in postcopy for save_page()
migration/rdma: Remove qemu_ prefix from exported functions
migration/rdma: Move rdma constants from qemu-file.h to rdma.h
qemu-file: Remove QEMUFileHooks
migration/rdma: Create rdma_control_save_page()
...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

show more ...


# 967e3889 12-Oct-2023 Fabiano Rosas <farosas@suse.de>

migration/multifd: Clarify Error usage in multifd_channel_connect

The function is currently called from two sites, one always gives it a
NULL Error and the other always gives it a non-NULL Error.

I

migration/multifd: Clarify Error usage in multifd_channel_connect

The function is currently called from two sites, one always gives it a
NULL Error and the other always gives it a non-NULL Error.

In the non-NULL case, all it does it trace the error and return. One
of the callers already have tracing, add a tracepoint to the other and
stop passing the error into the function.

Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231012134343.23757-4-farosas@suse.de>

show more ...


# b1b38387 11-Oct-2023 Juan Quintela <quintela@redhat.com>

migration/rdma: Remove qemu_ prefix from exported functions

Functions are long enough even without this.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Si

migration/rdma: Remove qemu_ prefix from exported functions

Functions are long enough even without this.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-10-quintela@redhat.com>

show more ...


# 8b239597 10-Oct-2023 Peter Xu <peterx@redhat.com>

migration: Allow user to specify available switchover bandwidth

Migration bandwidth is a very important value to live migration. It's
because it's one of the major factors that we'll make decision

migration: Allow user to specify available switchover bandwidth

Migration bandwidth is a very important value to live migration. It's
because it's one of the major factors that we'll make decision on when to
switchover to destination in a precopy process.

This value is currently estimated by QEMU during the whole live migration
process by monitoring how fast we were sending the data. This can be the
most accurate bandwidth if in the ideal world, where we're always feeding
unlimited data to the migration channel, and then it'll be limited to the
bandwidth that is available.

However in reality it may be very different, e.g., over a 10Gbps network we
can see query-migrate showing migration bandwidth of only a few tens of
MB/s just because there are plenty of other things the migration thread
might be doing. For example, the migration thread can be busy scanning
zero pages, or it can be fetching dirty bitmap from other external dirty
sources (like vhost or KVM). It means we may not be pushing data as much
as possible to migration channel, so the bandwidth estimated from "how many
data we sent in the channel" can be dramatically inaccurate sometimes.

With that, the decision to switchover will be affected, by assuming that we
may not be able to switchover at all with such a low bandwidth, but in
reality we can.

The migration may not even converge at all with the downtime specified,
with that wrong estimation of bandwidth, keeping iterations forever with a
low estimation of bandwidth.

The issue is QEMU itself may not be able to avoid those uncertainties on
measuing the real "available migration bandwidth". At least not something
I can think of so far.

One way to fix this is when the user is fully aware of the available
bandwidth, then we can allow the user to help providing an accurate value.

For example, if the user has a dedicated channel of 10Gbps for migration
for this specific VM, the user can specify this bandwidth so QEMU can
always do the calculation based on this fact, trusting the user as long as
specified. It may not be the exact bandwidth when switching over (in which
case qemu will push migration data as fast as possible), but much better
than QEMU trying to wildly guess, especially when very wrong.

A new parameter "avail-switchover-bandwidth" is introduced just for this.
So when the user specified this parameter, instead of trusting the
estimated value from QEMU itself (based on the QEMUFile send speed), it
trusts the user more by using this value to decide when to switchover,
assuming that we'll have such bandwidth available then.

Note that specifying this value will not throttle the bandwidth for
switchover yet, so QEMU will always use the full bandwidth possible for
sending switchover data, assuming that should always be the most important
way to use the network at that time.

This can resolve issues like "unconvergence migration" which is caused by
hilarious low "migration bandwidth" detected for whatever reason.

Reported-by: Zhiyi Guo <zhguo@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231010221922.40638-1-peterx@redhat.com>

show more ...


12345678