| bb51f2fa | 14-Jan-2021 |
Daniel Henrique Barboza <danielhb413@gmail.com> |
spapr.h: fix trailing whitespace in phb_placement
This whitespace was messing with lots of diffs if you happen to use an editor that eliminates trailing whitespaces on file save.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210114180628.1675603-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
| 73598c75 | 08-Jan-2021 |
Greg Kurz <groug@kaod.org> |
spapr: Improve handling of memory unplug with old guests
Since commit 1e8b5b1aa16b ("spapr: Allow memory unplug to always succeed"), trying to unplug memory from a guest that doesn't support it (e.g. rhel6) no longer generates an error like it used to. Instead, it leaves the memory around: only a subsequent reboot or manual use of drmgr within the guest can complete the hot-unplug sequence. A flag was added to SpaprMachineClass so that this new behavior only applies to the default machine type.
We can do better. CAS processes all pending hot-unplug requests. This means that we don't really care about what the guest supports if the hot-unplug request happens before CAS.
All guests that we care for, even old ones, set enough bits in OV5 that lead to a non-empty bitmap in spapr->ov5_cas. Use that as a heuristic to decide whether CAS has already occurred.
Always accept unplug requests that happen before CAS since CAS will process them. Restore the previous behavior of rejecting them after CAS when we know that the guest doesn't support memory hot-unplug.
This behavior is suitable for all machine types, which allows dropping the pre_6_0_memory_unplug flag.
Fixes: 1e8b5b1aa16b ("spapr: Allow memory unplug to always succeed")
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <161012708715.801107.11418801796987916516.stgit@bahia.lan>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
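The decision described in this commit message can be sketched as a small stand-alone C model. This is only an illustration under stated assumptions: ToyMachine, TOY_OV5_HP_EVT, TOY_OV5_OTHER and both functions are hypothetical names, not the QEMU code, and the real spapr->ov5_cas is an option-vector object rather than a plain integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the OV5 bits negotiated at CAS. */
typedef struct {
    uint32_t ov5_cas;   /* stays zero until the guest has performed CAS */
} ToyMachine;

/* Hypothetical bit: guest supports hot-plug event sources. */
#define TOY_OV5_HP_EVT  (1u << 0)
/* Hypothetical bit: any other option an old guest would still set. */
#define TOY_OV5_OTHER   (1u << 1)

/* Heuristic: a non-empty bitmap means CAS has already happened. */
static bool toy_cas_done(const ToyMachine *m)
{
    assert(m != NULL);
    return m->ov5_cas != 0;
}

/* Accept before CAS; after CAS, accept only if the guest can handle it. */
static bool toy_memory_unplug_allowed(const ToyMachine *m)
{
    if (!toy_cas_done(m)) {
        return true;    /* CAS will process the pending request */
    }
    return (m->ov5_cas & TOY_OV5_HP_EVT) != 0;
}
```

In this model a machine with an empty bitmap accepts the request, one that negotiated only TOY_OV5_OTHER rejects it, and one that also negotiated TOY_OV5_HP_EVT accepts it.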
|
| babb819f | 18-Dec-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Introduce spapr_drc_reset_all()
No need to expose the way DRCs are traversed outside of spapr_drc.c.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201218103400.689660-4-groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
| 930ef3b5 | 18-Dec-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Fix reset of transient DR connectors
Documentation of object_property_iter_init() clearly stipulates that "it is forbidden to modify the property list while iterating". But this is exactly what we do when resetting transient DR connectors during CAS. The call to spapr_drc_reset() can finalize the hot-unplug sequence of a PHB or a PCI bridge, both of which will then in turn destroy their PCI DRCs. This could potentially invalidate the iterator. It is pure luck that this hasn't caused any issues so far.
Change spapr_drc_reset() to return true if it caused a device to be removed. Restart from scratch in this case. This can potentially increase the overall DRC reset time, especially with a high maxmem which generates a lot of LMB DRCs. But this kind of setup is rare, and so is the use case of rebooting a guest while doing hot-unplug.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201218103400.689660-3-groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
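The "restart from scratch" strategy can be illustrated with a toy model. This is a sketch under assumptions: ToyDrc and the toy_* functions are hypothetical, and a flat array stands in for the QOM property list, so nothing is actually invalidated here. Only the control flow (abandon the current pass as soon as a reset removes something) mirrors the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical connector: resetting it may destroy a dependent connector. */
typedef struct {
    bool present;        /* connector still exists */
    bool unplug_pending; /* reset will finalize a hot-unplug */
    int child;           /* index of a connector destroyed with it, or -1 */
} ToyDrc;

/* Returns true if the reset removed a device, i.e. the set was modified. */
static bool toy_drc_reset(ToyDrc *drcs, size_t i)
{
    if (!drcs[i].unplug_pending) {
        return false;
    }
    drcs[i].unplug_pending = false;
    if (drcs[i].child >= 0) {
        drcs[drcs[i].child].present = false;
    }
    return true;
}

/* Resets every connector; returns the number of passes that were needed. */
static int toy_drc_reset_all(ToyDrc *drcs, size_t n)
{
    int passes = 0;
    bool restart;

    assert(drcs != NULL);
    do {
        restart = false;
        passes++;
        for (size_t i = 0; i < n; i++) {
            if (drcs[i].present && toy_drc_reset(drcs, i)) {
                /* The set changed: stop iterating and start over. */
                restart = true;
                break;
            }
        }
    } while (restart);
    return passes;
}
```

Each restart clears at least one pending unplug, so the loop terminates. The cost is one extra pass per removed device, which matches the commit message's remark about longer reset times with many DRCs.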
|
| cd725bd7 | 18-Dec-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Call spapr_drc_reset() for all DRCs at CAS
Non-transient DRCs are either in the empty or the ready state, which means spapr_drc_reset() doesn't change their state. No checking is thus needed. Call spapr_drc_reset() unconditionally and squash spapr_drc_transient() into its only user, spapr_drc_needed().
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201218103400.689660-2-groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
| 30499fdd | 18-Dec-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Fix buffer overflow in spapr_numa_associativity_init()
Running a guest with 128 NUMA nodes crashes QEMU:
../../util/error.c:59: error_setv: Assertion `*errp == NULL' failed.
The crash happens when setting the FWNMI migration blocker:
    if (spapr_get_cap(spapr, SPAPR_CAP_FWNMI) == SPAPR_CAP_ON) {
        /* Create the error string for live migration blocker */
        error_setg(&spapr->fwnmi_migration_blocker,
            "A machine check is being handled during migration. The handler"
            "may run and log hardware error on the destination");
    }
Inspection reveals that spapr->fwnmi_migration_blocker isn't NULL:
    (gdb) p spapr->fwnmi_migration_blocker
    $1 = (Error *) 0x8000000004000000
Since this is the only place where spapr->fwnmi_migration_blocker is set, this means someone wrote there behind our back. Further analysis points to spapr_numa_associativity_init(), especially the part that initializes the associativity arrays for NVLink GPUs:
    max_nodes_with_gpus = nb_numa_nodes + NVGPU_MAX_NUM;
i.e. max_nodes_with_gpus = 128 + 6, but the array isn't sized to accommodate the 6 extra nodes:
    struct SpaprMachineState {
        . . .
        uint32_t numa_assoc_array[MAX_NODES][NUMA_ASSOC_SIZE];
        Error *fwnmi_migration_blocker;
    };
and the following loops happily overwrite spapr->fwnmi_migration_blocker, and probably more:
    for (i = nb_numa_nodes; i < max_nodes_with_gpus; i++) {
        spapr->numa_assoc_array[i][0] = cpu_to_be32(MAX_DISTANCE_REF_POINTS);
        for (j = 1; j < MAX_DISTANCE_REF_POINTS; j++) {
            uint32_t gpu_assoc = smc->pre_5_1_assoc_refpoints ?
                                 SPAPR_GPU_NUMA_ID : cpu_to_be32(i);
            spapr->numa_assoc_array[i][j] = gpu_assoc;
        }
        spapr->numa_assoc_array[i][MAX_DISTANCE_REF_POINTS] = cpu_to_be32(i);
    }
Fix the size of the array. This requires "hw/ppc/spapr.h" to see NVGPU_MAX_NUM. Including "hw/pci-host/spapr.h" introduces a circular dependency that breaks the build, so this moves the definition of NVGPU_MAX_NUM to "hw/ppc/spapr.h" instead.
Reported-by: Min Deng <mdeng@redhat.com>
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1908693
Fixes: dd7e1d7ae431 ("spapr_numa: move NVLink2 associativity handling to spapr_numa.c")
Cc: danielhb413@gmail.com
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160829960428.734871.12634150161215429514.stgit@bahia.lan>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
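The shape of the fix (size the array for the extra GPU nodes) can be sketched in isolation. Everything below is a hypothetical model: the TOY_* constants reuse the 128 and 6 quoted above, the other sizes are illustrative assumptions, and byte-swapping with cpu_to_be32() is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_NODES      128  /* value quoted in the commit message */
#define TOY_NVGPU_MAX_NUM  6    /* value quoted in the commit message */
#define TOY_REF_POINTS     4    /* illustrative assumption */
#define TOY_ASSOC_SIZE     (TOY_REF_POINTS + 1)

typedef struct {
    /* Sized for the NUMA nodes plus the extra GPU nodes. */
    uint32_t assoc[TOY_MAX_NODES + TOY_NVGPU_MAX_NUM][TOY_ASSOC_SIZE];
    void *blocker;  /* stands in for the field that was being overwritten */
} ToyMachine;

/* Fills the GPU rows that follow the nb_nodes regular NUMA rows. */
static void toy_assoc_init_gpus(ToyMachine *m, size_t nb_nodes)
{
    size_t max_with_gpus = nb_nodes + TOY_NVGPU_MAX_NUM;

    /* With the larger array, the last GPU row is still in bounds. */
    assert(nb_nodes <= TOY_MAX_NODES);
    assert(max_with_gpus <= sizeof(m->assoc) / sizeof(m->assoc[0]));

    for (size_t i = nb_nodes; i < max_with_gpus; i++) {
        m->assoc[i][0] = TOY_REF_POINTS;
        for (size_t j = 1; j < TOY_REF_POINTS; j++) {
            m->assoc[i][j] = (uint32_t)i;
        }
        m->assoc[i][TOY_REF_POINTS] = (uint32_t)i;
    }
}
```

With the first dimension left at TOY_MAX_NODES, a call with 128 nodes would write rows 128 to 133 past the end of the array and into whatever field follows it, which is the corruption the commit message describes.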
|
| 1e8b5b1a | 14-Dec-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Allow memory unplug to always succeed
It is currently impossible to hot-unplug a memory device between machine reset and CAS.
    (qemu) device_del dimm1
    Error: Memory hot unplug not supported for this guest
This limitation was introduced in order to provide an explicit error path for older guests that didn't support hot-plug event sources (and thus memory hot-unplug).
The Linux kernel has been supporting these since 4.11. All recent enough guests are thus capable of handling the removal of a memory device at all times, including during early boot.
Lift the limitation for the latest machine type. This means that trying to unplug memory from a guest that doesn't support it will likely just do nothing and the memory will only get removed at next reboot. Such older guests can still get the existing behavior by using an older machine type.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160794035064.23292.17560963281911312439.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
| f5598c92 | 20-Nov-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Make PHB placement functions and spapr_pre_plug_phb() return status
Read documentation in "qapi/error.h" and changelog of commit e3fe3988d785 ("error: Document Error API usage rules") for rationale.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201120234208.683521-7-groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
| ea042c53 | 20-Nov-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Do NVDIMM/PC-DIMM device hotplug sanity checks at pre-plug only
Pre-plug of a memory device, be it an NVDIMM or a PC-DIMM, ensures that the memory slot is available and that addresses don't overlap with existing memory regions. The corresponding DRCs in the LMB and PMEM namespaces are thus necessarily attachable at plug time.
Pass &error_abort to spapr_drc_attach() in spapr_add_lmbs() and spapr_add_nvdimm(). This greatly simplifies error handling on the plug path.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20201120234208.683521-3-groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
| a4e3a7c0 | 26-Oct-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Improve spapr_reallocate_hpt() error reporting
spapr_reallocate_hpt() has three users, two of which pass &error_fatal and the third one, htab_load(), passes &local_err, uses it to detect failures and simply propagates -EINVAL up to vmstate_load(), which will cause QEMU to exit. It is thus confusing that spapr_reallocate_hpt() doesn't return right away when an error is detected in some cases. Also, the comment suggesting that the caller is welcome to try to carry on seems like a remnant in this respect.
This can be improved:
- change spapr_reallocate_hpt() to always report a negative errno on failure, either as reported by KVM or -ENOSPC if the HPT is smaller than what was asked,
- use that to detect failures in htab_load(), which is preferred over checking &local_err,
- propagate this negative errno to vmstate_load() because it is more accurate than propagating -EINVAL for all possible errors.
[dwg: Fix compile error due to omitted prelim patch]
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160371605460.305923.5890143959901241157.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
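The error-reporting convention listed in this commit message can be sketched as follows. This is a hedged model, not the QEMU functions: toy_reallocate_hpt() and toy_htab_load() are hypothetical, and the backend_result parameter is an assumed stand-in for what KVM would report (a negative errno, or the shift that was actually allocated).

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical allocator wrapper: returns 0 on success or a negative errno.
 * backend_result models the backend: < 0 is an errno, otherwise the shift
 * that was really allocated.
 */
static int toy_reallocate_hpt(int requested_shift, int backend_result)
{
    assert(requested_shift > 0);

    if (backend_result < 0) {
        return backend_result;   /* pass the backend's errno through */
    }
    if (backend_result < requested_shift) {
        return -ENOSPC;          /* table smaller than what was asked */
    }
    return 0;
}

/* Caller checks the return value and propagates the precise errno. */
static int toy_htab_load(int requested_shift, int backend_result)
{
    int ret = toy_reallocate_hpt(requested_shift, backend_result);

    if (ret < 0) {
        return ret;              /* rather than a blanket -EINVAL */
    }
    return 0;
}
```

The caller only looks at the integer result, so it can tell an allocation that was too small apart from a backend failure without inspecting an error object.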
|
| 6e837f98 | 19-Oct-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Simplify error handling in spapr_memory_plug()
As recommended in "qapi/error.h", add a bool return value to spapr_add_lmbs() and spapr_add_nvdimm(), and use them instead of local_err in spapr_memory_plug().
This gets rid of the error propagation overhead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160309734178.2739814.3488437759887793902.stgit@bahia.lan>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
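The bool-return convention this commit message refers to can be sketched with a toy error type. This is an illustration under assumptions: ToyError is a simplified out-parameter (QEMU's real API uses Error **errp), and the toy_* functions and their failure condition are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical error object; NULL msg means "no error". */
typedef struct {
    const char *msg;
} ToyError;

/* Returns true on success; on failure sets *errp (if given) and returns false. */
static bool toy_add_lmbs(int nr_lmbs, ToyError *errp)
{
    if (nr_lmbs <= 0) {
        if (errp) {
            errp->msg = "invalid number of LMBs";
        }
        return false;
    }
    return true;
}

/*
 * The caller tests the return value directly, so it needs no local error
 * variable and no separate propagation step.
 */
static bool toy_memory_plug(int nr_lmbs, ToyError *errp)
{
    if (!toy_add_lmbs(nr_lmbs, errp)) {
        return false;
    }
    return true;
}
```

Because the callee writes straight into the caller's error object, the intermediate function forwards errp unchanged and decides purely on the boolean.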
|
| 29bfe52a | 07-Oct-2020 |
Daniel Henrique Barboza <danielhb413@gmail.com> |
spapr: add spapr_machine_using_legacy_numa() helper
The upcoming changes to NUMA support are all guest visible. In theory we could just create a new 5_1 class option flag to keep the changes from cascading to 5.1 and older. The reality is that these changes are only relevant if the machine has more than one NUMA node. There is no need to needlessly change guest behavior that has been around for years.
This new helper will be used by the next patches to determine whether we should retain the (soon to be) legacy NUMA behavior in the pSeries machine. The new behavior will only be exposed if:
- machine is pseries-5.2 and newer;
- more than one NUMA node is declared in NUMA state.
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20201007172849.302240-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
| 35dce34f | 14-Sep-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Add a return value to spapr_check_pagesize()
As recommended in "qapi/error.h", return true on success and false on failure. This reduces error propagation overhead in the callers.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20200914123505.612812-14-groug@kaod.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
| 451c6905 | 14-Sep-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Add a return value to spapr_nvdimm_validate()
As recommended in "qapi/error.h", return true on success and false on failure. This reduces error propagation overhead in the callers.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20200914123505.612812-13-groug@kaod.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
| cfdc5274 | 14-Sep-2020 |
Greg Kurz <groug@kaod.org> |
spapr: Add a return value to spapr_set_vcpu_id()
As recommended in "qapi/error.h", return true on success and false on failure. This reduces error propagation overhead in the callers.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20200914123505.612812-11-groug@kaod.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|