8d216d8c | 24-Jul-2019 |
Greg Kurz <groug@kaod.org> |
xics/kvm: Fix fallback to emulated XICS
Commit 4812f2615288 tried to fix the rollback path of xics_kvm_connect(), but that isn't enough. If we fail to create the KVM device, the guest fails to boot later on with:
[ 0.010817] pci 0000:00:00.0: Adding to iommu group 0
[ 0.010863] irq: unknown-1 didn't like hwirq-0x1200 to VIRQ17 mapping (rc=-22)
[ 0.010923] pci 0000:00:01.0: Adding to iommu group 0
[ 0.010968] irq: unknown-1 didn't like hwirq-0x1201 to VIRQ17 mapping (rc=-22)
[ 0.011543] EEH: No capable adapters found
[ 0.011597] irq: unknown-1 didn't like hwirq-0x1000 to VIRQ17 mapping (rc=-22)
[ 0.011651] audit: type=2000 audit(1563977526.000:1): state=initialized audit_enabled=0 res=1
[ 0.011703] ------------[ cut here ]------------
[ 0.011729] event-sources: Unable to allocate interrupt number for /event-sources/epow-events
[ 0.011776] WARNING: CPU: 0 PID: 1 at arch/powerpc/platforms/pseries/event_sources.c:34 request_event_sources_irqs+0xbc/0x150
[ 0.011828] Modules linked in:
[ 0.011850] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.1.17-300.fc30.ppc64le #1
[ 0.011886] NIP: c0000000000d4fac LR: c0000000000d4fa8 CTR: c0000000018f0000
[ 0.011923] REGS: c00000001e4c38d0 TRAP: 0700 Not tainted (5.1.17-300.fc30.ppc64le)
[ 0.011966] MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 28000284 XER: 20040000
[ 0.012012] CFAR: c00000000011b42c IRQMASK: 0
[ 0.012012] GPR00: c0000000000d4fa8 c00000001e4c3b60 c0000000015fc400 0000000000000051
[ 0.012012] GPR04: 0000000000000001 0000000000000000 0000000000000081 772d6576656e7473
[ 0.012012] GPR08: 000000001edf0000 c0000000014d4830 c0000000014d4830 6e6576652f20726f
[ 0.012012] GPR12: 0000000000000000 c0000000018f0000 c000000000010bf0 0000000000000000
[ 0.012012] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.012012] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.012012] GPR24: 0000000000000000 0000000000000000 c000000000ebbf00 c0000000000d5570
[ 0.012012] GPR28: c000000000ebc008 c00000001fff8248 0000000000000000 0000000000000000
[ 0.012372] NIP [c0000000000d4fac] request_event_sources_irqs+0xbc/0x150
[ 0.012409] LR [c0000000000d4fa8] request_event_sources_irqs+0xb8/0x150
[ 0.012445] Call Trace:
[ 0.012462] [c00000001e4c3b60] [c0000000000d4fa8] request_event_sources_irqs+0xb8/0x150 (unreliable)
[ 0.012513] [c00000001e4c3bf0] [c000000001042848] __machine_initcall_pseries_init_ras_IRQ+0xc8/0xf8
[ 0.012563] [c00000001e4c3c20] [c000000000010810] do_one_initcall+0x60/0x254
[ 0.012611] [c00000001e4c3cf0] [c000000001024538] kernel_init_freeable+0x35c/0x444
[ 0.012655] [c00000001e4c3db0] [c000000000010c14] kernel_init+0x2c/0x148
[ 0.012693] [c00000001e4c3e20] [c00000000000bdc4] ret_from_kernel_thread+0x5c/0x78
[ 0.012736] Instruction dump:
[ 0.012759] 38a00000 7c7f1b78 7f64db78 2c1f0000 2fbf0000 78630020 4180002c 409effa8
[ 0.012805] 7fa4eb78 7f43d378 48046421 60000000 <0fe00000> 3bde0001 2c1e0010 7fde07b4
[ 0.012851] ---[ end trace aa5785707323fad3 ]---
This happens because QEMU fell back on XICS emulation but didn't unregister the RTAS calls from KVM. The emulated RTAS calls are hence never called and the KVM ones return an error to the guest since the KVM device is absent.
The sanity checks in xics_kvm_disconnect() are abusive since we're freeing the KVM device. Simply drop them.
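The failure mode described above can be illustrated with a small model. This is only a sketch under stated assumptions: struct fake_kvm and the sketch_* functions are hypothetical stand-ins, not QEMU's xics_kvm_connect()/xics_kvm_disconnect() or any KVM API. It shows why a disconnect routine without sanity checks can double as the rollback for a partially completed connect.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the in-kernel state; not QEMU's real data. */
struct fake_kvm {
    bool rtas_registered;  /* KVM handles the XICS RTAS calls itself */
    int  device_fd;        /* -1 when no KVM XICS device exists */
};

/* Connect: register the RTAS calls, then try to create the device. */
static int sketch_connect(struct fake_kvm *k, bool device_creation_fails)
{
    k->rtas_registered = true;
    if (device_creation_fails) {
        return -1;              /* partial state left behind on purpose */
    }
    k->device_fd = 42;          /* arbitrary placeholder descriptor */
    return 0;
}

/* Disconnect: no sanity checks, so it can also undo a partial connect. */
static void sketch_disconnect(struct fake_kvm *k)
{
    k->rtas_registered = false; /* emulated RTAS calls take over again */
    k->device_fd = -1;
}

/* Returns true when the in-kernel device ends up in use. */
static bool sketch_xics_init(struct fake_kvm *k, bool device_creation_fails)
{
    if (sketch_connect(k, device_creation_fails) < 0) {
        sketch_disconnect(k);   /* rollback before falling back to emulation */
        return false;
    }
    return true;
}
```

After a failed sketch_xics_init() the model has no RTAS calls left registered with the (fake) kernel, which is the property the fix restores.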
Fixes: 4812f2615288 "xics/kvm: Add proper rollback to xics_kvm_init()"
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156398744035.546975.7029414194633598474.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
38298611 | 03-Jul-2019 |
Greg Kurz <groug@kaod.org> |
xics/kvm: Always set the MASKED bit if interrupt is masked
The ics_set_kvm_state_one() function is called either to restore the state of an interrupt source during migration or to set the interrupt source to a default state during reset.
Ever since it was written in 2013, the code has only set the MASKED bit if the 'current priority' and the 'saved priority' are different. This is likely true when restoring an interrupt that had been previously masked with the ibm,int-off RTAS call. However, it is always false in the case of reset, since both 'current priority' and 'saved priority' are equal to 0xff, and so the MASKED bit is never set.
The legacy KVM XICS device gets away with that because it ends up updating its internal structure the same way, whether the MASKED bit is set or the priority is 0xff.
The XICS-on-XIVE device for POWER9 is different. It sticks to the KVM documentation [1] and _really_ relies on the MASKED bit being correctly set. If it is not, the device will configure the interrupt source in the XIVE HW, even though the guest hasn't configured the interrupt yet. This disturbs the complex logic implemented in XICS-on-XIVE and may result in the loss of subsequent queued events.
Always set the MASKED bit if interrupt is masked as expected by the KVM XICS-on-XIVE device. This has no impact on the legacy KVM XICS.
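The difference between the old and new conditions can be sketched as follows. The struct, the function names and the bit positions (priority at bit 32, MASKED at bit 41) are assumptions made for illustration; they are not copied from QEMU or the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the 64-bit source state, for illustration only. */
#define SKETCH_PRIORITY_SHIFT 32
#define SKETCH_MASKED         (1ULL << 41)

struct sketch_irq {
    uint32_t server;
    uint8_t  priority;        /* current priority, 0xff means masked */
    uint8_t  saved_priority;  /* priority to restore on unmask */
};

static uint64_t sketch_base_state(const struct sketch_irq *irq)
{
    return (uint64_t)irq->server |
           ((uint64_t)irq->saved_priority << SKETCH_PRIORITY_SHIFT);
}

/* Old condition: MASKED only when the two priorities differ. */
static uint64_t sketch_old_state(const struct sketch_irq *irq)
{
    uint64_t state = sketch_base_state(irq);

    if (irq->priority != irq->saved_priority) {
        state |= SKETCH_MASKED;
    }
    return state;
}

/* New condition: MASKED whenever the interrupt is masked. */
static uint64_t sketch_new_state(const struct sketch_irq *irq)
{
    uint64_t state = sketch_base_state(irq);

    if (irq->priority == 0xff) {
        state |= SKETCH_MASKED;
    }
    return state;
}
```

With a reset source (both priorities 0xff), sketch_old_state() leaves MASKED clear while sketch_new_state() sets it; for a source masked through ibm,int-off (current 0xff, saved something else) both agree.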
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/virtual/kvm/devices/xics.txt
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156217454083.559957.7359208229523652842.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
03f990a5 | 21-Jun-2019 |
Li Qiang <liq3ea@163.com> |
ioapic: use irq number instead of vector in ioapic_eoi_broadcast
When emulating the irqchip in QEMU, for example with the following command:
x86_64-softmmu/qemu-system-x86_64 -m 1024 -smp 4 -hda /home/test/test.img -machine kernel-irqchip=off --enable-kvm -vnc :0 -device edu -monitor stdio
we get a crash with the following ASan output:
(qemu) /home/test/qemu5/qemu/hw/intc/ioapic.c:266:27: runtime error: index 35 out of bounds for type 'int [24]'
=================================================================
==113504==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61b000003114 at pc 0x5579e3c7a80f bp 0x7fd004bf8c10 sp 0x7fd004bf8c00
WRITE of size 4 at 0x61b000003114 thread T4
#0 0x5579e3c7a80e in ioapic_eoi_broadcast /home/test/qemu5/qemu/hw/intc/ioapic.c:266
#1 0x5579e3c6f480 in apic_eoi /home/test/qemu5/qemu/hw/intc/apic.c:428
#2 0x5579e3c720a7 in apic_mem_write /home/test/qemu5/qemu/hw/intc/apic.c:802
#3 0x5579e3b1e31a in memory_region_write_accessor /home/test/qemu5/qemu/memory.c:503
#4 0x5579e3b1e6a2 in access_with_adjusted_size /home/test/qemu5/qemu/memory.c:569
#5 0x5579e3b28d77 in memory_region_dispatch_write /home/test/qemu5/qemu/memory.c:1497
#6 0x5579e3a1b36b in flatview_write_continue /home/test/qemu5/qemu/exec.c:3323
#7 0x5579e3a1b633 in flatview_write /home/test/qemu5/qemu/exec.c:3362
#8 0x5579e3a1bcb1 in address_space_write /home/test/qemu5/qemu/exec.c:3452
#9 0x5579e3a1bd03 in address_space_rw /home/test/qemu5/qemu/exec.c:3463
#10 0x5579e3b8b979 in kvm_cpu_exec /home/test/qemu5/qemu/accel/kvm/kvm-all.c:2045
#11 0x5579e3ae4499 in qemu_kvm_cpu_thread_fn /home/test/qemu5/qemu/cpus.c:1287
#12 0x5579e4cbdb9f in qemu_thread_start util/qemu-thread-posix.c:502
#13 0x7fd0146376da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
#14 0x7fd01436088e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x12188e
This is because the ioapic_eoi_broadcast() function uses 'vector' to index 's->irq_eoi'. To fix this, we should use the irq number instead.
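The indexing issue can be sketched in isolation. The structure and names below are hypothetical and much simplified; only the 24-entry array size and the vector value 35 come from the sanitizer report. The point is that the counter array has one slot per pin, so it must be indexed by the pin (irq) number found while scanning the redirection table, never by the vector.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_PINS    24      /* matches 'int [24]' in the report */
#define SKETCH_VECTOR_MASK 0xffu   /* assumed: vector in the low 8 bits */

/* Hypothetical, simplified IOAPIC state. */
struct sketch_ioapic {
    uint64_t ioredtbl[SKETCH_NUM_PINS];  /* one redirection entry per pin */
    int      irq_eoi[SKETCH_NUM_PINS];   /* one EOI counter per pin */
};

/* Count an EOI on every pin routed to 'vector'; returns the number of hits. */
static int sketch_eoi_broadcast(struct sketch_ioapic *s, unsigned vector)
{
    int hits = 0;

    for (int n = 0; n < SKETCH_NUM_PINS; n++) {
        if ((s->ioredtbl[n] & SKETCH_VECTOR_MASK) != vector) {
            continue;
        }
        /* Index by pin number n, which is always in bounds.
         * Indexing by 'vector' (up to 255) is the overflow in the report. */
        s->irq_eoi[n]++;
        hits++;
    }
    return hits;
}
```

For example, with pin 11 routed to vector 35, an EOI for vector 35 increments irq_eoi[11] and touches nothing past the end of the array.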
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190622002119.126834-1-liq3ea@163.com>
|
be32116e | 04-Jul-2019 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: v8M: Check state of exception being returned from
In v8M, an attempt to return from an exception which is not active is an illegal exception return. For this purpose, exceptions which can configurably target either Secure or NonSecure are not considered to be active if they are configured for the opposite security state to the one we're trying to return from (e.g. an attempt to return from an NS NMI when NMI targets Secure). In the pseudocode this is handled by IsActiveForState().
Detect this case rather than counting an active exception possibly of the wrong security state as being sufficient.
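A minimal sketch of the check, assuming a much simplified model: only exceptions with a configurable target are represented, and struct sketch_exc and the function name are hypothetical rather than QEMU's NVIC code or the architecture pseudocode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an exception with a configurable target state. */
struct sketch_exc {
    bool active;          /* exception is currently active */
    bool targets_secure;  /* configured target security state */
};

/* Active "for state": active AND configured for the state being returned from. */
static bool sketch_is_active_for_state(const struct sketch_exc *e,
                                       bool returning_from_secure)
{
    return e->active && e->targets_secure == returning_from_secure;
}
```

With an active NMI that targets Secure, sketch_is_active_for_state() is false for a return from NonSecure, so that return would be treated as illegal even though "some" NMI is active.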
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190617175317.27557-4-peter.maydell@linaro.org
|
1c3d4a8f | 01-Jul-2019 |
Greg Kurz <groug@kaod.org> |
spapr/xive: Add proper rollback to kvmppc_xive_connect()
Make kvmppc_xive_disconnect() able to undo the changes of a partial execution of kvmppc_xive_connect() and use it to perform rollback.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <156198735673.293938.7313195993600841641.stgit@bahia>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
aaa45030 | 30-Jun-2019 |
Cédric Le Goater <clg@kaod.org> |
ppc/xive: Fix TM_PULL_POOL_CTX special operation
When a CPU is reset, the hypervisor (Linux or OPAL) invalidates the POOL interrupt context of a CPU with this special command. It returns the POOL CAM line value and resets the VP bit.
Fixes: 4836b45510aa ("ppc/xive: activate HV support")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190630204601.30574-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
0df68c7e | 30-Jun-2019 |
Cédric Le Goater <clg@kaod.org> |
ppc/pnv: Rework cache watch model of PnvXIVE
When the software modifies the XIVE internal structures (ESB, EAS, END, NVT), it must also update the caches of the different XIVE sub-engines. HW offers a set of common interfaces for this purpose.
The CWATCH_SPEC register defines the block/index of the target and a set of flags to perform a full update and to watch for update conflicts.
The cache watch CWATCH_DATAX registers are then loaded with the target data with a first read on CWATCH_DATA0. Writing back is done in the opposite order, CWATCH_DATA0 triggering the update.
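The read and write ordering can be sketched with a toy model. Everything here is an assumption for illustration: the struct, the function names and the count of four data words are hypothetical and do not reflect the PnvXIVE register layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NWORDS 4  /* assumed number of CWATCH_DATAx registers */

/* Hypothetical cache watch engine. */
struct sketch_cwatch {
    uint64_t  data[SKETCH_NWORDS];  /* CWATCH_DATA0..3 */
    uint64_t *target;               /* entry in RAM selected via CWATCH_SPEC */
};

/* A read of DATA0 loads the whole entry; other reads return latched words. */
static uint64_t sketch_cwatch_read(struct sketch_cwatch *cw, int reg)
{
    if (reg == 0) {
        memcpy(cw->data, cw->target, sizeof(cw->data));
    }
    return cw->data[reg];
}

/* Writes are latched; the write to DATA0 (done last) triggers the update. */
static void sketch_cwatch_write(struct sketch_cwatch *cw, int reg, uint64_t val)
{
    cw->data[reg] = val;
    if (reg == 0) {
        memcpy(cw->target, cw->data, sizeof(cw->data));
    }
}
```

Software following this protocol reads DATA0 first and then DATA1..3, and writes DATA3 down to DATA0, so the target entry only changes once all words are in place.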
The SCRUB_TRIG registers are used to flush the cache in RAM, and to possibly invalidate it. Cache disablement is also an option but as we do not model the cache, these registers are no-ops.
Today, the modeling of these registers is incorrect, but this has not affected the setup of a bare-metal system. Running KVM, however, requires a rework.
Fixes: 2dfa91a2aa5a ("ppc/pnv: add a XIVE interrupt controller model for POWER9")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190630204601.30574-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
8256870a | 30-Jun-2019 |
Cédric Le Goater <clg@kaod.org> |
ppc/xive: Make the PIPR register readonly
When the hypervisor (KVM) dispatches a vCPU on a HW thread, it restores its thread interrupt context. The Pending Interrupt Priority Register (PIPR) is computed from the Interrupt Pending Buffer (IPB) and stores should not be allowed to change its value.
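A minimal sketch of a derived, read-only PIPR. It assumes that priority 0 is the most favored and is recorded in the most significant bit of an 8-bit IPB, and that 0xff means nothing is pending; the struct, enum and function names are hypothetical, not QEMU's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical slice of a thread interrupt context. */
struct sketch_tctx {
    uint8_t ipb;   /* Interrupt Pending Buffer, one bit per priority */
    uint8_t pipr;  /* Pending Interrupt Priority Register (derived) */
};

enum sketch_reg { SKETCH_REG_IPB, SKETCH_REG_PIPR };

/* Most favored pending priority, or 0xff when the IPB is empty. */
static uint8_t sketch_ipb_to_pipr(uint8_t ipb)
{
    for (uint8_t prio = 0; prio < 8; prio++) {
        if (ipb & (0x80u >> prio)) {
            return prio;
        }
    }
    return 0xff;
}

/* Store handler: IPB stores recompute PIPR, PIPR stores are dropped. */
static void sketch_store(struct sketch_tctx *t, enum sketch_reg reg, uint8_t val)
{
    switch (reg) {
    case SKETCH_REG_IPB:
        t->ipb = val;
        t->pipr = sketch_ipb_to_pipr(val);
        break;
    case SKETCH_REG_PIPR:
        break;  /* read-only: the value always follows the IPB */
    }
}
```

In this model, restoring a context by storing the IPB is enough; a later store to PIPR cannot make it disagree with the IPB.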
Fixes: 207d9fe98510 ("ppc/xive: introduce the XIVE interrupt thread context")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190630204601.30574-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
fe9a9d52 | 30-Jun-2019 |
Cédric Le Goater <clg@kaod.org> |
ppc/xive: Force the Physical CAM line value to group mode
When an interrupt needs to be delivered, the XIVE interrupt controller presenter scans the CAM lines of the thread interrupt contexts of the HW threads of the chip to find a matching vCPU. The interrupt context is composed of 4 different sets of registers: Physical, HV, OS and User.
The encoding of the Physical CAM line depends on the mode in which the interrupt controller is operating: CAM mode or block group mode. Since block group mode is the default configuration today on POWER9 and the only one available on the next POWER10 generation, enforce this encoding in the Physical CAM line:
chip << 19 | 0000000 0 0001 thread (7Bit)
It fits the overall encoding of the NVT ids and simplifies the matching algorithm in the presenter.
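The encoding above can be written out as a small helper. This is a sketch that reads the layout literally (chip id shifted left by 19, a single set bit just above a 7-bit thread id); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Physical CAM line in block group mode:
 *   chip << 19 | 0000000 0 0001 | thread (7 bits)
 * i.e. bit 7 is set and the thread id occupies bits 6..0. */
static uint32_t sketch_phys_cam_line(uint8_t chip_id, uint8_t thread_id)
{
    assert(thread_id < 128);  /* the thread field is 7 bits wide */
    return ((uint32_t)chip_id << 19) | (1u << 7) | (thread_id & 0x7fu);
}
```

For instance, chip 0 thread 0 gives 0x80, and chip 1 thread 3 gives 0x80083.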
Fixes: d514c48d41fb ("ppc/xive: hardwire the Physical CAM line of the thread context")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190630204601.30574-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
981b1c62 | 14-Jun-2019 |
Cédric Le Goater <clg@kaod.org> |
spapr/xive: rework the mapping the KVM memory regions
Today, the interrupt device is fully initialized at reset when the CAS negotiation process has completed. Depending on the KVM capabilities, the SpaprXive memory regions (ESB, TIMA) are initialized with a host MMIO backend or a QEMU emulated backend. This results in a complex initialization sequence partially done at realize and later at reset, and some memory region leaks.
To simplify this sequence and to remove the late initialization of the emulated device, which only needs to be done once, we introduce new memory regions specific to KVM. These regions are mapped as overlaps on top of the emulated device to make use of the host MMIOs. Also provide proper cleanup of these regions when the XIVE KVM device is destroyed, to fix the leaks.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190614165920.12670-2-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
4812f261 | 17-Jun-2019 |
Greg Kurz <groug@kaod.org> |
xics/kvm: Add proper rollback to xics_kvm_init()
Make xics_kvm_disconnect() able to undo the changes of a partial execution of xics_kvm_connect() and use it to perform rollback.
Note that kvmppc_define_rtas_kernel_token(0) never fails, whether or not the RTAS call has been defined.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077922319.433243.609897156640506891.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
330a21e3 | 17-Jun-2019 |
Greg Kurz <groug@kaod.org> |
xics/kvm: Add error propagation to ic*_set_kvm_state() functions
This allows errors happening there to be propagated up to spapr_irq, just like XIVE already does.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077921763.433243.4614327010172954196.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
ab3d15fa | 17-Jun-2019 |
Greg Kurz <groug@kaod.org> |
xics/kvm: Always use local_err in xics_kvm_init()
Passing both errp and &local_err to functions is a recipe for messing things up.
Since we must use &local_err for icp_kvm_realize(), use &local_err everywhere rollback must happen, and have a single call to error_propagate() for them all. While here, add errno to the error message.
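The shape of that pattern can be sketched without QEMU's headers. SketchError, sketch_step() and sketch_propagate() below are hypothetical stand-ins for the Error type, the fallible callees and error_propagate(); the sketch only illustrates collecting every failure in one local variable and handing it to the caller at a single exit point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal error object. */
typedef struct { const char *msg; } SketchError;

static SketchError sketch_step_failed = { "step failed" };
static int sketch_rollbacks;  /* counts how often rollback ran */

/* A fallible step that reports failure through its errp argument. */
static void sketch_step(bool fail, SketchError **errp)
{
    if (fail && errp) {
        *errp = &sketch_step_failed;
    }
}

/* Hand a local error over to the caller, if the caller wants it. */
static void sketch_propagate(SketchError **dst, SketchError *local)
{
    if (dst && local) {
        *dst = local;
    }
}

static int sketch_kvm_init(bool fail_first, bool fail_second, SketchError **errp)
{
    SketchError *local_err = NULL;

    sketch_step(fail_first, &local_err);   /* always &local_err, never errp */
    if (local_err) {
        goto fail;
    }
    sketch_step(fail_second, &local_err);
    if (local_err) {
        goto fail;
    }
    return 0;

fail:
    sketch_rollbacks++;                    /* single rollback site */
    sketch_propagate(errp, local_err);     /* single propagation site */
    return -1;
}
```

Because no callee ever receives errp directly, the function can always tell whether a step failed, even when the caller passed NULL.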
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077921212.433243.11716701611944816815.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
64fb9621 | 17-Jun-2019 |
Greg Kurz <groug@kaod.org> |
xics/kvm: Skip rollback when KVM XICS is absent
There is no need to roll back anything at this point, so just return an error.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077920657.433243.13541093940589972734.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|