Nested PAPR API (aka KVM on PowerVM)
====================================

This API aims to enable nested virtualization with KVM on PowerVM.
Existing support for nested KVM on PowerNV was introduced with the
cap-nested-hv option. Enabling nested KVM on papr/pseries involves a
slight design change, so a new cap-nested-papr option is added, e.g.:

  qemu-system-ppc64 -cpu POWER10 -machine pseries,cap-nested-papr=true ...

Work by:
    Michael Neuling <mikey@neuling.org>
    Vaibhav Jain <vaibhav@linux.ibm.com>
    Jordan Niethe <jniethe5@gmail.com>
    Harsh Prateek Bora <harshpb@linux.ibm.com>
    Shivaprasad G Bhat <sbhat@linux.ibm.com>
    Kautuk Consul <kconsul@linux.vnet.ibm.com>

The following is taken from the kernel documentation:

Introduction
============

This document explains how a guest operating system can act as a
hypervisor and run nested guests through the use of hypercalls, if the
hypervisor has implemented them. The terms L0, L1, and L2 are used to
refer to different software entities. L0 is the hypervisor mode entity
that would normally be called the "host" or "hypervisor". L1 is a
guest virtual machine that is directly run under L0 and is initiated
and controlled by L0. L2 is a guest virtual machine that is initiated
and controlled by L1 acting as a hypervisor. A significant design change
with respect to the existing API is that the entire L2 state is now
maintained within L0.

Existing Nested-HV API
======================

Linux/KVM has had support for nesting as an L0 or L1 since 2018.

The L0 code was added::

   commit 8e3f5fc1045dc49fd175b978c5457f5f51e7a2ce
   Author: Paul Mackerras <paulus@ozlabs.org>
   Date:   Mon Oct 8 16:31:03 2018 +1100
   KVM: PPC: Book3S HV: Framework and hcall stubs for nested virtualization

The L1 code was added::

   commit 360cae313702cdd0b90f82c261a8302fecef030a
   Author: Paul Mackerras <paulus@ozlabs.org>
   Date:   Mon Oct 8 16:31:04 2018 +1100
   KVM: PPC: Book3S HV: Nested guest entry via hypercall

This API works primarily using a single hcall, h_enter_nested(). This
call is made by the L1 to tell the L0 to start an L2 vCPU with the given
state. The L0 then starts this L2 and runs until an L2 exit condition
is reached. Once the L2 exits, the state of the L2 is given back to
the L1 by the L0. The full L2 vCPU state is always transferred from
and to L1 when the L2 is run. The L0 doesn't keep any state on the L2
vCPU (except in the short sequence in the L0 on L1 -> L2 entry and L2
-> L1 exit).

The only state kept by the L0 is the partition table. The L1 registers
its partition table using the h_set_partition_table() hcall. All
other state held by the L0 about the L2s is cached state (such as
shadow page tables).

The L1 may run any L2 or vCPU without first informing the L0. It
simply starts the vCPU using h_enter_nested(). The creation of L2s and
vCPUs is done implicitly whenever h_enter_nested() is called.

In this document, we call this existing API the v1 API.
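
The v1 ownership model described above can be sketched as a small toy
simulation. This is an illustration only, not QEMU or kernel code: the
class ToyL0V1, the dictionary-based vCPU state, and the "toy-exit" exit
reason are all hypothetical, and the real hcalls take different
arguments. The sketch assumes only what the text states: the L1 passes
the full vCPU state in on every run, gets it back on exit, and the L0
retains nothing but the partition table.

```python
# Toy model of the v1 (nested-HV) flow. Hypothetical names throughout;
# this does not reflect the real hcall arguments or return values.

class ToyL0V1:
    """L0 that keeps only the partition table between L2 runs."""

    def __init__(self):
        # The only non-cached state the L0 holds about L2s.
        self.partition_table = None

    def h_set_partition_table(self, table_addr):
        self.partition_table = table_addr

    def h_enter_nested(self, vcpu_state):
        # The full vCPU state arrives from the L1 on every entry.
        running = dict(vcpu_state)
        # Stand-in for "run until an L2 exit condition is reached".
        running["pc"] += 4
        # The full state is handed back to the L1; nothing is retained.
        return "toy-exit", running


l0 = ToyL0V1()
l0.h_set_partition_table(0x10000)

# The L1 owns the L2 vCPU state. No explicit create call precedes the
# first run: the L2 and its vCPU exist implicitly once this is called.
l2_vcpu = {"pc": 0x100, "gpr3": 0}
reason, l2_vcpu = l0.h_enter_nested(l2_vcpu)
```

Note that ToyL0V1 has no per-guest or per-vCPU storage at all, which is
the property the v2 API changes.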

New PAPR API
============

The new PAPR API differs from the v1 API in that the creation of an L2
and its associated vCPUs is explicit. In this document, we call this the
v2 API.

h_enter_nested() is replaced with H_GUEST_RUN_VCPU(). Before this can
be called, the L1 must explicitly create the L2 using H_GUEST_CREATE()
and create any associated vCPUs with H_GUEST_CREATE_VCPU(). Getting
and setting vCPU state can also be performed using the
H_GUEST_{G,S}ET() hcalls.

The basic execution flow for an L1 to create an L2, run it, and
delete it is:

- L1 and L0 negotiate capabilities with H_GUEST_{G,S}ET_CAPABILITIES()
  (normally at L1 boot time).

- L1 requests the L0 to create an L2 with H_GUEST_CREATE() and receives a token

- L1 requests the L0 to create an L2 vCPU with H_GUEST_CREATE_VCPU()

- L1 and L0 communicate the vCPU state using the H_GUEST_{G,S}ET() hcall

- L1 requests the L0 to run the vCPU using H_GUEST_RUN_VCPU() hcall

- L1 deletes L2 with H_GUEST_DELETE()
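
The call sequence listed above can be sketched as a toy simulation in
which the L0 owns all L2 state. This is an illustration only, not QEMU
or kernel code: the class ToyL0V2, the integer token, the
dictionary-based state, the "toy-cap" capability, and the "toy-exit"
exit reason are all hypothetical. The method names simply mirror the
hcall names used in this document, and the real hcalls take different
arguments.

```python
# Toy model of the v2 (nested PAPR) flow. Hypothetical names and data
# layout; only the ordering of the calls follows the document.

class ToyL0V2:
    """L0 that holds every L2's state, keyed by a guest token."""

    def __init__(self):
        self.l1_caps = None
        self.guests = {}      # guest token -> {vcpu id -> state dict}
        self.next_token = 1

    def h_guest_get_capabilities(self):
        return {"toy-cap"}    # placeholder capability set

    def h_guest_set_capabilities(self, caps):
        self.l1_caps = set(caps)

    def h_guest_create(self):
        token = self.next_token
        self.next_token += 1
        self.guests[token] = {}
        return token

    def h_guest_create_vcpu(self, token, vcpu_id):
        self.guests[token][vcpu_id] = {"pc": 0}

    def h_guest_set(self, token, vcpu_id, values):
        self.guests[token][vcpu_id].update(values)

    def h_guest_get(self, token, vcpu_id, names):
        state = self.guests[token][vcpu_id]
        return {name: state[name] for name in names}

    def h_guest_run_vcpu(self, token, vcpu_id):
        # Raises KeyError unless the guest and vCPU were created first.
        state = self.guests[token][vcpu_id]
        state["pc"] += 4      # stand-in for running until an L2 exit
        return "toy-exit"

    def h_guest_delete(self, token):
        del self.guests[token]


l0 = ToyL0V2()
l0.h_guest_set_capabilities(l0.h_guest_get_capabilities())
token = l0.h_guest_create()
l0.h_guest_create_vcpu(token, 0)
l0.h_guest_set(token, 0, {"pc": 0x100})
exit_reason = l0.h_guest_run_vcpu(token, 0)
pc_after_run = l0.h_guest_get(token, 0, ["pc"])["pc"]
l0.h_guest_delete(token)
```

In this sketch the L1 never holds the vCPU state itself. It only reads
or writes selected values through the get and set calls, and a run
request for a guest or vCPU that was never created fails.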

For more details, please refer to:

[1] Linux Kernel documentation (upstream documentation commit):

commit 476652297f94a2e5e5ef29e734b0da37ade94110
Author: Michael Neuling <mikey@neuling.org>
Date:   Thu Sep 14 13:06:00 2023 +1000

    docs: powerpc: Document nested KVM on POWER

    Document support for nested KVM on POWER using the existing API as well
    as the new PAPR API. This includes the new HCALL interface and how it
    used by KVM.

    Signed-off-by: Michael Neuling <mikey@neuling.org>
    Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://msgid.link/20230914030600.16993-12-jniethe5@gmail.com