===================================
pSeries family boards (``pseries``)
===================================

The Power machine para-virtualized environment described by the Linux on Power
Architecture Reference ([LoPAR]_) document is called pSeries. This environment
is also known as sPAPR, System p guests, or simply Power Linux guests (although
it is capable of running other operating systems, such as AIX).

Even though pSeries is designed to behave as a guest environment, it is also
capable of acting as a hypervisor OS, providing, in that role, nested
virtualization capabilities.

Supported devices
=================

 * Multiprocessor support for many Power processor generations: POWER7,
   POWER7+, POWER8, POWER8NVL, POWER9, and Power10. Support for POWER5+ exists,
   but its state is unknown.
 * Interrupt Controller, XICS (POWER8) and XIVE (POWER9 and Power10)
 * vPHB PCIe Host bridge.
 * vscsi and vnet devices, compatible with the same devices available on a
   PowerVM hypervisor with VIOS managing LPARs.
 * Virtio based devices.
 * PCIe device pass through.

Missing devices
===============

 * SPICE support.

Firmware
========

The pSeries platform in QEMU comes with two firmware implementations:

`SLOF <https://github.com/aik/SLOF>`_ (Slimline Open Firmware) is an
implementation of the `IEEE 1275-1994, Standard for Boot (Initialization
Configuration) Firmware: Core Requirements and Practices
<https://standards.ieee.org/standard/1275-1994.html>`_.

SLOF performs bus scanning and PCI resource allocation, and provides the client
interface to boot from block devices and the network.

QEMU includes a prebuilt image of SLOF which is updated when a more recent
version is required.

VOF (Virtual Open Firmware) is a minimalistic firmware to work with
``-machine pseries,x-vof=on``. When enabled, the firmware acts as a slim
shim and QEMU implements parts of the IEEE 1275 Open Firmware interface.

VOF does not have device drivers, does not do PCI resource allocation, and
relies on ``-kernel`` being used with a Linux kernel recent enough (v5.4+)
to do the PCI resource assignment itself. It is ideal to use with petitboot.
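
As a rough sketch, a VOF boot could look like the following. The file names
``vmlinux`` and ``rootfs.cpio``, the memory size, and the kernel command line
are placeholders chosen for illustration, not values required by VOF:

.. code-block:: bash

  # Boot a kernel directly with VOF instead of SLOF.
  # vmlinux and rootfs.cpio are hypothetical file names.
  qemu-system-ppc64 -M pseries,x-vof=on \
      -m 2G -nographic \
      -kernel vmlinux \
      -initrd rootfs.cpio \
      -append "console=hvc0"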

Booting via ``-kernel`` supports the following:

+-------------------+-------------------+------------------+
| kernel            | pseries,x-vof=off | pseries,x-vof=on |
+===================+===================+==================+
| vmlinux BE        |     ✓             |     ✓            |
+-------------------+-------------------+------------------+
| vmlinux LE        |     ✓             |     ✓            |
+-------------------+-------------------+------------------+
| zImage.pseries BE |     ✓¹            |     ✓¹           |
+-------------------+-------------------+------------------+
| zImage.pseries LE |     ✓             |     ✓            |
+-------------------+-------------------+------------------+

¹ must set kernel-addr=0
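
For the big-endian ``zImage.pseries`` case, the footnote could translate into
a command line such as the one below. This sketch assumes that ``kernel-addr``
is given as a property of the ``pseries`` machine; the image file name is a
placeholder:

.. code-block:: bash

  # Load a big-endian zImage.pseries at address 0.
  # Passing kernel-addr as a machine property is an assumption here.
  qemu-system-ppc64 -M pseries,kernel-addr=0 \
      -nographic \
      -kernel zImage.pseries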

Build directions
================

.. code-block:: bash

  ./configure --target-list=ppc64-softmmu && make
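
A quick way to check the result is to ask the new binary for its machine
list. The ``build/`` path below assumes the default out-of-tree build
directory created by ``configure``; adjust it if your build lives elsewhere:

.. code-block:: bash

  # List the machine types of the freshly built binary and keep the
  # pseries entries. The build/ location is an assumption.
  ./build/qemu-system-ppc64 -M help | grep pseries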

Running instructions
====================

Select the pSeries machine type by running QEMU with the following options:

.. code-block:: bash

  qemu-system-ppc64 -M pseries <other QEMU arguments>
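
A slightly fuller invocation might look like this. The CPU model, sizes, and
the disk image name ``guest.qcow2`` are illustrative assumptions, and the
guest is given virtio disk and network devices:

.. code-block:: bash

  # Emulated POWER9 pSeries guest with a virtio disk and user-mode
  # networking. guest.qcow2 is a hypothetical disk image.
  qemu-system-ppc64 -M pseries \
      -cpu POWER9 -smp 2 -m 4G \
      -nographic \
      -drive file=guest.qcow2,format=qcow2,if=none,id=hd0 \
      -device virtio-blk-pci,drive=hd0 \
      -netdev user,id=net0 \
      -device virtio-net-pci,netdev=net0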

sPAPR devices
=============

The sPAPR specification defines a set of para-virtualized devices, which are
also supported by the pSeries machine in QEMU and can be instantiated with the
``-device`` option:

* ``spapr-vlan`` : a virtual network interface.
* ``spapr-vscsi`` : a virtual SCSI disk interface.
* ``spapr-rng`` : a pseudo-device for passing random number generator data to the
  guest (see the `H_RANDOM hypercall feature
  <https://wiki.qemu.org/Features/HRandomHypercall>`_ for details).
* ``spapr-vty``: a virtual teletype.
* ``spapr-pci-host-bridge``: a PCI host bridge.
* ``tpm-spapr``: a Trusted Platform Module (TPM).
* ``spapr-tpm-proxy``: a TPM proxy.

These are compatible with the devices historically available for use when
running the IBM PowerVM hypervisor with LPARs.

However, since these devices were originally specified with another hypervisor
and non-Linux guests in mind, you should use the virtio counterparts
(virtio-net, virtio-blk/scsi and virtio-rng, for instance) instead where
possible, since they will most probably give you better performance with Linux
guests in a QEMU environment.
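
To make the comparison concrete, the two command lines below attach a network
interface, a disk, and a random number source, first with the sPAPR devices
and then with their virtio counterparts. The IDs, the image name
``guest.qcow2``, and the choice of ``/dev/urandom`` are assumptions made for
this sketch:

.. code-block:: bash

  # sPAPR para-virtualized devices (IDs and file names are placeholders).
  qemu-system-ppc64 -M pseries -nographic \
      -netdev user,id=net0 -device spapr-vlan,netdev=net0 \
      -device spapr-vscsi,id=vscsi0 \
      -drive file=guest.qcow2,format=qcow2,if=none,id=hd0 \
      -device scsi-hd,drive=hd0,bus=vscsi0.0 \
      -object rng-random,filename=/dev/urandom,id=rng0 \
      -device spapr-rng,rng=rng0

  # The same guest using virtio devices instead.
  qemu-system-ppc64 -M pseries -nographic \
      -netdev user,id=net0 -device virtio-net-pci,netdev=net0 \
      -drive file=guest.qcow2,format=qcow2,if=none,id=hd0 \
      -device virtio-blk-pci,drive=hd0 \
      -object rng-random,filename=/dev/urandom,id=rng0 \
      -device virtio-rng-pci,rng=rng0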

The pSeries machine in QEMU is always instantiated with the following devices:

* A NVRAM device (``spapr-nvram``).
* A virtual teletype (``spapr-vty``).
* A PCI host bridge (``spapr-pci-host-bridge``).

Hence, there is no need to add them manually, unless you use the
``-nodefaults`` command line option in QEMU.

To make the contents of the default ``spapr-nvram`` device persistent, specify
a PFLASH device when starting QEMU: either use
``-drive if=pflash,file=<filename>,format=raw`` to set the default PFLASH
device, or specify one with an ID
(``-drive if=none,file=<filename>,format=raw,id=pfid``) and pass that ID to the
NVRAM device with ``-global spapr-nvram.drive=pfid``.
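
The second variant could be put together as follows. The file name
``nvram.img`` and its 64 KiB size are assumptions for this sketch; the
section does not state which sizes QEMU accepts, so adjust the size if QEMU
rejects the file:

.. code-block:: bash

  # Create an empty raw backing file for the NVRAM contents.
  # The 64 KiB size is an assumption, not a documented requirement.
  truncate -s 64K nvram.img

  # Attach it by ID and point the default spapr-nvram device at it.
  qemu-system-ppc64 -M pseries -nographic \
      -drive if=none,file=nvram.img,format=raw,id=pfid \
      -global spapr-nvram.drive=pfid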

sPAPR specification
-------------------

The main source of documentation on the sPAPR standard is the [LoPAR]_ document.
However, documentation specific to QEMU's implementation of the specification
can also be found in the QEMU documentation:

.. toctree::
   :maxdepth: 1

   ../../specs/ppc-spapr-hotplug.rst
   ../../specs/ppc-spapr-hcalls.rst
   ../../specs/ppc-spapr-numa.rst
   ../../specs/ppc-spapr-uv-hcalls.rst
   ../../specs/ppc-spapr-xive.rst

Switching between the KVM-PR and KVM-HV kernel module
=====================================================

Currently, there are two implementations of KVM on Power, ``kvm_hv.ko`` and
``kvm_pr.ko``.

If a host supports both KVM modes, and both KVM kernel modules are loaded, it is
possible to switch between the two modes with the ``kvm-type`` parameter:

* Use ``qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=PR`` to use the
  ``kvm_pr.ko`` kernel module.
* Use ``qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=HV`` to use ``kvm_hv.ko``
  instead.
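
Before choosing a ``kvm-type``, it can help to see which of the two modules
the host has loaded. The helper below is a hypothetical convenience function,
not part of QEMU; it reads ``/proc/modules``, so it will not report KVM
support that is built into the kernel rather than loaded as a module:

.. code-block:: bash

  # Print the KVM-on-Power modules found in a /proc/modules-style file
  # (default: /proc/modules). Prints nothing if neither is loaded.
  loaded_kvm_modules() {
      awk '$1 == "kvm_hv" || $1 == "kvm_pr" { print $1 }' "${1:-/proc/modules}"
  }

  loaded_kvm_modules

  # If a module is missing and the host supports it, load it as root, e.g.:
  #   modprobe kvm_pr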

KVM-PR
------

KVM-PR uses the so-called **PR**\ oblem state of the PPC CPUs to run the guests,
i.e. the virtual machine is run in user mode and all privileged instructions
trap and have to be emulated by the host. That means you can run KVM-PR inside
a pSeries guest (or a PowerVM LPAR for that matter), and that is where it
originated, as historically (prior to POWER7) it was not possible to run Linux
in hypervisor mode on a Power processor (this function was restricted to
PowerVM, the IBM proprietary hypervisor).

Because all privileged instructions are trapped, guests that use a lot of
privileged instructions run quite slowly with KVM-PR. On the other hand, because
of that, this kernel module can run on pretty much every PPC hardware, and is
able to emulate a lot of guest CPUs. This module can even be used to run other
PowerPC guests like an emulated PowerMac.

As KVM-PR can be run inside a pSeries guest, it can also provide nested
virtualization capabilities (i.e. running a guest from within a guest).

It is important to note that, as KVM-HV provides much better execution
performance, maintenance work has been much more focused on it in the past
years. Maintenance for KVM-PR has been minimal.

In order to run KVM-PR guests on POWER9 processors, you will need to start
QEMU with the ``kernel_irqchip=off`` command line option.
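
A KVM-PR invocation on a POWER9 host might then look like the sketch below.
Passing ``kernel_irqchip=off`` as part of the ``-M`` option is an assumption
about where the option goes; the section only names the option itself:

.. code-block:: bash

  # KVM-PR guest on a POWER9 host with the in-kernel irqchip disabled.
  # Placing kernel_irqchip=off in the -M option list is an assumption.
  qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=PR,kernel_irqchip=off \
      -nographic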

KVM-HV
------

KVM-HV uses the hypervisor mode of more recent Power processors, which allows
direct access to the bare metal hardware. Although POWER7 had this capability,
it was only starting with POWER8 that this was officially supported by IBM.

Originally, KVM-HV was only available when running on a PowerNV platform (a.k.a.
Power bare metal). Although it runs on a PowerNV platform, it can only be used
to start pSeries guests. As the pSeries guest doesn't have access to the
hypervisor mode of the Power CPU, it wasn't possible to run KVM-HV on a guest.
This limitation has been lifted, and now it is possible to run KVM-HV inside
pSeries guests as well, making nested virtualization possible with KVM-HV.
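
As a hedged illustration of nested KVM-HV, the first-level guest could be
started as shown below. The ``cap-nested-hv`` machine capability is not
described in this section and is named here as an assumption; check the
capabilities supported by your QEMU version before relying on it:

.. code-block:: bash

  # First-level pSeries guest that is allowed to run KVM-HV itself.
  # cap-nested-hv=on is an assumption, not taken from this section.
  qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=HV,cap-nested-hv=on \
      -cpu host -nographic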

As KVM-HV has access to privileged instructions, guests that use a lot of these
can run much faster than with KVM-PR. On the other hand, with KVM-HV the guest
CPU has to be of the same type as the host CPU, e.g. it is not possible to
specify an embedded PPC CPU for the guest. However, it is at least possible to
run the guest in a backward-compatibility mode of a previous CPU generation,
e.g. you can run a POWER7 guest on a POWER8 host by using
``-cpu POWER8,compat=power7`` as parameter to QEMU.
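
Put together with the ``kvm-type`` selection described earlier, the
compatibility mode example could be run as follows on a POWER8 host; the
remaining options are placeholders:

.. code-block:: bash

  # POWER8 host running a guest in POWER7 compatibility mode under KVM-HV.
  qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=HV \
      -cpu POWER8,compat=power7 \
      -nographic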

Modules support
===============

As noted in the sections above, each module can run in a different
environment. The following table shows in which environments each module can
run. As long as you are in a supported environment, you can run KVM-PR or KVM-HV
nested. Combinations not shown in the table are not available.

+--------------+------------+------+-------------------+----------+--------+
| Platform     | Host type  | Bits | Page table format | KVM-HV   | KVM-PR |
+==============+============+======+===================+==========+========+
| PowerNV      | bare metal | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | yes      | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | yes      | no     |
+--------------+------------+------+-------------------+----------+--------+
| pSeries [1]_ | PowerNV    | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | yes [2]_ | no     |
|              +------------+------+-------------------+----------+--------+
|              | PowerVM    | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix [3]_        | no       | yes    |
+--------------+------------+------+-------------------+----------+--------+

.. [1] On POWER9 DD2.1 processors, the page table format on the host and guest
   must be the same.

.. [2] KVM-HV cannot run nested on POWER8 machines.

.. [3] Introduced on Power10 machines.
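
To find the matching row of the table, you need the platform and page table
format of the system you are on. The helper below is a hypothetical sketch
that assumes a Linux system whose ``/proc/cpuinfo`` contains ``platform`` and
``MMU`` lines, as recent Power kernels are believed to provide; it prints
nothing if those lines are absent:

.. code-block:: bash

  # Print the platform (e.g. PowerNV or pSeries) and MMU (Hash or Radix)
  # lines from a cpuinfo-style file (default: /proc/cpuinfo).
  # The presence of these two fields is an assumption about the kernel.
  host_kvm_facts() {
      grep -E '^(platform|MMU)[[:space:]]*:' "${1:-/proc/cpuinfo}" || true
  }

  host_kvm_facts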


.. _power-papr-protected-execution-facility-pef:

POWER (PAPR) Protected Execution Facility (PEF)
-----------------------------------------------

Protected Execution Facility (PEF), also known as Secure Guest support,
is a feature found on IBM POWER9 and Power10 processors.

If a suitable firmware including an Ultravisor is installed, it adds
an extra memory protection mode to the CPU.  The ultravisor manages a
pool of secure memory which cannot be accessed by the hypervisor.

When this feature is enabled in QEMU, a guest can use ultracalls to
enter "secure mode".  This transfers most of its memory to secure
memory, where it cannot be eavesdropped on by a compromised hypervisor.

Launching
^^^^^^^^^

To launch a guest which will be permitted to enter PEF secure mode::

  $ qemu-system-ppc64 \
      -object pef-guest,id=pef0 \
      -machine confidential-guest-support=pef0 \
      ...

Live Migration
^^^^^^^^^^^^^^

Live migration is not yet implemented for PEF guests.  For
consistency, QEMU currently prevents migration if the PEF feature is
enabled, whether or not the guest has actually entered secure mode.


Maintainer contact information
==============================

Cédric Le Goater <clg@kaod.org>

Daniel Henrique Barboza <danielhb413@gmail.com>

.. [LoPAR] `Linux on Power Architecture Reference document (LoPAR) revision
   2.9 <https://openpowerfoundation.org/wp-content/uploads/2020/07/LoPAR-20200812.pdf>`_.