XIVE for sPAPR (pseries machines)
=================================

The POWER9 processor comes with a new interrupt controller
architecture called XIVE, for "eXternal Interrupt Virtualization
Engine". It supports a larger number of interrupt sources and offers
virtualization features which enable the HW to deliver interrupts
directly to virtual processors without hypervisor assistance.

A QEMU ``pseries`` machine (which is PAPR compliant) using POWER9
processors can run under two interrupt modes:

- *Legacy Compatibility Mode*

  the hypervisor provides identical interfaces and similar
  functionality to PAPR+ Version 2.7. This is the default mode.

  It is also referred to as *XICS* in QEMU.

- *XIVE native exploitation mode*

  the hypervisor provides new interfaces to manage the XIVE control
  structures, and provides direct control for interrupt management
  through MMIO pages.

The interrupt mode used by the machine is negotiated with the guest
O/S during the Client Architecture Support (CAS) negotiation
sequence. The two modes are mutually exclusive.

Both interrupt modes share the same IRQ number space. See below for the
layout.

CAS Negotiation
---------------

QEMU advertises the supported interrupt modes in byte 23 of the
device tree property "ibm,arch-vec-5-platform-support", and the OS
selection for XIVE is indicated in byte 23 of the
"ibm,architecture-vec-5" property.

The interrupt modes supported by the machine depend on the CPU type
(POWER9 is required for XIVE) but also on the machine property
``ic-mode``, which can be set on the command line. It can take the
following values: ``xics``, ``xive`` and ``dual``. Currently ``xics``
is the default, but this may change in the future.

The chosen interrupt mode is activated after a reconfiguration done
in a machine reset.

XIVE Device tree properties
---------------------------

The properties for the PAPR interrupt controller node when the *XIVE
native exploitation mode* is selected should contain:

- ``device_type``

  value should be "power-ivpe".

- ``compatible``

  value should be "ibm,power-ivpe".

- ``reg``

  contains the base address and size of the thread interrupt
  management areas (TIMA), for the User level and for the Guest OS
  level. Only the Guest OS level is taken into account today.

- ``ibm,xive-eq-sizes``

  the size of the event queues. One cell per size supported, contains
  log2 of size, in ascending order.

- ``ibm,xive-lisn-ranges``

  the IRQ interrupt number ranges assigned to the guest for the IPIs.

The root node also exports:

- ``ibm,plat-res-int-priorities``

  contains a list of priorities that the hypervisor has reserved for
  its own use.
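As a rough illustration of the ``ibm,xive-eq-sizes`` encoding, the
following Python sketch decodes a raw property value into queue sizes.
It assumes the usual device tree convention that a cell is a 32-bit
big-endian integer. The function name and the sample values are
hypothetical and are not taken from QEMU.

.. code-block:: python

   import struct


   def decode_eq_sizes(prop: bytes) -> list[int]:
       """Return the event queue sizes encoded in the property value."""
       if len(prop) % 4:
           raise ValueError("property length is not a multiple of a cell")
       # Each cell is a 32-bit big-endian integer holding log2(size).
       shifts = struct.unpack(f">{len(prop) // 4}I", prop)
       return [1 << shift for shift in shifts]


   # Illustrative value only: four cells holding 12, 16, 21 and 24.
   sample = struct.pack(">4I", 12, 16, 21, 24)
   print([hex(size) for size in decode_eq_sizes(sample)])

Because the cells are listed in ascending order, the first and last
entries of the returned list are the smallest and largest supported
sizes.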

IRQ number space
----------------

The IRQ number space of the ``pseries`` machine is 8K wide and is the same
for both interrupt modes. The different ranges are defined as follows:

- ``0x0000 .. 0x0FFF`` 4K CPU IPIs (only used under XIVE)
- ``0x1000 .. 0x1000`` 1 EPOW
- ``0x1001 .. 0x1001`` 1 HOTPLUG
- ``0x1100 .. 0x11FF`` 256 VIO devices
- ``0x1200 .. 0x127F`` 32 PHB devices
- ``0x1280 .. 0x12FF`` unused
- ``0x1300 .. 0x1FFF`` PHB MSIs
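The layout can be expressed as a small lookup table. The Python sketch
below is only an illustration of the ranges listed above: the table and
the ``classify_irq`` helper are hypothetical names, not part of QEMU.

.. code-block:: python

   # (first, last, description) tuples copied from the documented layout.
   IRQ_RANGES = [
       (0x0000, 0x0FFF, "CPU IPIs"),
       (0x1000, 0x1000, "EPOW"),
       (0x1001, 0x1001, "HOTPLUG"),
       (0x1100, 0x11FF, "VIO devices"),
       (0x1200, 0x127F, "PHB devices"),
       (0x1280, 0x12FF, "unused"),
       (0x1300, 0x1FFF, "PHB MSIs"),
   ]


   def classify_irq(irq: int) -> str:
       """Name the documented range that an IRQ number falls in."""
       for first, last, name in IRQ_RANGES:
           if first <= irq <= last:
               return name
       # Numbers the layout does not mention (for example 0x1002) land here.
       return "not listed"


   for irq in (0x0002, 0x1001, 0x1100, 0x1203, 0x1301):
       print(f"{irq:#06x}: {classify_irq(irq)}")

A linear scan is enough here since the table has only seven entries.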

Monitoring XIVE
---------------

The state of the XIVE interrupt controller can be queried through the
monitor command ``info pic``. The output comes in two parts.

First, the state of the thread interrupt context registers is dumped
for each CPU:

::

   (qemu) info pic
   CPU[0000]:   QW   NSR CPPR IPB LSMFB ACK# INC AGE PIPR  W2
   CPU[0000]: USER    00   00  00    00   00  00  00   00  00000000
   CPU[0000]:   OS    00   ff  00    00   ff  00  ff   ff  80000400
   CPU[0000]: POOL    00   00  00    00   00  00  00   00  00000000
   CPU[0000]: PHYS    00   00  00    00   00  00  00   ff  00000000
   ...

In the case of a ``pseries`` machine, QEMU acts as the hypervisor and only
the O/S and USER register rings make sense. ``W2`` contains the vCPU CAM
line which is set to the VP identifier.

Then comes the routing information which aggregates the EAS and the
END configuration:

::

   ...
   LISN         PQ    EISN     CPU/PRIO EQ
   00000000 MSI --    00000010   0/6    380/16384 @1fe3e0000 ^1 [ 80000010 ... ]
   00000001 MSI --    00000010   1/6    305/16384 @1fc230000 ^1 [ 80000010 ... ]
   00000002 MSI --    00000010   2/6    220/16384 @1fc2f0000 ^1 [ 80000010 ... ]
   00000003 MSI --    00000010   3/6    201/16384 @1fc390000 ^1 [ 80000010 ... ]
   00000004 MSI -Q  M 00000000
   00000005 MSI -Q  M 00000000
   00000006 MSI -Q  M 00000000
   00000007 MSI -Q  M 00000000
   00001000 MSI --    00000012   0/6    380/16384 @1fe3e0000 ^1 [ 80000010 ... ]
   00001001 MSI --    00000013   0/6    380/16384 @1fe3e0000 ^1 [ 80000010 ... ]
   00001100 MSI --    00000100   1/6    305/16384 @1fc230000 ^1 [ 80000010 ... ]
   00001101 MSI -Q  M 00000000
   00001200 LSI -Q  M 00000000
   00001201 LSI -Q  M 00000000
   00001202 LSI -Q  M 00000000
   00001203 LSI -Q  M 00000000
   00001300 MSI --    00000102   1/6    305/16384 @1fc230000 ^1 [ 80000010 ... ]
   00001301 MSI --    00000103   2/6    220/16384 @1fc2f0000 ^1 [ 80000010 ... ]
   00001302 MSI --    00000104   3/6    201/16384 @1fc390000 ^1 [ 80000010 ... ]

The source information and configuration:

- The ``LISN`` column outputs the interrupt number of the source in
  range ``[ 0x0 ... 0x1FFF ]`` and its type: ``MSI`` or ``LSI``
- The ``PQ`` column reflects the state of the PQ bits of the source:

  - ``--`` source is ready to take events
  - ``P-`` an event was sent and an EOI is PENDING
  - ``PQ`` an event was QUEUED
  - ``-Q`` source is OFF

  An ``M`` indicates that the source is *MASKED* at the EAS level.
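The ``PQ`` states can be summarised as a lookup table. The Python
sketch below merely restates the list above in code form; the names
``PQ_STATES`` and ``describe_source`` are hypothetical.

.. code-block:: python

   # Meaning of each two-character PQ field, as documented for ``info pic``.
   PQ_STATES = {
       "--": "ready to take events",
       "P-": "event sent, EOI pending",
       "PQ": "event queued",
       "-Q": "off",
   }


   def describe_source(pq: str, masked: bool = False) -> str:
       """Describe a source from its PQ field and its optional M flag."""
       state = PQ_STATES[pq]  # raises KeyError on an unknown field
       return f"{state}, masked at the EAS level" if masked else state


   print(describe_source("--"))
   print(describe_source("-Q", masked=True))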

The targeting configuration:

- The ``EISN`` column is the event data that will be queued in the event
  queue of the O/S.
- The ``CPU/PRIO`` column is the tuple defining the CPU number and
  priority queue serving the source.
- The ``EQ`` column outputs:

  - the current index of the event queue / the max number of entries
  - the O/S event queue address
  - the toggle bit
  - the last entries that were pushed in the event queue.
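To show how the columns fit together, here is a minimal Python sketch
that parses one routing line of the ``info pic`` output. The regular
expression is inferred from the sample dump in this document only, so
the exact format is an assumption: it may not match other QEMU
versions, and it treats the CPU, priority and queue index fields as
decimal. ``parse_route`` is a hypothetical helper, not a QEMU API.

.. code-block:: python

   import re

   # Pattern inferred from the sample dump; masked sources carry no
   # CPU/PRIO or EQ columns, so that part is optional.
   LINE_RE = re.compile(
       r"^\s*(?P<lisn>[0-9a-f]{8})\s+(?P<type>MSI|LSI)\s+(?P<pq>[-P][-Q])\s+"
       r"(?P<masked>M\s+)?(?P<eisn>[0-9a-f]{8})"
       r"(?:\s+(?P<cpu>\d+)/(?P<prio>\d+)\s+(?P<index>\d+)/(?P<entries>\d+)"
       r"\s+@(?P<addr>[0-9a-f]+)\s+\^(?P<toggle>[01]))?"
   )


   def parse_route(line: str) -> dict:
       """Split one routing line into its source and targeting fields."""
       m = LINE_RE.match(line)
       if m is None:
           raise ValueError(f"unrecognised line: {line!r}")
       route = {
           "lisn": int(m["lisn"], 16),
           "type": m["type"],
           "pq": m["pq"],
           "masked": m["masked"] is not None,
           "eisn": int(m["eisn"], 16),
       }
       if m["cpu"] is not None:
           # Targeting columns are only present for routed sources.
           route.update(
               cpu=int(m["cpu"]),
               prio=int(m["prio"]),
               eq_index=int(m["index"]),
               eq_entries=int(m["entries"]),
               eq_addr=int(m["addr"], 16),
               toggle=int(m["toggle"]),
           )
       return route


   print(parse_route(
       "00001300 MSI --    00000102   1/6    305/16384 @1fc230000 ^1 [ 80000010 ... ]"
   ))
   print(parse_route("00000004 MSI -Q  M 00000000"))

The trailing list of queue entries is deliberately ignored, since the
dump elides it with ``...``.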
175