Adjunct Processor (AP) Device
=============================
3
4.. contents::
5
6Introduction
7------------
8
The IBM Adjunct Processor (AP) Cryptographic Facility comprises
three AP instructions and from 1 to 256 PCIe cryptographic adapter cards.
These AP devices provide cryptographic functions to all CPUs assigned to a
Linux system running in an IBM Z system LPAR.
13
14On s390x, AP adapter cards are exposed via the AP bus. This document
15describes how those cards may be made available to KVM guests using the
16VFIO mediated device framework.
17
18AP Architectural Overview
19-------------------------
20
In order to understand the terminology used in the rest of this document, let's
start with some definitions:
23
24* AP adapter
25
26  An AP adapter is an IBM Z adapter card that can perform cryptographic
27  functions. There can be from 0 to 256 adapters assigned to an LPAR depending
28  on the machine model. Adapters assigned to the LPAR in which a linux host is
29  running will be available to the linux host. Each adapter is identified by a
30  number from 0 to 255; however, the maximum adapter number allowed is
31  determined by machine model. When installed, an AP adapter is accessed by
32  AP instructions executed by any CPU.
33
34* AP domain
35
36  An adapter is partitioned into domains. Each domain can be thought of as
37  a set of hardware registers for processing AP instructions. An adapter can
38  hold up to 256 domains; however, the maximum domain number allowed is
39  determined by machine model. Each domain is identified by a number from 0 to
40  255. Domains can be further classified into two types:
41
42    * Usage domains are domains that can be accessed directly to process AP
43      commands
44
45    * Control domains are domains that are accessed indirectly by AP
46      commands sent to a usage domain to control or change the domain; for
47      example, to set a secure private key for the domain.
48
49* AP Queue
50
  An AP queue is the means by which an AP command-request message is sent to an
  AP usage domain inside a specific AP adapter. An AP queue is identified by a tuple
53  comprised of an AP adapter ID (APID) and an AP queue index (APQI). The
54  APQI corresponds to a given usage domain number within the adapter. This tuple
55  forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
56  instructions include a field containing the APQN to identify the AP queue to
57  which the AP command-request message is to be sent for processing.
58
59* AP Instructions:
60
61  There are three AP instructions:
62
63  * NQAP: to enqueue an AP command-request message to a queue
64  * DQAP: to dequeue an AP command-reply message from a queue
65  * PQAP: to administer the queues
66
67  AP instructions identify the domain that is targeted to process the AP
68  command; this must be one of the usage domains. An AP command may modify a
69  domain that is not one of the usage domains, but the modified domain
70  must be one of the control domains.
71
72Start Interpretive Execution (SIE) Instruction
73----------------------------------------------
74
75A KVM guest is started by executing the Start Interpretive Execution (SIE)
76instruction. The SIE state description is a control block that contains the
77state information for a KVM guest and is supplied as input to the SIE
78instruction. The SIE state description contains a satellite control block called
79the Crypto Control Block (CRYCB). The CRYCB contains three fields to identify
80the adapters, usage domains and control domains assigned to the KVM guest:
81
82* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
83  to the KVM guest. Each bit in the mask, from left to right, corresponds to
84  an APID from 0-255. If a bit is set, the corresponding adapter is valid for
85  use by the KVM guest.
86
87* The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains
88  assigned to the KVM guest. Each bit in the mask, from left to right,
  corresponds to an AP queue index (APQI) from 0-255. If a bit is set, the
90  corresponding queue is valid for use by the KVM guest.
91
* The AP Domain Mask (ADM) field is a bit mask that identifies the AP control
  domains assigned to the KVM guest. The ADM bit mask controls which domains
  can be changed by an AP command-request message sent to a usage domain from
  the guest. Each bit in the mask, from left to right, corresponds to a domain
  from 0-255. If a bit is set, the corresponding domain can be modified by an
  AP command-request message sent to a usage domain.
98
99If you recall from the description of an AP Queue, AP instructions include
100an APQN to identify the AP adapter and AP queue to which an AP command-request
101message is to be sent (NQAP and PQAP instructions), or from which a
102command-reply message is to be received (DQAP instruction). The validity of an
103APQN is defined by the matrix calculated from the APM and AQM; it is the
104cross product of all assigned adapter numbers (APM) with all assigned queue
105indexes (AQM). For example, if adapters 1 and 2 and usage domains 5 and 6 are
106assigned to a guest, the APQNs (1,5), (1,6), (2,5) and (2,6) will be valid for
107the guest.
108
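The cross product described above can be sketched in a few lines of POSIX
shell. This is only an illustration: ``apqns`` is a hypothetical helper name,
not part of QEMU or the kernel, and it takes the assigned adapters and usage
domains as space-separated lists.

```shell
# apqns "ADAPTERS" "DOMAINS"  (hypothetical helper)
# Print every (APID,APQI) pair in the cross product of the two lists.
apqns() {
    for apid in $1; do
        for apqi in $2; do
            printf '(%d,%d)\n' "$apid" "$apqi"
        done
    done
}

# Adapters 1 and 2 with usage domains 5 and 6, as in the text above.
apqns "1 2" "5 6"
```

For these values the helper prints the four APQNs (1,5), (1,6), (2,5) and
(2,6), one per line.
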
109The APQNs can provide secure key functionality - i.e., a private key is stored
110on the adapter card for each of its domains - so each APQN must be assigned to
111at most one guest or the linux host.
112
113Example 1: Valid configuration
114~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
115
116+----------+--------+--------+
117|          | Guest1 | Guest2 |
118+==========+========+========+
119| adapters |  1, 2  |  1, 2  |
120+----------+--------+--------+
121| domains  |  5, 6  |  7     |
122+----------+--------+--------+
123
124This is valid because both guests have a unique set of APQNs:
125
126* Guest1 has APQNs (1,5), (1,6), (2,5) and (2,6);
127* Guest2 has APQNs (1,7) and (2,7).
128
129Example 2: Valid configuration
130~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
131
132+----------+--------+--------+
133|          | Guest1 | Guest2 |
134+==========+========+========+
135| adapters |  1, 2  |  3, 4  |
136+----------+--------+--------+
137| domains  |  5, 6  |  5, 6  |
138+----------+--------+--------+
139
140This is also valid because both guests have a unique set of APQNs:
141
142* Guest1 has APQNs (1,5), (1,6), (2,5), (2,6);
143* Guest2 has APQNs (3,5), (3,6), (4,5), (4,6)
144
145Example 3: Invalid configuration
146~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
147
148+----------+--------+--------+
149|          | Guest1 | Guest2 |
150+==========+========+========+
151| adapters |  1, 2  |  1     |
152+----------+--------+--------+
153| domains  |  5, 6  |  6, 7  |
154+----------+--------+--------+
155
156This is an invalid configuration because both guests have access to
157APQN (1,6).
158
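Whether two guest configurations collide can be checked mechanically by
comparing their APQN sets. The following is a minimal sketch under the
assumption that each guest is described by two space-separated lists;
``shared_apqns`` is a hypothetical helper, not an existing tool.

```shell
# shared_apqns "ADAPTERS1" "DOMAINS1" "ADAPTERS2" "DOMAINS2"  (hypothetical)
# Print each APQN that appears in the matrices of both guests.
shared_apqns() {
    for a1 in $1; do
        for d1 in $2; do
            for a2 in $3; do
                for d2 in $4; do
                    if [ "$a1" -eq "$a2" ] && [ "$d1" -eq "$d2" ]; then
                        printf '(%d,%d)\n' "$a1" "$d1"
                    fi
                done
            done
        done
    done
}

shared_apqns "1 2" "5 6" "1 2" "7"    # Example 1: prints nothing
shared_apqns "1 2" "5 6" "3 4" "5 6"  # Example 2: prints nothing
shared_apqns "1 2" "5 6" "1" "6 7"    # Example 3: prints (1,6)
```

An empty result means the two configurations can coexist; any output lists
the APQNs that would be shared.
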
159AP Matrix Configuration on Linux Host
160-------------------------------------
161
162A linux system is a guest of the LPAR in which it is running and has access to
163the AP resources configured for the LPAR. The LPAR's AP matrix is
164configured via its Activation Profile which can be edited on the HMC. When the
165linux system is started, the AP bus will detect the AP devices assigned to the
166LPAR and create the following in sysfs::
167
168  /sys/bus/ap
169  ... [devices]
170  ...... xx.yyyy
171  ...... ...
172  ...... cardxx
173  ...... ...
174
175Where:
176
177``cardxx``
178  is AP adapter number xx (in hex)
179
180``xx.yyyy``
181  is an APQN with xx specifying the APID and yyyy specifying the APQI
182
183For example, if AP adapters 5 and 6 and domains 4, 71 (0x47), 171 (0xab) and
184255 (0xff) are configured for the LPAR, the sysfs representation on the linux
185host system would look like this::
186
187  /sys/bus/ap
188  ... [devices]
189  ...... 05.0004
190  ...... 05.0047
191  ...... 05.00ab
192  ...... 05.00ff
193  ...... 06.0004
194  ...... 06.0047
195  ...... 06.00ab
196  ...... 06.00ff
197  ...... card05
198  ...... card06
199
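The ``xx.yyyy`` naming can be reproduced with ``printf``. This sketch uses a
hypothetical helper, ``apqn_name``, to turn a decimal APID and APQI into the
name of the corresponding queue device.

```shell
# apqn_name APID APQI  (hypothetical helper)
# Format an APQN the way queue devices are named: two hex digits for the
# APID, a dot, then four hex digits for the APQI.
apqn_name() {
    printf '%02x.%04x\n' "$1" "$2"
}

# Adapters 5 and 6 with domains 4, 71, 171 and 255, as in the example above.
for apid in 5 6; do
    for apqi in 4 71 171 255; do
        apqn_name "$apid" "$apqi"
    done
done
```

The loop prints the same eight queue names that appear in the listing above,
from ``05.0004`` to ``06.00ff``.
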
200A set of default device drivers are also created to control each type of AP
201device that can be assigned to the LPAR on which a linux host is running::
202
203  /sys/bus/ap
204  ... [drivers]
205  ...... [cex2acard]        for Crypto Express 2/3 accelerator cards
206  ...... [cex2aqueue]       for AP queues served by Crypto Express 2/3
207                            accelerator cards
208  ...... [cex4card]         for Crypto Express 4/5/6 accelerator and coprocessor
209                            cards
210  ...... [cex4queue]        for AP queues served by Crypto Express 4/5/6
211                            accelerator and coprocessor cards
212  ...... [pcixcccard]       for Crypto Express 2/3 coprocessor cards
213  ...... [pcixccqueue]      for AP queues served by Crypto Express 2/3
214                            coprocessor cards
215
216Binding AP devices to device drivers
217~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
218
219There are two sysfs files that specify bitmasks marking a subset of the APQN
220range as 'usable by the default AP queue device drivers' or 'not usable by the
221default device drivers' and thus available for use by the alternate device
222driver(s). The sysfs locations of the masks are::
223
224   /sys/bus/ap/apmask
225   /sys/bus/ap/aqmask
226
227The ``apmask`` is a 256-bit mask that identifies a set of AP adapter IDs
228(APID). Each bit in the mask, from left to right (i.e., from most significant
229to least significant bit in big endian order), corresponds to an APID from
2300-255. If a bit is set, the APID is marked as usable only by the default AP
231queue device drivers; otherwise, the APID is usable by the vfio_ap
232device driver.
233
234The ``aqmask`` is a 256-bit mask that identifies a set of AP queue indexes
235(APQI). Each bit in the mask, from left to right (i.e., from most significant
236to least significant bit in big endian order), corresponds to an APQI from
2370-255. If a bit is set, the APQI is marked as usable only by the default AP
238queue device drivers; otherwise, the APQI is usable by the vfio_ap device
239driver.
240
241Take, for example, the following mask::
242
243      0x7dffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
244
245It indicates:
246
247      1, 2, 3, 4, 5, and 7-255 belong to the default drivers' pool, and 0 and 6
248      belong to the vfio_ap device driver's pool.
249
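Reading such a mask by eye is error-prone, so it can help to decode it. The
sketch below walks the hex digits from left to right and prints the bit
numbers that are cleared, that is, the IDs left to the vfio_ap device driver.
``cleared_bits`` is a hypothetical helper written for this illustration.

```shell
# cleared_bits MASK  (hypothetical helper)
# Print the bit numbers that are 0 in a hex mask, counting from the left
# (bit 0 is the most significant bit).
cleared_bits() {
    hex=${1#0x}
    bit=0
    while [ -n "$hex" ]; do
        digit=${hex%"${hex#?}"}   # first remaining hex digit
        hex=${hex#?}              # drop it from the string
        for weight in 8 4 2 1; do
            if [ "$(( 0x$digit & weight ))" -eq 0 ]; then
                printf '%d\n' "$bit"
            fi
            bit=$((bit + 1))
        done
    done
}

cleared_bits 0x7dffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
```

For the mask above this prints 0 and 6, matching the description.
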
250The APQN of each AP queue device assigned to the linux host is checked by the
251AP bus against the set of APQNs derived from the cross product of APIDs
252and APQIs marked as usable only by the default AP queue device drivers. If a
match is detected, only the default AP queue device drivers will be probed;
254otherwise, the vfio_ap device driver will be probed.
255
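The probing rule can be restated as: an APQN stays with the default drivers
only when both its APID bit in ``apmask`` and its APQI bit in ``aqmask`` are
set. The sketch below models that decision; ``mask_bit`` and ``queue_driver``
are hypothetical helpers, and hex digits missing on the right are treated as
zero.

```shell
# mask_bit MASK N  (hypothetical helper)
# Succeed if bit N, counted from the left, is set in the hex MASK.
mask_bit() {
    hex=${1#0x}
    digit=$(printf '%s\n' "$hex" | cut -c "$(( $2 / 4 + 1 ))")
    [ "$(( 0x${digit:-0} & (8 >> ($2 % 4)) ))" -ne 0 ]
}

# queue_driver APMASK AQMASK APID APQI  (hypothetical helper)
# Print which driver pool the APQN (APID,APQI) falls into.
queue_driver() {
    if mask_bit "$1" "$3" && mask_bit "$2" "$4"; then
        echo default
    else
        echo vfio_ap
    fi
}

# With bits 0 and 6 cleared in both masks:
queue_driver 0x7dff 0x7dff 5 4   # default
queue_driver 0x7dff 0x7dff 6 4   # vfio_ap (APID 6 is cleared)
```
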
256By default, the two masks are set to reserve all APQNs for use by the default
257AP queue device drivers. There are two ways the default masks can be changed:
258
259 1. The sysfs mask files can be edited by echoing a string into the
260    respective sysfs mask file in one of two formats:
261
262    * An absolute hex string starting with 0x - like "0x12345678" - sets
263      the mask. If the given string is shorter than the mask, it is padded
264      with 0s on the right; for example, specifying a mask value of 0x41 is
265      the same as specifying::
266
267           0x4100000000000000000000000000000000000000000000000000000000000000
268
269      Keep in mind that the mask reads from left to right (i.e., most
270      significant to least significant bit in big endian order), so the mask
271      above identifies device numbers 1 and 7 (``01000001``).
272
273      If the string is longer than the mask, the operation is terminated with
274      an error (EINVAL).
275
    * Individual bits in the mask can be switched on and off by specifying
      each bit number to be switched in a comma separated list. Each bit
      number string must be prepended with a plus (``+``) or minus (``-``) to
      indicate the corresponding bit is to be switched on (``+``) or off
      (``-``). Some valid values are::
281
282           "+0"    switches bit 0 on
283           "-13"   switches bit 13 off
284           "+0x41" switches bit 65 on
285           "-0xff" switches bit 255 off
286
287      The following example::
288
289              +0,-6,+0x47,-0xf0
290
291      Switches bits 0 and 71 (0x47) on
292      Switches bits 6 and 240 (0xf0) off
293
294      Note that the bits not specified in the list remain as they were before
295      the operation.
296
297 2. The masks can also be changed at boot time via parameters on the kernel
298    command line like this::
299
         ap.apmask=0xffff ap.aqmask=0x40
301
302    This would create the following masks:
303
304    apmask::
305
306            0xffff000000000000000000000000000000000000000000000000000000000000
307
308    aqmask::
309
310            0x4000000000000000000000000000000000000000000000000000000000000000
311
312    Resulting in these two pools::
313
314            default drivers pool:    adapter 0-15, domain 1
315            alternate drivers pool:  adapter 16-255, domains 0, 2-255
316
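The two mask formats described above can be modelled in plain shell to check
a value before writing it to sysfs. This is a sketch, not the kernel's
parser: ``pad_mask`` and ``apply_bit_list`` are hypothetical helpers, and the
current mask is represented simply as a space-separated list of bits that are
on.

```shell
# pad_mask MASK  (hypothetical helper)
# Right-pad an absolute hex mask with zeros to 256 bits (64 hex digits).
# Fails for longer strings, mirroring the EINVAL case.
pad_mask() {
    hex=${1#0x}
    [ "${#hex}" -le 64 ] || return 1
    while [ "${#hex}" -lt 64 ]; do
        hex="${hex}0"
    done
    printf '0x%s\n' "$hex"
}

# apply_bit_list "BITS_ON" "+N,-M,..."  (hypothetical helper)
# Print the bit numbers that are on after applying a relative list to the
# set of bits that were on before.
apply_bit_list() {
    bit=0
    while [ "$bit" -le 255 ]; do
        state=off
        for b in $1; do
            if [ "$b" -eq "$bit" ]; then state=on; fi
        done
        old_ifs=$IFS
        IFS=,
        for item in $2; do
            # Strip the sign; shell arithmetic accepts decimal and 0x hex.
            if [ "$(( ${item#[+-]} ))" -eq "$bit" ]; then
                case $item in
                    +*) state=on ;;
                    -*) state=off ;;
                esac
            fi
        done
        IFS=$old_ifs
        if [ "$state" = on ]; then printf '%d\n' "$bit"; fi
        bit=$((bit + 1))
    done
}

pad_mask 0x41
apply_bit_list "5 6 240" "+0,-6,+0x47,-0xf0"
```

Starting from bits 5, 6 and 240 being on, the list from the example leaves
bits 0, 5 and 71 on; bit 5 is untouched because it is not named in the list.
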
317Configuring an AP matrix for a linux guest
318~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
319
320The sysfs interfaces for configuring an AP matrix for a guest are built on the
321VFIO mediated device framework. To configure an AP matrix for a guest, a
322mediated matrix device must first be created for the ``/sys/devices/vfio_ap/matrix``
device. When the vfio_ap device driver is loaded, it registers with the VFIO
mediated device framework. When the driver registers, the sysfs interfaces for
creating mediated matrix devices are created::
326
327  /sys/devices
328  ... [vfio_ap]
329  ......[matrix]
330  ......... [mdev_supported_types]
331  ............ [vfio_ap-passthrough]
332  ............... create
333  ............... [devices]
334
335A mediated AP matrix device is created by writing a UUID to the attribute file
336named ``create``, for example::
337
   uuidgen > create
339
340or
341
342::
343
   echo $uuid > create
345
346When a mediated AP matrix device is created, a sysfs directory named after
347the UUID is created in the ``devices`` subdirectory::
348
349  /sys/devices
350  ... [vfio_ap]
351  ......[matrix]
352  ......... [mdev_supported_types]
353  ............ [vfio_ap-passthrough]
354  ............... create
355  ............... [devices]
356  .................. [$uuid]
357
358There will also be three sets of attribute files created in the mediated
359matrix device's sysfs directory to configure an AP matrix for the
360KVM guest::
361
362  /sys/devices
363  ... [vfio_ap]
364  ......[matrix]
365  ......... [mdev_supported_types]
366  ............ [vfio_ap-passthrough]
367  ............... create
368  ............... [devices]
369  .................. [$uuid]
370  ..................... assign_adapter
371  ..................... assign_control_domain
372  ..................... assign_domain
373  ..................... matrix
374  ..................... unassign_adapter
375  ..................... unassign_control_domain
376  ..................... unassign_domain
377
378``assign_adapter``
379   To assign an AP adapter to the mediated matrix device, its APID is written
380   to the ``assign_adapter`` file. This may be done multiple times to assign more
381   than one adapter. The APID may be specified using conventional semantics
382   as a decimal, hexadecimal, or octal number. For example, to assign adapters
383   4, 5 and 16 to a mediated matrix device in decimal, hexadecimal and octal
384   respectively::
385
       echo 4 > assign_adapter
       echo 0x5 > assign_adapter
       echo 020 > assign_adapter
389
390   In order to successfully assign an adapter:
391
392   * The adapter number specified must represent a value from 0 up to the
393     maximum adapter number allowed by the machine model. If an adapter number
394     higher than the maximum is specified, the operation will terminate with
395     an error (ENODEV).
396
397   * All APQNs that can be derived from the adapter ID being assigned and the
398     IDs of the previously assigned domains must be bound to the vfio_ap device
399     driver. If no domains have yet been assigned, then there must be at least
400     one APQN with the specified APID bound to the vfio_ap driver. If no such
401     APQNs are bound to the driver, the operation will terminate with an
402     error (EADDRNOTAVAIL).
403
404   * No APQN that can be derived from the adapter ID and the IDs of the
405     previously assigned domains can be assigned to another mediated matrix
406     device. If an APQN is assigned to another mediated matrix device, the
407     operation will terminate with an error (EADDRINUSE).
408
409``unassign_adapter``
410   To unassign an AP adapter, its APID is written to the ``unassign_adapter``
411   file. This may also be done multiple times to unassign more than one adapter.
412
413``assign_domain``
414   To assign a usage domain, the domain number is written into the
415   ``assign_domain`` file. This may be done multiple times to assign more than one
416   usage domain. The domain number is specified using conventional semantics as
417   a decimal, hexadecimal, or octal number. For example, to assign usage domains
418   4, 8, and 71 to a mediated matrix device in decimal, hexadecimal and octal
419   respectively::
420
      echo 4 > assign_domain
      echo 0x8 > assign_domain
      echo 0107 > assign_domain
424
425   In order to successfully assign a domain:
426
427   * The domain number specified must represent a value from 0 up to the
428     maximum domain number allowed by the machine model. If a domain number
429     higher than the maximum is specified, the operation will terminate with
430     an error (ENODEV).
431
432   * All APQNs that can be derived from the domain ID being assigned and the IDs
433     of the previously assigned adapters must be bound to the vfio_ap device
     driver. If no adapters have yet been assigned, then there must be at least
435     one APQN with the specified APQI bound to the vfio_ap driver. If no such
436     APQNs are bound to the driver, the operation will terminate with an
437     error (EADDRNOTAVAIL).
438
439   * No APQN that can be derived from the domain ID being assigned and the IDs
440     of the previously assigned adapters can be assigned to another mediated
441     matrix device. If an APQN is assigned to another mediated matrix device,
442     the operation will terminate with an error (EADDRINUSE).
443
444``unassign_domain``
445   To unassign a usage domain, the domain number is written into the
446   ``unassign_domain`` file. This may be done multiple times to unassign more than
447   one usage domain.
448
449``assign_control_domain``
450   To assign a control domain, the domain number is written into the
451   ``assign_control_domain`` file. This may be done multiple times to
452   assign more than one control domain. The domain number may be specified using
453   conventional semantics as a decimal, hexadecimal, or octal number. For
   example, to assign control domains 4, 8, and 71 to a mediated matrix device
   in decimal, hexadecimal and octal respectively::

      echo 4 > assign_control_domain
      echo 0x8 > assign_control_domain
      echo 0107 > assign_control_domain
460
461   In order to successfully assign a control domain, the domain number
462   specified must represent a value from 0 up to the maximum domain number
463   allowed by the machine model. If a control domain number higher than the
464   maximum is specified, the operation will terminate with an error (ENODEV).
465
466``unassign_control_domain``
   To unassign a control domain, the domain number is written into the
   ``unassign_control_domain`` file. This may be done multiple times to
   unassign more than one control domain.
470
Note: No changes to the AP matrix will be allowed while a guest using
the mediated matrix device is running. Attempts to assign an adapter,
domain or control domain will be rejected and an error (EBUSY) returned.
474
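The assignment attributes described above lend themselves to a small wrapper.
The following is a sketch only: ``assign_matrix`` is a hypothetical helper,
the mediated device directory is passed as its first argument, and each value
is written on its own because one write assigns one adapter or domain.

```shell
# assign_matrix MDEV_DIR "ADAPTERS" "DOMAINS" "CONTROL_DOMAINS"  (hypothetical)
# Write each ID to the matching attribute file; stop at the first failure.
assign_matrix() {
    mdev=$1
    for apid in $2; do
        echo "$apid" > "$mdev/assign_adapter" || return 1
    done
    for apqi in $3; do
        echo "$apqi" > "$mdev/assign_domain" || return 1
    done
    for dom in $4; do
        echo "$dom" > "$mdev/assign_control_domain" || return 1
    done
}

# Shell arithmetic follows the same decimal, hexadecimal and octal
# conventions, which is handy for checking a value before writing it.
printf '%d %d %d\n' "$((4))" "$((0x5))" "$((020))"   # prints: 4 5 16

# Example call on a host (as root, with $uuid set):
#   assign_matrix "/sys/devices/vfio_ap/matrix/$uuid" "5 6" "4 0xab" "4 0xab"
```

Because the helper returns on the first failed write, the caller sees errors
such as EADDRINUSE as soon as the offending assignment is attempted.
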
475Starting a Linux Guest Configured with an AP Matrix
476~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
477
478To provide a mediated matrix device for use by a guest, the following option
479must be specified on the QEMU command line::
480
   -device vfio-ap,sysfsdev=$path-to-mdev
482
483The sysfsdev parameter specifies the path to the mediated matrix device.
484There are a number of ways to specify this path::
485
486  /sys/devices/vfio_ap/matrix/$uuid
487  /sys/bus/mdev/devices/$uuid
488  /sys/bus/mdev/drivers/vfio_mdev/$uuid
489  /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/devices/$uuid
490
491When the linux guest is started, the guest will open the mediated
492matrix device's file descriptor to get information about the mediated matrix
493device. The ``vfio_ap`` device driver will update the APM, AQM, and ADM fields in
494the guest's CRYCB with the adapter, usage domain and control domains assigned
495via the mediated matrix device's sysfs attribute files. Programs running on the
496linux guest will then:
497
4981. Have direct access to the APQNs derived from the cross product of the AP
499   adapter numbers (APID) and queue indexes (APQI) specified in the APM and AQM
   fields of the guest's CRYCB respectively. These APQNs identify the AP queues
501   that are valid for use by the guest; meaning, AP commands can be sent by the
502   guest to any of these queues for processing.
503
5042. Have authorization to process AP commands to change a control domain
505   identified in the ADM field of the guest's CRYCB. The AP command must be sent
506   to a valid APQN (see 1 above).
507
508CPU model features:
509
510Three CPU model features are available for controlling guest access to AP
511facilities:
512
5131. AP facilities feature
514
515   The AP facilities feature indicates that AP facilities are installed on the
516   guest. This feature will be exposed for use only if the AP facilities
517   are installed on the host system. The feature is s390-specific and is
518   represented as a parameter of the -cpu option on the QEMU command line::
519
      qemu-system-s390x -cpu $model,ap=on|off
521
522   Where:
523
524      ``$model``
525        is the CPU model defined for the guest (defaults to the model of
526        the host system if not specified).
527
528      ``ap=on|off``
529        indicates whether AP facilities are installed (on) or not
530        (off). The default for CPU models zEC12 or newer
531        is ``ap=on``. AP facilities must be installed on the guest if a
532        vfio-ap device (``-device vfio-ap,sysfsdev=$path``) is configured
533        for the guest, or the guest will fail to start.
534
5352. Query Configuration Information (QCI) facility
536
537   The QCI facility is used by the AP bus running on the guest to query the
538   configuration of the AP facilities. This facility will be available
539   only if the QCI facility is installed on the host system. The feature is
540   s390-specific and is represented as a parameter of the -cpu option on the
541   QEMU command line::
542
      qemu-system-s390x -cpu $model,apqci=on|off
544
545   Where:
546
547      ``$model``
548        is the CPU model defined for the guest
549
550      ``apqci=on|off``
551        indicates whether the QCI facility is installed (on) or
552        not (off). The default for CPU models zEC12 or newer
553        is ``apqci=on``; for older models, QCI will not be installed.
554
555        If QCI is installed (``apqci=on``) but AP facilities are not
556        (``ap=off``), an error message will be logged, but the guest
557        will be allowed to start. It makes no sense to have QCI
558        installed if the AP facilities are not; this is considered
559        an invalid configuration.
560
561        If the QCI facility is not installed, APQNs with an APQI
562        greater than 15 will not be detected by the AP bus
563        running on the guest.
564
3. AP Facilities Test (APFT) facility
566
567   The APFT facility is used by the AP bus running on the guest to test the
568   AP facilities available for a given AP queue. This facility will be available
569   only if the APFT facility is installed on the host system. The feature is
570   s390-specific and is represented as a parameter of the -cpu option on the
571   QEMU command line::
572
      qemu-system-s390x -cpu $model,apft=on|off
574
575   Where:
576
577      ``$model``
578        is the CPU model defined for the guest (defaults to the model of
579        the host system if not specified).
580
581      ``apft=on|off``
582        indicates whether the APFT facility is installed (on) or
        not (off). The default for CPU models zEC12 or newer
        is ``apft=on``; for older models, APFT will not be
        installed.
586
587        If APFT is installed (``apft=on``) but AP facilities are not
588        (``ap=off``), an error message will be logged, but the guest
589        will be allowed to start. It makes no sense to have APFT
590        installed if the AP facilities are not; this is considered
591        an invalid configuration.
592
593        It also makes no sense to turn APFT off because the AP bus
594        running on the guest will not detect CEX4 and newer devices
595        without it. Since only CEX4 and newer devices are supported
596        for guest usage, no AP devices can be made accessible to a
597        guest started without APFT installed.
598
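Putting the device option and the three CPU model features together, the
AP-related part of a QEMU command line can be assembled as in the sketch
below. ``qemu_ap_args`` is a hypothetical helper; the ``host`` CPU model and
the ``UUID`` placeholder are assumptions made for the example, and the rest
of the command line (machine, memory, disks) is left out.

```shell
# qemu_ap_args MODEL MDEV_PATH  (hypothetical helper)
# Print the -cpu and -device arguments for a guest that uses a vfio-ap device,
# with all three AP-related CPU model features switched on explicitly.
qemu_ap_args() {
    printf '%s %s\n' "-cpu $1,ap=on,apqci=on,apft=on" \
                     "-device vfio-ap,sysfsdev=$2"
}

qemu_ap_args host /sys/bus/mdev/devices/UUID
```

Spelling out the three features is redundant on models where they default to
on, but it makes the dependency of the vfio-ap device on them visible.
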
599Hot plug a vfio-ap device into a running guest
600~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
601
602Only one vfio-ap device can be attached to the virtual machine's ap-bus, so a
603vfio-ap device can be hot plugged if and only if no vfio-ap device is attached
604to the bus already, whether via the QEMU command line or a prior hot plug
605action.
606
607To hot plug a vfio-ap device, use the QEMU ``device_add`` command::
608
609    (qemu) device_add vfio-ap,sysfsdev="$path-to-mdev",id="$id"
610
611Where the ``$path-to-mdev`` value specifies the absolute path to a mediated
612device to which AP resources to be used by the guest have been assigned.
613``$id`` is the name value for the optional id parameter.
614
615Note that on Linux guests, the AP devices will be created in the
616``/sys/bus/ap/devices`` directory when the AP bus subsequently performs its periodic
617scan, so there may be a short delay before the AP devices are accessible on the
618guest.
619
620The command will fail if:
621
622* A vfio-ap device has already been attached to the virtual machine's ap-bus.
623
624* The CPU model features for controlling guest access to AP facilities are not
625  enabled (see 'CPU model features' subsection in the previous section).
626
627Hot unplug a vfio-ap device from a running guest
628~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
629
630A vfio-ap device can be unplugged from a running KVM guest if a vfio-ap device
631has been attached to the virtual machine's ap-bus via the QEMU command line
632or a prior hot plug action.
633
634To hot unplug a vfio-ap device, use the QEMU ``device_del`` command::
635
636    (qemu) device_del "$id"
637
638Where ``$id`` is the same id that was specified at device creation.
639
640On a Linux guest, the AP devices will be removed from the ``/sys/bus/ap/devices``
641directory on the guest when the AP bus subsequently performs its periodic scan,
642so there may be a short delay before the AP devices are no longer accessible by
643the guest.
644
The command will fail if the ``$id`` specified on the ``device_del`` command
does not match the id specified when the vfio-ap device was attached to
the virtual machine's ap-bus.
648
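Since unplugging relies on the id chosen at plug time, it can be convenient to
generate both monitor commands from the same values. This is a small sketch;
``monitor_cmds`` is a hypothetical helper, and the ``UUID`` path component and
the id ``ap0`` are placeholders.

```shell
# monitor_cmds MDEV_PATH ID  (hypothetical helper)
# Print the device_add and device_del monitor commands for one vfio-ap device.
monitor_cmds() {
    printf 'device_add vfio-ap,sysfsdev="%s",id="%s"\n' "$1" "$2"
    printf 'device_del "%s"\n' "$2"
}

monitor_cmds /sys/bus/mdev/devices/UUID ap0
```

The two printed lines are meant to be entered at the ``(qemu)`` prompt, the
first to plug the device and the second to unplug it later.
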
649Example: Configure AP Matrices for Three Linux Guests
650-----------------------------------------------------
651
652Let's now provide an example to illustrate how KVM guests may be given
653access to AP facilities. For this example, we will show how to configure
654three guests such that executing the lszcrypt command on the guests would
655look like this:
656
657Guest1::
658
659  CARD.DOMAIN TYPE  MODE
660  ------------------------------
661  05          CEX5C CCA-Coproc
662  05.0004     CEX5C CCA-Coproc
663  05.00ab     CEX5C CCA-Coproc
664  06          CEX5A Accelerator
665  06.0004     CEX5A Accelerator
666  06.00ab     CEX5C CCA-Coproc
667
668Guest2::
669
670  CARD.DOMAIN TYPE  MODE
671  ------------------------------
672  05          CEX5A Accelerator
673  05.0047     CEX5A Accelerator
674  05.00ff     CEX5A Accelerator
675
676Guest3::
677
678  CARD.DOMAIN TYPE  MODE
679  ------------------------------
680  06          CEX5A Accelerator
681  06.0047     CEX5A Accelerator
682  06.00ff     CEX5A Accelerator
683
684These are the steps:
685
6861. Install the vfio_ap module on the linux host. The dependency chain for the
687   vfio_ap module is:
688
689   * iommu
690   * s390
691   * zcrypt
692   * vfio
693   * vfio_mdev
694   * vfio_mdev_device
695   * KVM
696
697   To build the vfio_ap module, the kernel build must be configured with the
698   following Kconfig elements selected:
699
700   * IOMMU_SUPPORT
701   * S390
702   * ZCRYPT
703   * S390_AP_IOMMU
704   * VFIO
705   * VFIO_MDEV
706   * VFIO_MDEV_DEVICE
707   * KVM
708
   If using make menuconfig, select the following to build the vfio_ap module::

     -> Device Drivers
711        -> IOMMU Hardware Support
712           select S390 AP IOMMU Support
713        -> VFIO Non-Privileged userspace driver framework
714           -> Mediated device driver framework
715              -> VFIO driver for Mediated devices
716     -> I/O subsystem
717        -> VFIO support for AP devices
718
2. Secure the AP queues to be used by the three guests so that the host cannot
720   access them. To secure the AP queues 05.0004, 05.0047, 05.00ab, 05.00ff,
721   06.0004, 06.0047, 06.00ab, and 06.00ff for use by the vfio_ap device driver,
722   the corresponding APQNs must be removed from the default queue drivers pool
723   as follows::
724
      echo -5,-6 > /sys/bus/ap/apmask

      echo -4,-0x47,-0xab,-0xff > /sys/bus/ap/aqmask
728
729   This will result in AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004,
730   06.0047, 06.00ab, and 06.00ff getting bound to the vfio_ap device driver. The
731   sysfs directory for the vfio_ap device driver will now contain symbolic links
732   to the AP queue devices bound to it::
733
734     /sys/bus/ap
735     ... [drivers]
736     ...... [vfio_ap]
737     ......... [05.0004]
738     ......... [05.0047]
739     ......... [05.00ab]
740     ......... [05.00ff]
741     ......... [06.0004]
742     ......... [06.0047]
743     ......... [06.00ab]
744     ......... [06.00ff]
745
   Keep in mind that only type 10 and newer adapters (i.e., CEX4 and later)
   can be bound to the vfio_ap device driver. Older devices are not supported
   in order to keep the implementation simple: they will go out of service in
   the relatively near future, and there are few older systems on which to
   test them.
751
752   The administrator, therefore, must take care to secure only AP queues that
753   can be bound to the vfio_ap device driver. The device type for a given AP
754   queue device can be read from the parent card's sysfs directory. For example,
755   to see the hardware type of the queue 05.0004::
756
     cat /sys/bus/ap/devices/card05/hwtype
758
759   The hwtype must be 10 or higher (CEX4 or newer) in order to be bound to the
760   vfio_ap device driver.
761
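The hardware type check from step 2 can be scripted before the masks are
changed. This is a sketch under the assumption that ``hwtype`` contains a
plain decimal number; ``vfio_ap_capable`` is a hypothetical helper, and the
loop is skipped on systems where the card directories do not exist.

```shell
# vfio_ap_capable CARD_DIR  (hypothetical helper)
# Succeed if the card's hwtype is 10 or higher (CEX4 or newer).
vfio_ap_capable() {
    hwtype=$(cat "$1/hwtype") || return 1
    [ "$hwtype" -ge 10 ]
}

# Check the two cards used in this example before securing their queues.
for card in /sys/bus/ap/devices/card05 /sys/bus/ap/devices/card06; do
    if [ -r "$card/hwtype" ] && ! vfio_ap_capable "$card"; then
        echo "$card is too old for the vfio_ap device driver" >&2
    fi
done
```
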
3. Create the mediated devices needed to configure the AP matrixes for the
   three guests and to provide an interface to the vfio_ap driver for
   use by the guests::

     /sys/devices/vfio_ap/matrix/
     ... [mdev_supported_types]
     ...... [vfio_ap-passthrough] (passthrough mediated matrix device type)
     ......... create
     ......... [devices]

   To create the mediated devices for the three guests, run the following
   from the ``vfio_ap-passthrough`` directory shown above::

       uuidgen > create
       uuidgen > create
       uuidgen > create

   or, using UUIDs that were generated beforehand::

       echo $uuid1 > create
       echo $uuid2 > create
       echo $uuid3 > create

   This will create three mediated devices in the [devices] subdirectory, each
   named after the UUID used to create it. We'll call them $uuid1, $uuid2 and
   $uuid3. This is the sysfs directory structure after creation::

     /sys/devices/vfio_ap/matrix/
     ... [mdev_supported_types]
     ...... [vfio_ap-passthrough]
     ......... [devices]
     ............ [$uuid1]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

     ............ [$uuid2]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

     ............ [$uuid3]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

4. The administrator now needs to configure the matrixes for the mediated
   devices $uuid1 (for Guest1), $uuid2 (for Guest2) and $uuid3 (for Guest3).
   The commands below are run from within the sysfs directory of the
   respective mediated device.

   This is how the matrix is configured for Guest1::

      echo 5 > assign_adapter
      echo 6 > assign_adapter
      echo 4 > assign_domain
      echo 0xab > assign_domain

   Control domains can similarly be assigned using the assign_control_domain
   sysfs file.

   If a mistake is made configuring an adapter, domain or control domain,
   you can use the ``unassign_xxx`` interfaces to unassign the adapter, domain or
   control domain.

   To display the matrix configuration for Guest1::

         cat matrix

   The output will display the APQNs in the format ``xx.yyyy``, where xx is
   the adapter number and yyyy is the domain number. The output for Guest1
   will look like this::

         05.0004
         05.00ab
         06.0004
         06.00ab

   This is how the matrix is configured for Guest2::

      echo 5 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

   This is how the matrix is configured for Guest3::

      echo 6 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

5. Start Guest1::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid1 ...

6. Start Guest2::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid2 ...

7. Start Guest3::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid3 ...

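The ``matrix`` output shown in step 4 is the cross product of the adapters and
domains assigned to a mediated device. As a minimal sketch, the hypothetical
helper ``apqn_matrix`` below (an illustration only, not part of the kernel or
QEMU) prints the APQNs to expect for a given assignment, so that they can be
compared against ``cat matrix`` before a guest is started:

```shell
#!/bin/sh
# apqn_matrix is a hypothetical helper used for illustration only.
# It prints one APQN per line in the xx.yyyy format (adapter.domain)
# for every combination of the given adapters and domains.
apqn_matrix() {
    adapters=$1   # space-separated adapter numbers, e.g. "5 6"
    domains=$2    # space-separated domain numbers, e.g. "4 0xab"
    for a in $adapters; do
        for d in $domains; do
            # printf accepts both decimal and 0x-prefixed hex arguments
            printf '%02x.%04x\n' "$a" "$d"
        done
    done
}

# Guest1 from the example: adapters 5 and 6, domains 4 and 0xab
apqn_matrix "5 6" "4 0xab"
```

For Guest1 this prints 05.0004, 05.00ab, 06.0004 and 06.00ab, one per line,
matching the listing in step 4.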
When a guest is shut down, its mediated matrix device may be removed.

Using our example again, to remove the mediated matrix device $uuid1::

   /sys/devices/vfio_ap/matrix/
   ... [mdev_supported_types]
   ...... [vfio_ap-passthrough]
   ......... [devices]
   ............ [$uuid1]
   ............... remove

   echo 1 > remove

This will remove all of the mdev matrix device's sysfs structures including
the mdev device itself. To recreate and reconfigure the mdev matrix device,
all of the steps starting with step 3 will have to be performed again. Note
that the remove will fail if a guest using the mdev is still running.

It is not necessary to remove an mdev matrix device, but one may want to
remove it if no guest will use it during the remaining lifetime of the linux
host. If the mdev matrix device is removed, one may want to also reconfigure
the pool of adapters and queues reserved for use by the default drivers.

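The removal and the reconfiguration of the default drivers pool can be
combined, roughly as sketched below. This is an illustration under stated
assumptions, not a recommended procedure: ``release_ap_queues`` and its
``root`` parameter are hypothetical names (``root`` would be ``/sys`` on a real
host), and the ``+`` prefix is the counterpart of the ``-`` prefix used on
``apmask`` and ``aqmask`` in step 2. Adapters and domains should only be given
back once no remaining mediated device in use by a guest still relies on them:

```shell
#!/bin/sh
# release_ap_queues is a hypothetical helper used for illustration only.
#   $1  sysfs root (assumed to be /sys on a real host)
#   $2  UUID of the mediated matrix device to remove
#   $3  apmask update, e.g. +5,+6
#   $4  aqmask update, e.g. +4,+0x47,+0xab,+0xff
release_ap_queues() {
    root=$1
    uuid=$2
    adapters=$3
    domains=$4
    mdev="$root/devices/vfio_ap/matrix/$uuid"

    # The write fails while a guest is still using the mdev; stop in that case
    echo 1 > "$mdev/remove" || return 1

    # Return the adapters and domains to the default drivers pool
    echo "$adapters" > "$root/bus/ap/apmask"
    echo "$domains" > "$root/bus/ap/aqmask"
}

# Example call, commented out because it changes host configuration:
# release_ap_queues /sys "$uuid3" +5,+6 +4,+0x47,+0xab,+0xff
```

In the three-guest example, the final call would only be appropriate after all
three guests have been shut down and the other two mediated devices have
already been removed.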
Limitations
-----------

* The KVM/kernel interfaces do not provide a way to prevent an APQN from being
  restored to the default drivers pool while its queue is still assigned to a
  mediated device in use by a guest. It is incumbent upon the administrator to
  ensure that no guest is using a mediated device to which the APQN is
  assigned; otherwise the host would be given access to the private data of
  the AP queue device, such as a private key configured specifically for the
  guest.

* Dynamically assigning AP resources to or unassigning AP resources from a
  mediated matrix device (see the `Configuring an AP matrix for a linux guest`_
  section above) while a running guest is using it is currently not supported.

* Live guest migration is not supported for guests using AP devices. If a guest
  is using AP devices, the vfio-ap device configured for the guest must be
  unplugged before migrating the guest (see the `Hot unplug a vfio-ap device
  from a running guest`_ section above).
