.. include:: <isonum.txt>

=====================
VFIO Mediated devices
=====================

:Copyright: |copy| 2016, NVIDIA CORPORATION. All rights reserved.
:Author: Neo Jia <cjia@nvidia.com>
:Author: Kirti Wankhede <kwankhede@nvidia.com>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License version 2 as
published by the Free Software Foundation.


Virtual Function I/O (VFIO) Mediated devices[1]
===============================================

The number of use cases for virtualizing DMA devices that do not have built-in
SR-IOV capability is increasing. Previously, to virtualize such devices,
developers had to create their own management interfaces and APIs, and then
integrate them with user space software. To simplify integration with user space
software, we have identified common requirements and a unified management
interface for such devices.

The VFIO driver framework provides unified APIs for direct device access. It is
an IOMMU/device-agnostic framework for exposing direct device access to user
space in a secure, IOMMU-protected environment. This framework is used for
multiple devices, such as GPUs, network adapters, and compute accelerators. With
direct device access, virtual machines or user space applications have direct
access to the physical device. This framework is reused for mediated devices.

The mediated core driver provides a common interface for mediated device
management that can be used by drivers of different devices. This module
provides a generic interface to perform these operations:

* Create and destroy a mediated device
* Add a mediated device to and remove it from a mediated bus driver
* Add a mediated device to and remove it from an IOMMU group

The mediated core driver also provides an interface to register a bus driver.
For example, the mediated VFIO mdev driver is designed for mediated devices and
supports VFIO APIs. The mediated bus driver adds a mediated device to and
removes it from a VFIO group.

The following high-level block diagram shows the main components and interfaces
in the VFIO mediated driver framework. The diagram shows NVIDIA, Intel, and IBM
devices as examples, because these devices are the first to use this module::

     +---------------+
     |               |
     | +-----------+ |  mdev_register_driver() +--------------+
     | |           | +<------------------------+              |
     | |  mdev     | |                         |              |
     | |  bus      | +------------------------>+ vfio_mdev.ko |<-> VFIO user
     | |  driver   | |     probe()/remove()    |              |    APIs
     | |           | |                         +--------------+
     | +-----------+ |
     |               |
     |  MDEV CORE    |
     |   MODULE      |
     |   mdev.ko     |
     | +-----------+ |  mdev_register_device() +--------------+
     | |           | +<------------------------+              |
     | |           | |                         |  nvidia.ko   |<-> physical
     | |           | +------------------------>+              |    device
     | |           | |        callbacks        +--------------+
     | | Physical  | |
     | |  device   | |  mdev_register_device() +--------------+
     | | interface | +<------------------------+              |
     | |           | |                         |  i915.ko     |<-> physical
     | |           | +------------------------>+              |    device
     | |           | |        callbacks        +--------------+
     | |           | |
     | |           | |  mdev_register_device() +--------------+
     | |           | +<------------------------+              |
     | |           | |                         | ccw_device.ko|<-> physical
     | |           | +------------------------>+              |    device
     | |           | |        callbacks        +--------------+
     | +-----------+ |
     +---------------+


Registration Interfaces
=======================

The mediated core driver provides the following types of registration
interfaces:

* Registration interface for a mediated bus driver
* Physical device driver interface

Registration Interface for a Mediated Bus Driver
------------------------------------------------

The registration interface for a mediated device driver provides the following
structure to represent a mediated device's driver::

     /*
      * struct mdev_driver [2] - Mediated device's driver
      * @probe: called when new device created
      * @remove: called when device removed
      * @supported_type_groups: attributes that define the supported types
      * @driver: device driver structure
      */
     struct mdev_driver {
             int  (*probe)  (struct mdev_device *dev);
             void (*remove) (struct mdev_device *dev);
             struct attribute_group **supported_type_groups;
             struct device_driver    driver;
     };

A mediated bus driver for mdev should use this structure in the function calls
to register and unregister itself with the core driver:

* Register::

    extern int  mdev_register_driver(struct mdev_driver *drv);

* Unregister::

    extern void mdev_unregister_driver(struct mdev_driver *drv);

The mediated bus driver's probe function should create a vfio_device on top of
the mdev_device and connect it to an appropriate implementation of
vfio_device_ops.

When a driver wants to add the GUID creation sysfs files to an existing device
that it has probed, it should call::

        extern int  mdev_register_device(struct device *dev,
                                         struct mdev_driver *mdev_driver);

This call provides the 'mdev_supported_types/XX/create' files, which can then be
used to trigger the creation of an mdev_device. The created mdev_device is
attached to the specified driver.

When the driver needs to remove itself, it calls::

        extern void mdev_unregister_device(struct device *dev);

This call unbinds and destroys all the created mdevs and removes the sysfs
files.

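The order in which the callbacks run can be modelled in user space. The
following is a toy sketch, not kernel code: every ``toy_`` type and function is
a hypothetical stand-in invented for illustration, and only the shape of the
probe() and remove() callbacks follows struct mdev_driver as shown above.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct mdev_device; not the kernel type. */
struct toy_mdev_device {
	const char *uuid;
	int bound;		/* non-zero while a driver is attached */
};

/* Mirrors only the two callbacks of struct mdev_driver shown above. */
struct toy_mdev_driver {
	int  (*probe)(struct toy_mdev_device *dev);
	void (*remove)(struct toy_mdev_device *dev);
};

/* Toy equivalent of creating a device: attach it by calling probe(). */
static int toy_core_create(struct toy_mdev_driver *drv,
			   struct toy_mdev_device *dev)
{
	int ret;

	if (dev->bound)
		return -1;	/* already attached */
	ret = drv->probe(dev);
	if (ret == 0)
		dev->bound = 1;
	return ret;
}

/* Toy equivalent of destroying a device: detach it by calling remove(). */
static void toy_core_destroy(struct toy_mdev_driver *drv,
			     struct toy_mdev_device *dev)
{
	if (!dev->bound)
		return;
	drv->remove(dev);
	dev->bound = 0;
}

/* Example callbacks that only count how often they run. */
static int toy_probe_calls, toy_remove_calls;

static int toy_probe(struct toy_mdev_device *dev)
{
	(void)dev;
	toy_probe_calls++;
	return 0;
}

static void toy_remove(struct toy_mdev_device *dev)
{
	(void)dev;
	toy_remove_calls++;
}
```

In this model, one create followed by one destroy runs each callback exactly
once, and a second create on an already attached device is rejected without
calling probe() again. A real bus driver would do its vfio_device setup inside
probe() instead of counting.
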
Mediated Device Management Interface Through sysfs
==================================================

The management interface through sysfs enables user space software, such as
libvirt, to query and configure mediated devices in a hardware-agnostic fashion.
This management interface provides flexibility to the underlying physical
device's driver to support features such as:

* Mediated device hot plug
* Multiple mediated devices in a single virtual machine
* Multiple mediated devices from different physical devices

Links in the mdev_bus Class Directory
-------------------------------------

The /sys/class/mdev_bus/ directory contains links to devices that are registered
with the mdev core driver.

Directories and Files Under the sysfs for Each Physical Device
--------------------------------------------------------------

::

  |- [parent physical device]
  |--- Vendor-specific-attributes [optional]
  |--- [mdev_supported_types]
  |     |--- [<type-id>]
  |     |   |--- create
  |     |   |--- name
  |     |   |--- available_instances
  |     |   |--- device_api
  |     |   |--- description
  |     |   |--- [devices]
  |     |--- [<type-id>]
  |     |   |--- create
  |     |   |--- name
  |     |   |--- available_instances
  |     |   |--- device_api
  |     |   |--- description
  |     |   |--- [devices]
  |     |--- [<type-id>]
  |          |--- create
  |          |--- name
  |          |--- available_instances
  |          |--- device_api
  |          |--- description
  |          |--- [devices]

* [mdev_supported_types]

  The list of currently supported mediated device types and their details.

  [<type-id>], device_api, and available_instances are mandatory attributes
  that should be provided by the vendor driver.

* [<type-id>]

  The [<type-id>] name is created by adding the device driver string as a prefix
  to the string provided by the vendor driver. The format of this name is as
  follows::

        sprintf(buf, "%s-%s", dev_driver_string(parent->dev), group->name);

  (or using mdev_parent_dev(mdev) to arrive at the parent device outside
  of the core mdev code)

* device_api

  This attribute should show which device API is being created, for example,
  "vfio-pci" for a PCI device.

* available_instances

  This attribute should show the number of devices of type <type-id> that can be
  created.

* [devices]

  This directory contains links to the devices of type <type-id> that have been
  created.

* name

  This attribute should show a human-readable name. This attribute is optional.

* description

  This attribute should show a brief description of the type and its features.
  This attribute is optional.

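The [<type-id>] naming rule described above can be tried out in user space. The
following is a minimal sketch: toy_type_id() is a hypothetical helper, and the
two strings passed to it stand in for the values that dev_driver_string() and
group->name would supply inside the kernel.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build "<driver string>-<group name>" into buf.
 * Returns 0 on success, or -1 if formatting fails or the result does not fit.
 */
static int toy_type_id(char *buf, size_t len,
		       const char *driver, const char *group)
{
	int n = snprintf(buf, len, "%s-%s", driver, group);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Assuming a driver string of "mtty" and a group name of "2", the helper produces
"mtty-2". Unlike the bare sprintf() in the format line, the sketch uses
snprintf() so that a name that is too long is reported instead of overflowing
the buffer.
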
Directories and Files Under the sysfs for Each mdev Device
----------------------------------------------------------

::

  |- [parent physical device]
  |--- [$MDEV_UUID]
         |--- remove
         |--- mdev_type {link to its type}
         |--- vendor-specific-attributes [optional]

* remove (write only)

  Writing '1' to the 'remove' file destroys the mdev device. The vendor driver
  can fail the remove() callback if that device is active and the vendor driver
  doesn't support hot unplug.

  Example::

        # echo 1 > /sys/bus/mdev/devices/$MDEV_UUID/remove

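A management tool would typically check the UUID before building the path to the
'remove' file. The following user-space sketch shows one way to do that. All
``toy_`` helpers are hypothetical names introduced for illustration; the only
thing taken from this document is the /sys/bus/mdev/devices/<UUID>/remove
layout.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if s has the 8-4-4-4-12 hexadecimal UUID text form, else 0. */
static int toy_uuid_is_valid(const char *s)
{
	size_t i;

	if (strlen(s) != 36)
		return 0;
	for (i = 0; i < 36; i++) {
		if (i == 8 || i == 13 || i == 18 || i == 23) {
			if (s[i] != '-')
				return 0;
		} else if (!isxdigit((unsigned char)s[i])) {
			return 0;
		}
	}
	return 1;
}

/* Build /sys/bus/mdev/devices/<uuid>/remove in buf; returns 0 or -1. */
static int toy_remove_path(char *buf, size_t len, const char *uuid)
{
	int n;

	if (!toy_uuid_is_valid(uuid))
		return -1;
	n = snprintf(buf, len, "/sys/bus/mdev/devices/%s/remove", uuid);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write a short string to an attribute file; returns 0 or -1. */
static int toy_sysfs_write(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(value, f) >= 0;
	/* The write reaches the file on flush, so check fclose() as well. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling toy_remove_path() and then toy_sysfs_write(path, "1") has the same
effect as the echo command above. Rejecting malformed UUIDs first keeps
arbitrary strings out of the path.
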
Mediated Device Hot Plug
------------------------

Mediated devices can be created and assigned at runtime. The procedure to hot
plug a mediated device is the same as the procedure to hot plug a PCI device.

Translation APIs for Mediated Devices
=====================================

The following APIs are provided for translating a user pfn to a host pfn in a
VFIO driver::

        int vfio_pin_pages(struct vfio_device *device, unsigned long *user_pfn,
                           int npage, int prot, unsigned long *phys_pfn);

        int vfio_unpin_pages(struct vfio_device *device, unsigned long *user_pfn,
                             int npage);

These functions call back into the back-end IOMMU module by using the pin_pages
and unpin_pages callbacks of the struct vfio_iommu_driver_ops[4]. Currently,
these callbacks are supported in the TYPE1 IOMMU module. To enable them for
other IOMMU back-end modules, such as the PPC64 sPAPR module, those modules need
to provide these two callback functions.

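Both functions take page frame numbers and a page count, not byte addresses. As
a worked illustration of the arithmetic a caller might do beforehand, here is a
minimal user-space sketch. It rests on two assumptions that are not stated in
this document: pages are 4 KiB, and a user pfn is the I/O virtual address
shifted right by the page shift. The ``toy_`` names are hypothetical and are
not part of the VFIO API.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Frame number of the page that contains the given address. */
static uint64_t toy_iova_to_pfn(uint64_t iova)
{
	return iova >> TOY_PAGE_SHIFT;
}

/* Number of pages touched by [iova, iova + size); 0 if size is 0. */
static uint64_t toy_npages(uint64_t iova, uint64_t size)
{
	if (size == 0)
		return 0;
	return toy_iova_to_pfn(iova + size - 1) - toy_iova_to_pfn(iova) + 1;
}
```

For example, a 2-byte buffer that starts at 0x1fff straddles a page boundary, so
it covers frames 1 and 2 and needs two pages pinned even though it is far
smaller than a page.
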
Using the Sample Code
=====================

mtty.c in the samples/vfio-mdev/ directory is a sample driver program that
demonstrates how to use the mediated device framework.

The sample driver creates an mdev device that simulates a serial port over a PCI
card.

1. Build and load the mtty.ko module.

   This step creates a dummy device, /sys/devices/virtual/mtty/mtty/.

   Files in this device directory in sysfs are similar to the following::

     # tree /sys/devices/virtual/mtty/mtty/
        /sys/devices/virtual/mtty/mtty/
        |-- mdev_supported_types
        |   |-- mtty-1
        |   |   |-- available_instances
        |   |   |-- create
        |   |   |-- device_api
        |   |   |-- devices
        |   |   `-- name
        |   `-- mtty-2
        |       |-- available_instances
        |       |-- create
        |       |-- device_api
        |       |-- devices
        |       `-- name
        |-- mtty_dev
        |   `-- sample_mtty_dev
        |-- power
        |   |-- autosuspend_delay_ms
        |   |-- control
        |   |-- runtime_active_time
        |   |-- runtime_status
        |   `-- runtime_suspended_time
        |-- subsystem -> ../../../../class/mtty
        `-- uevent

2. Create a mediated device by using the dummy device that you created in the
   previous step::

     # echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" > \
              /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create

3. Add parameters to qemu-kvm::

     -device vfio-pci,\
      sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001

4. Boot the VM.

   In the Linux guest VM, with no hardware on the host, the device appears
   as follows::

     # lspci -s 00:05.0 -xxvv
     00:05.0 Serial controller: Device 4348:3253 (rev 10) (prog-if 02 [16550])
             Subsystem: Device 4348:3253
             Physical Slot: 5
             Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
     Stepping- SERR- FastB2B- DisINTx-
             Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
     <TAbort- <MAbort- >SERR- <PERR- INTx-
             Interrupt: pin A routed to IRQ 10
             Region 0: I/O ports at c150 [size=8]
             Region 1: I/O ports at c158 [size=8]
             Kernel driver in use: serial
     00: 48 43 53 32 01 00 00 02 10 02 00 07 00 00 00 00
     10: 51 c1 00 00 59 c1 00 00 00 00 00 00 00 00 00 00
     20: 00 00 00 00 00 00 00 00 00 00 00 00 48 43 53 32
     30: 00 00 00 00 00 00 00 00 00 00 00 00 0a 01 00 00

   In the Linux guest VM, dmesg output for the device is as follows::

     serial 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
     0000:00:05.0: ttyS1 at I/O 0xc150 (irq = 10) is a 16550A
     0000:00:05.0: ttyS2 at I/O 0xc158 (irq = 10) is a 16550A


5. In the Linux guest VM, check the serial ports::

     # setserial -g /dev/ttyS*
     /dev/ttyS0, UART: 16550A, Port: 0x03f8, IRQ: 4
     /dev/ttyS1, UART: 16550A, Port: 0xc150, IRQ: 10
     /dev/ttyS2, UART: 16550A, Port: 0xc158, IRQ: 10

6. Using minicom or any terminal emulation program, open port /dev/ttyS1 or
   /dev/ttyS2 with hardware flow control disabled.

7. Type data on the minicom terminal or send data to the terminal emulation
   program and read the data.

   Data is looped back from the host's mtty driver.

8. Destroy the mediated device that you created::

     # echo 1 > /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001/remove

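The hexadecimal dump printed by lspci in step 4 can be cross-checked against the
decoded fields above it. The following user-space sketch decodes a few fields
from the first two rows of that dump, relying on the standard PCI configuration
header layout: little-endian fields, vendor ID at offset 0x00, device ID at
0x02, and the first two BARs at 0x10 and 0x14. The ``toy_`` names are
hypothetical helpers.

```c
#include <stdint.h>

/* First 32 bytes of the configuration space dump shown in step 4. */
static const uint8_t toy_cfg[32] = {
	0x48, 0x43, 0x53, 0x32, 0x01, 0x00, 0x00, 0x02,
	0x10, 0x02, 0x00, 0x07, 0x00, 0x00, 0x00, 0x00,
	0x51, 0xc1, 0x00, 0x00, 0x59, 0xc1, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};

/* Read a little-endian 16-bit field at byte offset off. */
static uint16_t toy_cfg_read16(const uint8_t *cfg, unsigned int off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Read a little-endian 32-bit field at byte offset off. */
static uint32_t toy_cfg_read32(const uint8_t *cfg, unsigned int off)
{
	return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* For an I/O BAR (bit 0 set), the port base is the value with bits 1:0 clear. */
static uint32_t toy_io_bar_base(uint32_t bar)
{
	return bar & ~0x3u;
}
```

Decoding the dump this way gives vendor ID 0x4348 and device ID 0x3253, which
matches the "Device 4348:3253" line. It also gives I/O port bases 0xc150 and
0xc158, which match the two Region lines and the ports reported by setserial in
step 5.
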
References
==========

1. See Documentation/driver-api/vfio.rst for more information on VFIO.
2. struct mdev_driver in include/linux/mdev.h
3. struct mdev_parent_ops in include/linux/mdev.h
4. struct vfio_iommu_driver_ops in include/linux/vfio.h
