.. _vhost_user_proto:

===================
Vhost-user Protocol
===================

..
  Copyright 2014 Virtual Open Systems Sarl.
  Copyright 2019 Intel Corporation
  Licence: This work is licensed under the terms of the GNU GPL,
           version 2 or later. See the COPYING file in the top-level
           directory.

.. contents:: Table of Contents

Introduction
============

This protocol aims to complement the ``ioctl`` interface used to
control the vhost implementation in the Linux kernel. It implements
the control plane needed to establish virtqueue sharing with a user
space process on the same host. It uses communication over a Unix
domain socket to share file descriptors in the ancillary data of the
message.

The protocol defines two sides of the communication, *front-end* and
*back-end*. The *front-end* is the application that shares its virtqueues, in
our case QEMU. The *back-end* is the consumer of the virtqueues.

In the current implementation QEMU is the *front-end*, and the *back-end*
is the external process consuming the virtio queues, for example a
software Ethernet switch running in user space, such as Snabbswitch,
or a block device back-end processing reads and writes to a virtual
disk. In order to facilitate interoperability between various back-end
implementations, it is recommended to follow the :ref:`Backend program
conventions <backend_conventions>`.

The *front-end* and *back-end* can be either a client (i.e. connecting) or
server (listening) in the socket communication.

Support for platforms other than Linux
--------------------------------------

While vhost-user was initially developed targeting Linux, nowadays it
is supported on any platform that provides the following features:

- A way to request shared memory represented by a file descriptor
  so it can be passed over a UNIX domain socket and then mapped by the
  other process.

- AF_UNIX sockets with SCM_RIGHTS, so QEMU and the other process can
  exchange messages through them, including ancillary data when needed.

- Either eventfd or pipe/pipe2. On platforms where eventfd is not
  available, QEMU will automatically fall back to pipe2 or, as a last
  resort, pipe. Each file descriptor will be used for receiving or
  sending events by reading or writing (respectively) an 8-byte value
  to the corresponding file descriptor. The 8-byte value itself has no
  meaning and should not be interpreted.
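
The event mechanism described in the last item can be sketched as follows.
This is a minimal illustration, not QEMU's implementation: the helper names
``event_notify`` and ``event_consume`` are hypothetical, and the sketch
assumes a blocking descriptor that is either an eventfd or one end of a pipe.

.. code:: c

  #include <stdint.h>
  #include <unistd.h>

  /* Signal an event by writing one 8-byte value (hypothetical helper).
   * The value itself carries no meaning for the receiver. */
  static int event_notify(int fd)
  {
      uint64_t value = 1;

      return write(fd, &value, sizeof(value)) == (ssize_t)sizeof(value) ? 0 : -1;
  }

  /* Consume an event by reading one 8-byte value and discarding it. */
  static int event_consume(int fd)
  {
      uint64_t value;

      return read(fd, &value, sizeof(value)) == (ssize_t)sizeof(value) ? 0 : -1;
  }

Because both helpers only move 8 opaque bytes, the same code should work
whether the platform offered eventfd, pipe2 or pipe.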

Message Specification
=====================

.. Note:: All numbers are in the machine native byte order.

A vhost-user message consists of 3 header fields and a payload.

+---------+-------+------+---------+
| request | flags | size | payload |
+---------+-------+------+---------+

Header
------

:request: 32-bit type of the request

:flags: 32-bit bit field

- Lower 2 bits are the version (currently 0x01)
- Bit 2 is the reply flag - needs to be sent on each reply from the back-end
- Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
  details.

:size: 32-bit size of the payload
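
The layout of the ``flags`` field can be illustrated with a few helpers.
This is a sketch under assumptions: the macro and function names below are
chosen for illustration and are not mandated by the protocol, only the bit
positions are taken from the description above.

.. code:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* Illustrative names; only the bit positions come from the specification. */
  #define HDR_VERSION_MASK    0x3u       /* bits 0-1: protocol version */
  #define HDR_VERSION         0x1u       /* current version */
  #define HDR_REPLY_MASK      (1u << 2)  /* bit 2: message is a reply */
  #define HDR_NEED_REPLY_MASK (1u << 3)  /* bit 3: sender asks for a reply */

  /* Flags for a request, optionally asking for an acknowledgement. */
  static uint32_t hdr_flags_request(bool need_reply)
  {
      return HDR_VERSION | (need_reply ? HDR_NEED_REPLY_MASK : 0u);
  }

  /* Flags for a reply sent by the back-end. */
  static uint32_t hdr_flags_reply(void)
  {
      return HDR_VERSION | HDR_REPLY_MASK;
  }

  /* True if the version bits match the current protocol version. */
  static bool hdr_flags_version_ok(uint32_t flags)
  {
      return (flags & HDR_VERSION_MASK) == HDR_VERSION;
  }

  /* True if the reply flag is set. */
  static bool hdr_flags_is_reply(uint32_t flags)
  {
      return (flags & HDR_REPLY_MASK) != 0;
  }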

Payload
-------

Depending on the request type, **payload** can be:

A single 64-bit integer
^^^^^^^^^^^^^^^^^^^^^^^

+-----+
| u64 |
+-----+

:u64: a 64-bit unsigned integer

A vring state description
^^^^^^^^^^^^^^^^^^^^^^^^^

+-------+-----+
| index | num |
+-------+-----+

:index: a 32-bit index

:num: a 32-bit number

A vring descriptor index for split virtqueues
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+-------------+---------------------+
| vring index | index in avail ring |
+-------------+---------------------+

:vring index: 32-bit index of the respective virtqueue

:index in avail ring: 32-bit value, of which currently only the lower 16
  bits are used:

  - Bits 0–15: Index of the next *Available Ring* descriptor that the
    back-end will process.  This is a free-running index that is not
    wrapped by the ring size.
  - Bits 16–31: Reserved (set to zero)

Vring descriptor indices for packed virtqueues
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+-------------+--------------------+
| vring index | descriptor indices |
+-------------+--------------------+

:vring index: 32-bit index of the respective virtqueue

:descriptor indices: 32-bit value:

  - Bits 0–14: Index of the next *Available Ring* descriptor that the
    back-end will process.  This is a free-running index that is not
    wrapped by the ring size.
  - Bit 15: Driver (Available) Ring Wrap Counter
  - Bits 16–30: Index of the entry in the *Used Ring* where the back-end
    will place the next descriptor.  This is a free-running index that
    is not wrapped by the ring size.
  - Bit 31: Device (Used) Ring Wrap Counter
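
The bit layout of the packed-virtqueue ``descriptor indices`` value can be
sketched with an encode/decode pair. The structure and function names are
hypothetical and exist only for this illustration; the bit positions follow
the list above.

.. code:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* Hypothetical in-memory view of the 32-bit "descriptor indices" value. */
  struct packed_vring_base {
      uint16_t avail_idx;   /* bits 0-14 */
      bool avail_wrap;      /* bit 15: driver ring wrap counter */
      uint16_t used_idx;    /* bits 16-30 */
      bool used_wrap;       /* bit 31: device ring wrap counter */
  };

  /* Pack the four fields into the 32-bit wire value. */
  static uint32_t packed_vring_base_encode(const struct packed_vring_base *b)
  {
      return (uint32_t)(b->avail_idx & 0x7fff) |
             ((uint32_t)b->avail_wrap << 15) |
             ((uint32_t)(b->used_idx & 0x7fff) << 16) |
             ((uint32_t)b->used_wrap << 31);
  }

  /* Split the 32-bit wire value back into its four fields. */
  static struct packed_vring_base packed_vring_base_decode(uint32_t value)
  {
      struct packed_vring_base b = {
          .avail_idx = value & 0x7fff,
          .avail_wrap = (value >> 15) & 1,
          .used_idx = (value >> 16) & 0x7fff,
          .used_wrap = (value >> 31) & 1,
      };

      return b;
  }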

A vring address description
^^^^^^^^^^^^^^^^^^^^^^^^^^^

+-------+-------+------------+------+-----------+-----+
| index | flags | descriptor | used | available | log |
+-------+-------+------------+------+-----------+-----+

:index: a 32-bit vring index

:flags: 32-bit vring flags

:descriptor: a 64-bit ring address of the vring descriptor table

:used: a 64-bit ring address of the vring used ring

:available: a 64-bit ring address of the vring available ring

:log: a 64-bit guest address for logging

Note that a ring address is an IOVA if ``VIRTIO_F_IOMMU_PLATFORM`` has
been negotiated. Otherwise it is a user address.

.. _memory_region_description:

Memory region description
^^^^^^^^^^^^^^^^^^^^^^^^^

+---------------+------+--------------+-------------+
| guest address | size | user address | mmap offset |
+---------------+------+--------------+-------------+

:guest address: a 64-bit guest address of the region

:size: a 64-bit size

:user address: a 64-bit user address

:mmap offset: a 64-bit offset where region starts in the mapped memory
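
As an illustration of how these four fields fit together, the sketch below
locates a guest address inside a region and computes its offset within the
mapping. It is an assumption-laden example, not part of the protocol: the
names ``mem_region`` and ``region_offset_of`` are hypothetical, and it
assumes the back-end mapped the region's file descriptor starting at file
offset 0, so that the result can be added to the mapping's base address.

.. code:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* Hypothetical mirror of the four-field layout shown above. */
  struct mem_region {
      uint64_t guest_addr;
      uint64_t size;
      uint64_t user_addr;
      uint64_t mmap_offset;
  };

  _Static_assert(sizeof(struct mem_region) == 32, "four 64-bit fields");

  /* If gpa lies inside the region, store its offset into the mapping
   * in *offset and return true; otherwise return false. */
  static bool region_offset_of(const struct mem_region *r, uint64_t gpa,
                               uint64_t *offset)
  {
      if (gpa < r->guest_addr || gpa - r->guest_addr >= r->size) {
          return false;
      }
      *offset = r->mmap_offset + (gpa - r->guest_addr);
      return true;
  }

The range check is written as ``gpa - guest_addr >= size`` so that it
cannot overflow for regions that end near the top of the address space.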

When the ``VHOST_USER_PROTOCOL_F_XEN_MMAP`` protocol feature has been
successfully negotiated, the memory region description contains two extra
fields at the end.

+---------------+------+--------------+-------------+----------------+-------+
| guest address | size | user address | mmap offset | xen mmap flags | domid |
+---------------+------+--------------+-------------+----------------+-------+

:xen mmap flags: a 32-bit bit field

- Bit 0 is set for Xen foreign memory mapping.
- Bit 1 is set for Xen grant memory mapping.
- Bit 8 is set if the memory region can not be mapped in advance, and memory
  areas within this region must be mapped / unmapped only when required by the
  back-end. The back-end shouldn't try to map the entire region at once, as the
  front-end may not allow it. The back-end should rather map only the required
  amount of memory at once and unmap it after it is used.

:domid: a 32-bit Xen hypervisor specific domain id.

Single memory region description
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+---------+--------+
| padding | region |
+---------+--------+

:padding: 64-bit

:region: region is represented by :ref:`Memory region description <memory_region_description>`.

Multiple Memory regions description
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+-------------+---------+---------+-----+---------+
| num regions | padding | region0 | ... | region7 |
+-------------+---------+---------+-----+---------+

:num regions: a 32-bit number of regions

:padding: 32-bit

:regions: regions field contains 8 regions of type :ref:`Memory region description <memory_region_description>`.

Log description
^^^^^^^^^^^^^^^

+----------+------------+
| log size | log offset |
+----------+------------+

:log size: a 64-bit size of area used for logging

:log offset: a 64-bit offset from start of supplied file descriptor where
             logging starts (i.e. where guest address 0 would be
             logged)

An IOTLB message
^^^^^^^^^^^^^^^^

+------+------+--------------+-------------------+------+
| iova | size | user address | permissions flags | type |
+------+------+--------------+-------------------+------+

:iova: a 64-bit I/O virtual address programmed by the guest

:size: a 64-bit size

:user address: a 64-bit user address

:permissions flags: an 8-bit value:

  - 0: No access
  - 1: Read access
  - 2: Write access
  - 3: Read/Write access

:type: an 8-bit IOTLB message type:

  - 1: IOTLB miss
  - 2: IOTLB update
  - 3: IOTLB invalidate
  - 4: IOTLB access fail

Virtio device config space
^^^^^^^^^^^^^^^^^^^^^^^^^^

+--------+------+-------+---------+
| offset | size | flags | payload |
+--------+------+-------+---------+

:offset: a 32-bit offset into the virtio device's configuration space

:size: a 32-bit configuration space access size in bytes

:flags: a 32-bit value:

  - 0: Vhost front-end messages used for writable fields
  - 1: Vhost front-end messages used for live migration

:payload: an array of ``size`` bytes holding the contents of the virtio
          device's configuration space

Vring area description
^^^^^^^^^^^^^^^^^^^^^^

+-----+------+--------+
| u64 | size | offset |
+-----+------+--------+

:u64: a 64-bit integer containing the vring index and flags

:size: a 64-bit size of this area

:offset: a 64-bit offset of this area from the start of the
         supplied file descriptor

Inflight description
^^^^^^^^^^^^^^^^^^^^

+-----------+-------------+------------+------------+
| mmap size | mmap offset | num queues | queue size |
+-----------+-------------+------------+------------+

:mmap size: a 64-bit size of area to track inflight I/O

:mmap offset: a 64-bit offset of this area from the start
              of the supplied file descriptor

:num queues: a 16-bit number of virtqueues

:queue size: a 16-bit size of virtqueues

VhostUserShared
^^^^^^^^^^^^^^^

+------+
| UUID |
+------+

:UUID: 16 bytes UUID, whose first three components (a 32-bit value, then
  two 16-bit values) are stored in big endian.

Device state transfer parameters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+--------------------+-----------------+
| transfer direction | migration phase |
+--------------------+-----------------+

:transfer direction: a 32-bit enum, describing the direction in which
  the state is transferred:

  - 0: Save: Transfer the state from the back-end to the front-end,
    which happens on the source side of migration
  - 1: Load: Transfer the state from the front-end to the back-end,
    which happens on the destination side of migration

:migration phase: a 32-bit enum, describing the state in which the VM
  guest and devices are:

  - 0: Stopped (in the period after the transfer of memory-mapped
    regions before switch-over to the destination): The VM guest is
    stopped, and the vhost-user device is suspended (see
    :ref:`Suspended device state <suspended_device_state>`).

  In the future, additional phases might be added e.g. to allow
  iterative migration while the device is running.

C structure
-----------

In QEMU the vhost-user message is implemented with the following struct:

.. code:: c

  typedef struct VhostUserMsg {
      VhostUserRequest request;
      uint32_t flags;
      uint32_t size;
      union {
          uint64_t u64;
          struct vhost_vring_state state;
          struct vhost_vring_addr addr;
          VhostUserMemory memory;
          VhostUserLog log;
          struct vhost_iotlb_msg iotlb;
          VhostUserConfig config;
          VhostUserVringArea area;
          VhostUserInflight inflight;
      };
  } QEMU_PACKED VhostUserMsg;

Communication
=============

The protocol for vhost-user is based on the existing implementation of
vhost for the Linux kernel. Most messages that can be sent via the
Unix domain socket implementing vhost-user have an equivalent ioctl to
the kernel implementation.

The communication consists of the *front-end* sending message requests and
the *back-end* sending message replies. Most of the requests don't require
replies, except for the following requests:

* ``VHOST_USER_GET_FEATURES``
* ``VHOST_USER_GET_PROTOCOL_FEATURES``
* ``VHOST_USER_GET_VRING_BASE``
* ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
* ``VHOST_USER_GET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)

.. seealso::

   :ref:`REPLY_ACK <reply_ack>`
       The section on ``REPLY_ACK`` protocol extension.

There are several messages that the front-end sends with file descriptors passed
in the ancillary data:

* ``VHOST_USER_ADD_MEM_REG``
* ``VHOST_USER_SET_MEM_TABLE``
* ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
* ``VHOST_USER_SET_LOG_FD``
* ``VHOST_USER_SET_VRING_KICK``
* ``VHOST_USER_SET_VRING_CALL``
* ``VHOST_USER_SET_VRING_ERR``
* ``VHOST_USER_SET_BACKEND_REQ_FD`` (previous name ``VHOST_USER_SET_SLAVE_REQ_FD``)
* ``VHOST_USER_SET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
* ``VHOST_USER_SET_DEVICE_STATE_FD``
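
Passing a file descriptor in the ancillary data of a message uses the
standard ``SCM_RIGHTS`` mechanism of ``sendmsg()``/``recvmsg()``. The
following is a generic sketch of that mechanism, not QEMU's code: the
helper names ``send_with_fd`` and ``recv_with_fd`` are hypothetical, and
only a single descriptor per message is handled.

.. code:: c

  #include <string.h>
  #include <sys/socket.h>
  #include <sys/types.h>
  #include <sys/uio.h>

  /* Send len bytes of buf with one descriptor attached (hypothetical helper). */
  static int send_with_fd(int sock, const void *buf, size_t len, int fd)
  {
      union {
          struct cmsghdr align;   /* forces correct alignment of buf */
          char buf[CMSG_SPACE(sizeof(int))];
      } control;
      struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
      struct msghdr msg;
      struct cmsghdr *cmsg;

      memset(&control, 0, sizeof(control));
      memset(&msg, 0, sizeof(msg));
      msg.msg_iov = &iov;
      msg.msg_iovlen = 1;
      msg.msg_control = control.buf;
      msg.msg_controllen = sizeof(control.buf);

      cmsg = CMSG_FIRSTHDR(&msg);
      cmsg->cmsg_level = SOL_SOCKET;
      cmsg->cmsg_type = SCM_RIGHTS;
      cmsg->cmsg_len = CMSG_LEN(sizeof(int));
      memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

      return sendmsg(sock, &msg, 0) == (ssize_t)len ? 0 : -1;
  }

  /* Receive up to len bytes; *fd is set to a passed descriptor or -1. */
  static ssize_t recv_with_fd(int sock, void *buf, size_t len, int *fd)
  {
      union {
          struct cmsghdr align;
          char buf[CMSG_SPACE(sizeof(int))];
      } control;
      struct iovec iov = { .iov_base = buf, .iov_len = len };
      struct msghdr msg;
      struct cmsghdr *cmsg;
      ssize_t n;

      memset(&msg, 0, sizeof(msg));
      msg.msg_iov = &iov;
      msg.msg_iovlen = 1;
      msg.msg_control = control.buf;
      msg.msg_controllen = sizeof(control.buf);

      *fd = -1;
      n = recvmsg(sock, &msg, 0);
      if (n < 0) {
          return -1;
      }
      for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
          if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS) {
              memcpy(fd, CMSG_DATA(cmsg), sizeof(int));
              break;
          }
      }
      return n;
  }

The received descriptor is a new descriptor number in the receiving process
that refers to the same open file, which is what allows shared memory and
event descriptors to cross the process boundary.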

If the *front-end* is unable to send the full message or receives a wrong
reply, it will close the connection. An optional reconnection mechanism
can be implemented.

If the *back-end* detects some error such as incompatible features, it may also
close the connection. This should only happen in exceptional circumstances.

Any protocol extensions are gated by protocol feature bits, which
allows full backwards compatibility on both front-end and back-end.  As
older back-ends don't support negotiating protocol features, a feature
bit was dedicated for this purpose::

  #define VHOST_USER_F_PROTOCOL_FEATURES 30

Note that ``VHOST_USER_F_PROTOCOL_FEATURES`` is the UNUSED (30) feature
bit defined in `VIRTIO 1.1 6.3 Legacy Interface: Reserved Feature Bits
<https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-4130003>`_.
VIRTIO devices do not advertise this feature bit and therefore VIRTIO
drivers cannot negotiate it.

This reserved feature bit was reused by the vhost-user protocol to add
vhost-user protocol feature negotiation in a backwards compatible
fashion. Old vhost-user front-end and back-end implementations continue to
work even though they are not aware of vhost-user protocol feature
negotiation.
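
A front-end would typically test this bit in the 64-bit feature mask
returned by ``VHOST_USER_GET_FEATURES`` before attempting any protocol
feature negotiation. The check can be sketched as below; the function name
``protocol_features_offered`` is hypothetical.

.. code:: c

  #include <stdbool.h>
  #include <stdint.h>

  #define VHOST_USER_F_PROTOCOL_FEATURES 30

  /* True if the feature mask advertises protocol feature negotiation. */
  static bool protocol_features_offered(uint64_t features)
  {
      return (features >> VHOST_USER_F_PROTOCOL_FEATURES) & 1;
  }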

Ring states
-----------

Rings have two independent states: started/stopped, and enabled/disabled.

* While a ring is stopped, the back-end must not process the ring at
  all, regardless of whether it is enabled or disabled.  The
  enabled/disabled state should still be tracked, though, so it can come
  into effect once the ring is started.

* started and disabled: The back-end must process the ring without
  causing any side effects.  For example, for a networking device,
  in the disabled state the back-end must not supply any new RX packets,
  but must process and discard any TX packets.

* started and enabled: The back-end must process the ring normally, i.e.
  process all requests and execute them.

Each ring is initialized in a stopped and disabled state.  The back-end
must start a ring upon receiving a kick (that is, detecting that the file
descriptor is readable) on the descriptor specified by
``VHOST_USER_SET_VRING_KICK`` or receiving the in-band message
``VHOST_USER_VRING_KICK`` if negotiated, and stop a ring upon receiving
``VHOST_USER_GET_VRING_BASE``.

Rings can be enabled or disabled by ``VHOST_USER_SET_VRING_ENABLE``.

In addition, upon receiving a ``VHOST_USER_SET_FEATURES`` message from
the front-end without ``VHOST_USER_F_PROTOCOL_FEATURES`` set, the
back-end must enable all rings immediately.
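
The transitions above can be summarized as a small state machine. This is
a sketch for illustration only: all type and function names are
hypothetical, and a real back-end would attach these transitions to its
message handlers and event loop.

.. code:: c

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  #define VHOST_USER_F_PROTOCOL_FEATURES 30

  /* Two independent flags per ring, as described above. */
  struct ring_state {
      bool started;
      bool enabled;
  };

  enum ring_action {
      RING_IGNORE,                  /* stopped: do not touch the ring */
      RING_PROCESS_NO_SIDE_EFFECTS, /* started and disabled */
      RING_PROCESS,                 /* started and enabled */
  };

  /* Rings start out stopped and disabled. */
  static void ring_init(struct ring_state *r)
  {
      r->started = false;
      r->enabled = false;
  }

  /* A kick (out-of-band or in-band) starts the ring. */
  static void ring_on_kick(struct ring_state *r)
  {
      r->started = true;
  }

  /* VHOST_USER_GET_VRING_BASE stops the ring; "enabled" is preserved. */
  static void ring_on_get_vring_base(struct ring_state *r)
  {
      r->started = false;
  }

  /* VHOST_USER_SET_VRING_ENABLE toggles the enabled flag. */
  static void ring_on_set_vring_enable(struct ring_state *r, bool enable)
  {
      r->enabled = enable;
  }

  /* VHOST_USER_SET_FEATURES without the protocol features bit
   * enables every ring. */
  static void rings_on_set_features(struct ring_state *rings, size_t n,
                                    uint64_t features)
  {
      if (!(features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES))) {
          for (size_t i = 0; i < n; i++) {
              rings[i].enabled = true;
          }
      }
  }

  /* What the back-end is allowed to do with the ring right now. */
  static enum ring_action ring_current_action(const struct ring_state *r)
  {
      if (!r->started) {
          return RING_IGNORE;
      }
      return r->enabled ? RING_PROCESS : RING_PROCESS_NO_SIDE_EFFECTS;
  }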

While processing the rings (whether they are enabled or not), the back-end
must support changing some configuration aspects on the fly.

.. _suspended_device_state:

Suspended device state
^^^^^^^^^^^^^^^^^^^^^^

While all vrings are stopped, the device is *suspended*.  In addition to
not processing any vring (because they are stopped), the device must:

* not write to any guest memory regions,
* not send any notifications to the guest,
* not send any messages to the front-end,
* still process and reply to messages from the front-end.

Multiple queue support
----------------------

Many devices have a fixed number of virtqueues.  In this case the front-end
already knows the number of available virtqueues without communicating with the
back-end.

Some devices do not have a fixed number of virtqueues.  Instead the maximum
number of virtqueues is chosen by the back-end.  The number can depend on host
resource availability or back-end implementation details.  Such devices are called
multiple queue devices.

Multiple queue support allows the back-end to advertise the maximum number of
queues.  This is treated as a protocol extension, hence the back-end has to
implement protocol features first. The multiple queues feature is supported
only when the protocol feature ``VHOST_USER_PROTOCOL_F_MQ`` (bit 0) is set.

The maximum number of queues the back-end supports can be queried with message
``VHOST_USER_GET_QUEUE_NUM``. The front-end should stop when the number of
requested queues is bigger than that.
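
The front-end side of this check can be sketched as follows. The function
names are hypothetical, and the sketch assumes the caller already holds the
negotiated protocol feature mask and the value returned by
``VHOST_USER_GET_QUEUE_NUM``.

.. code:: c

  #include <stdbool.h>
  #include <stdint.h>

  #define VHOST_USER_PROTOCOL_F_MQ 0

  /* True if the multiple queues protocol feature (bit 0) is set. */
  static bool mq_negotiated(uint64_t protocol_features)
  {
      return (protocol_features >> VHOST_USER_PROTOCOL_F_MQ) & 1;
  }

  /* True if the front-end may proceed with the requested queue count. */
  static bool mq_request_acceptable(uint64_t backend_max_queues,
                                    uint64_t requested_queues)
  {
      return requested_queues <= backend_max_queues;
  }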

As all queues share one connection, the front-end uses a unique index for each
queue in the sent message to identify a specified queue.

The front-end enables queues by sending message ``VHOST_USER_SET_VRING_ENABLE``.
vhost-user-net has historically automatically enabled the first queue pair.

Back-ends should always implement the ``VHOST_USER_PROTOCOL_F_MQ`` protocol
feature, even for devices with a fixed number of virtqueues, since it is simple
to implement and offers a degree of introspection.

Front-ends must not rely on the ``VHOST_USER_PROTOCOL_F_MQ`` protocol feature for
devices with a fixed number of virtqueues.  Only true multiqueue devices
require this protocol feature.

522ed1be66bSMarc-André LureauMigration
523ed1be66bSMarc-André Lureau---------
524ed1be66bSMarc-André Lureau
525bd59f2a1SPaolo BonziniDuring live migration, the front-end may need to track the modifications
526bd59f2a1SPaolo Bonzinithe back-end makes to the memory mapped regions. The front-end should mark
527ed1be66bSMarc-André Lureauthe dirty pages in a log. Once it complies to this logging, it may
528ed1be66bSMarc-André Lureaudeclare the ``VHOST_F_LOG_ALL`` vhost feature.

To start/stop logging of data/used ring writes, the front-end may send
messages ``VHOST_USER_SET_FEATURES`` with ``VHOST_F_LOG_ALL`` and
``VHOST_USER_SET_VRING_ADDR`` with ``VHOST_VRING_F_LOG`` in the ring's
flags set to 1/0, respectively.

All the modifications to memory pointed to by vring "descriptor" should
be marked. Modifications to "used" vring should be marked if
``VHOST_VRING_F_LOG`` is part of the ring's flags.

Dirty pages are of size::

  #define VHOST_LOG_PAGE 0x1000

The log memory fd is provided in the ancillary data of the
``VHOST_USER_SET_LOG_BASE`` message when the back-end has the
``VHOST_USER_PROTOCOL_F_LOG_SHMFD`` protocol feature.

The size of the log is supplied as part of ``VhostUserMsg`` and
should be large enough to cover all known guest addresses. The log starts
at the supplied offset in the supplied file descriptor.  The log
covers from address 0 to the maximum of guest regions. In pseudo-code,
to mark page at ``addr`` as dirty::

  page = addr / VHOST_LOG_PAGE
  log[page / 8] |= 1 << (page % 8)

Where ``addr`` is the guest physical address.

Use atomic operations, as the log may be concurrently manipulated.
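
The pseudo-code above can be sketched in C11 as follows. This is only an
illustration, not the code of any particular back-end: the function name
``log_mark_range`` and the ``log_size`` bound check are assumptions made
for this example, and the log is assumed to be already mapped as an array
of bytes.

.. code:: c

  #include <stdatomic.h>
  #include <stdint.h>

  #define VHOST_LOG_PAGE 0x1000

  /*
   * Hypothetical helper: mark every page touched by a write of `len`
   * bytes at guest physical address `addr`.  `log_size` is the size of
   * the log in bytes; pages beyond it are ignored.
   */
  static void log_mark_range(_Atomic uint8_t *dirty_log, uint64_t log_size,
                             uint64_t addr, uint64_t len)
  {
      if (len == 0) {
          return;
      }

      uint64_t first = addr / VHOST_LOG_PAGE;
      uint64_t last = (addr + len - 1) / VHOST_LOG_PAGE;

      for (uint64_t page = first; page <= last; page++) {
          if (page / 8 >= log_size) {
              break;  /* outside the supplied log */
          }
          /* Atomic OR, since the log may be updated concurrently. */
          atomic_fetch_or(&dirty_log[page / 8],
                          (uint8_t)(1u << (page % 8)));
      }
  }

A write that straddles a page boundary sets one bit per page touched, so
a two-byte write at ``0x7fff`` marks both page 7 and page 8.
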

Note that when logging modifications to the used ring (when
``VHOST_VRING_F_LOG`` is set for this ring), ``log_guest_addr`` should
be used to calculate the log offset: a write to the first byte of the
used ring is logged at this offset from log start. Also note that this
value might be outside the legal guest physical address range
(i.e. does not have to be covered by the ``VhostUserMemory`` table), but
the bit offset of the last byte of the ring must fall within the size
supplied by ``VhostUserLog``.

``VHOST_USER_SET_LOG_FD`` is an optional message with an eventfd in
ancillary data; it may be used to inform the front-end that the log has
been modified.

Once the source has finished migration, rings will be stopped by the
source (:ref:`Suspended device state <suspended_device_state>`). No
further updates may be made until the rings are restarted.

In postcopy migration the back-end is started before all the memory has
been received from the source host, and care must be taken to avoid
accessing pages that have yet to be received.  The back-end opens a
'userfault'-fd and registers the memory with it; this fd is then
passed back over to the front-end.  The front-end services requests on the
userfaultfd for pages that are accessed and when the page is available
it performs WAKE ioctls on the userfaultfd to wake the stalled
back-end.  The front-end indicates support for this via the
``VHOST_USER_PROTOCOL_F_PAGEFAULT`` feature.

.. _migrating_backend_state:

Migrating back-end state
^^^^^^^^^^^^^^^^^^^^^^^^

Migrating device state involves transferring the state from one
back-end, called the source, to another back-end, called the
destination.  After migration, the destination transparently resumes
operation without requiring the driver to re-initialize the device at
the VIRTIO level.  If the migration fails, then the source can
transparently resume operation until another migration attempt is made.

Generally, the front-end is connected to a virtual machine guest (which
contains the driver), which has its own state to transfer between source
and destination, and therefore will have an implementation-specific
mechanism to do so.  The ``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature
provides functionality to have the front-end include the back-end's
state in this transfer operation so the back-end does not need to
implement its own mechanism, and so the virtual machine may have its
complete state, including vhost-user devices' states, contained within a
single stream of data.

To do this, the back-end state is transferred from back-end to front-end
on the source side, and vice versa on the destination side.  This
transfer happens over a channel that is negotiated using the
``VHOST_USER_SET_DEVICE_STATE_FD`` message.  This message has two
parameters:

* Direction of transfer: On the source, the data is saved, transferring
  it from the back-end to the front-end.  On the destination, the data
  is loaded, transferring it from the front-end to the back-end.

* Migration phase: Currently, the only supported phase is the period
  after the transfer of memory-mapped regions before switch-over to the
  destination, when both the source and destination devices are
  suspended (:ref:`Suspended device state <suspended_device_state>`).
  In the future, additional phases might be supported to allow iterative
  migration while the device is running.

The nature of the channel is implementation-defined, but it must
generally behave like a pipe: The writing end will write all the data it
has into it, signalling the end of data by closing its end.  The reading
end must read all of this data (until encountering the end of file) and
process it.

* When saving, the writing end is the source back-end, and the reading
  end is the source front-end.  After reading the state data from the
  channel, the source front-end must transfer it to the destination
  front-end through an implementation-defined mechanism.

* When loading, the writing end is the destination front-end, and the
  reading end is the destination back-end.  After reading the state data
  from the channel, the destination back-end must deserialize its
  internal state from that data and set itself up to allow the driver to
  seamlessly resume operation on the VIRTIO level.
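
As an illustration of the pipe-like behaviour described above, the
reading end could drain the channel as sketched below. This is a minimal
example under stated assumptions: the channel is an ordinary blocking
file descriptor, and the helper name ``read_state_until_eof`` is
hypothetical rather than part of the protocol.

.. code:: c

  #include <errno.h>
  #include <stdint.h>
  #include <stdlib.h>
  #include <unistd.h>

  /*
   * Hypothetical helper: read from `fd` until end of file.  On success,
   * returns 0 and stores a malloc()ed buffer in *out and its length in
   * *out_len.  Returns -1 on read or allocation failure.
   */
  static int read_state_until_eof(int fd, uint8_t **out, size_t *out_len)
  {
      size_t cap = 4096, len = 0;
      uint8_t *buf = malloc(cap);

      if (!buf) {
          return -1;
      }

      for (;;) {
          if (len == cap) {
              /* Buffer full: double it before the next read. */
              uint8_t *tmp = realloc(buf, cap * 2);
              if (!tmp) {
                  free(buf);
                  return -1;
              }
              buf = tmp;
              cap *= 2;
          }

          ssize_t n = read(fd, buf + len, cap - len);
          if (n > 0) {
              len += (size_t)n;
          } else if (n == 0) {
              break;  /* the writing end closed the channel */
          } else if (errno != EINTR) {
              free(buf);
              return -1;
          }
      }

      *out = buf;
      *out_len = len;
      return 0;
  }

End of file is the only end-of-data signal in this sketch, so a
successful return does not by itself prove that the writer sent
everything it intended to send.
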

Seamlessly resuming operation means that the migration must be
transparent to the guest driver, which operates on the VIRTIO level.
This driver will not perform any re-initialization steps, but continue
to use the device as if no migration had occurred.  The vhost-user
front-end, however, will re-initialize the vhost state on the
destination, following the usual protocol for establishing a connection
to a vhost-user back-end: This includes, for example, setting up memory
mappings and kick and call FDs as necessary, negotiating protocol
features, or setting the initial vring base indices (to the same value
as on the source side, so that operation can resume).

Both on the source and on the destination side, after the respective
front-end has seen all data transferred (when the transfer FD has been
closed), it sends the ``VHOST_USER_CHECK_DEVICE_STATE`` message to
verify that data transfer was successful in the back-end, too.  The
back-end responds once it knows whether the transfer and processing was
successful or not.

Memory access
-------------

The front-end sends a list of vhost memory regions to the back-end using the
``VHOST_USER_SET_MEM_TABLE`` message.  Each region has two base
addresses: a guest address and a user address.

Messages contain guest addresses and/or user addresses to reference locations
within the shared memory.  The mapping of these addresses works as follows.

User addresses map to the vhost memory region containing that user address.

When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has not been negotiated:

* Guest addresses map to the vhost memory region containing that guest
  address.

When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has been negotiated:

* Guest addresses are also called I/O virtual addresses (IOVAs).  They are
  translated to user addresses via the IOTLB.

* The vhost memory region guest address is not used.
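
The region lookup for the case without ``VIRTIO_F_IOMMU_PLATFORM`` can
be sketched as below. The ``MemRegion`` structure, its field names and
the ``translate`` helper are assumptions made for this example; they
describe how a back-end might keep track of the regions it has mapped,
not a layout defined by the protocol.

.. code:: c

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical back-end bookkeeping for one vhost memory region. */
  typedef struct MemRegion {
      uint64_t guest_addr;  /* guest address base of the region */
      uint64_t user_addr;   /* front-end user address base of the region */
      uint64_t size;        /* region size in bytes */
      uint8_t *mmap_base;   /* where the back-end mapped the region */
  } MemRegion;

  /*
   * Translate a guest address (is_user_addr == false) or a user address
   * (is_user_addr == true) into a pointer in the back-end's own mapping.
   * Returns NULL if no region contains the address.
   */
  static void *translate(const MemRegion *regions, size_t n,
                         uint64_t addr, bool is_user_addr)
  {
      for (size_t i = 0; i < n; i++) {
          uint64_t base = is_user_addr ? regions[i].user_addr
                                       : regions[i].guest_addr;

          if (addr >= base && addr - base < regions[i].size) {
              return regions[i].mmap_base + (addr - base);
          }
      }
      return NULL;
  }

When ``VIRTIO_F_IOMMU_PLATFORM`` has been negotiated, only the user
address branch of such a lookup would apply, after the IOVA has been
translated through the IOTLB.
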

IOMMU support
-------------

When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has been negotiated, the
front-end sends IOTLB entry updates and invalidations by sending
``VHOST_USER_IOTLB_MSG`` requests to the back-end with a ``struct
vhost_iotlb_msg`` as payload. For update events, the ``iotlb`` payload
has to be filled with the update message type (2), the I/O virtual
address, the size, the user virtual address, and the permissions
flags. Addresses and size must be within vhost memory regions set via
the ``VHOST_USER_SET_MEM_TABLE`` request. For invalidation events, the
``iotlb`` payload has to be filled with the invalidation message type
(3), the I/O virtual address and the size. On success, the back-end is
expected to reply with a zero payload, non-zero otherwise.

The back-end relies on the back-end communication channel (see :ref:`Back-end
communication <backend_communication>` section below) to send IOTLB miss
and access failure events, by sending ``VHOST_USER_BACKEND_IOTLB_MSG``
requests to the front-end with a ``struct vhost_iotlb_msg`` as
payload. For miss events, the ``iotlb`` payload has to be filled with the
miss message type (1), the I/O virtual address and the permissions
flags. For access failure events, the ``iotlb`` payload has to be filled
with the access failure message type (4), the I/O virtual address and
the permissions flags.  For synchronization purposes, the back-end may
rely on the reply-ack feature, so the front-end may send a reply when
the operation is completed if the reply-ack feature is negotiated and
the back-end requests a reply. For miss events, a completed operation
means that either the front-end sent an update message containing the
IOTLB entry that covers the requested address and permission, or the
front-end sent nothing because the IOTLB miss message is invalid (invalid
IOVA or permission).

The front-end isn't expected to take the initiative to send IOTLB update
messages, as the back-end sends IOTLB miss messages for the guest virtual
memory areas it needs to access.

.. _backend_communication:

Back-end communication
----------------------

An optional communication channel is provided if the back-end declares
the ``VHOST_USER_PROTOCOL_F_BACKEND_REQ`` protocol feature, to allow the
back-end to make requests to the front-end.

The fd is provided via ``VHOST_USER_SET_BACKEND_REQ_FD`` ancillary data.

A back-end may then send ``VHOST_USER_BACKEND_*`` messages to the front-end
using this fd communication channel.

If the ``VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD`` protocol feature is
negotiated, the back-end can send file descriptors (at most 8 descriptors in
each message) to the front-end via ancillary data using this fd communication
channel.

Inflight I/O tracking
---------------------

To support reconnecting after restart or crash, the back-end may need to
resubmit inflight I/Os. If the virtqueue is processed in order, this is
easily achieved by getting the inflight descriptors from the
descriptor table (split virtqueue) or descriptor ring (packed
virtqueue). However, this does not work when descriptors are processed
out of order, because some entries that store the information of
inflight descriptors in the available ring (split virtqueue) or descriptor
ring (packed virtqueue) might be overwritten by new entries. To solve
this problem, the back-end needs to allocate an extra buffer to store this
information about inflight descriptors and share it with the front-end for
persistence. ``VHOST_USER_GET_INFLIGHT_FD`` and
``VHOST_USER_SET_INFLIGHT_FD`` are used to transfer this buffer
between front-end and back-end. The format of this buffer is described
below:

+---------------+---------------+-----+---------------+
| queue0 region | queue1 region | ... | queueN region |
+---------------+---------------+-----+---------------+

N is the number of available virtqueues. The back-end can get it from the
num queues field of ``VhostUserInflight``.

For split virtqueue, queue region can be implemented as:

.. code:: c

  typedef struct DescStateSplit {
      /* Indicate whether this descriptor is inflight or not.
       * Only available for head-descriptor. */
      uint8_t inflight;

      /* Padding */
      uint8_t padding[5];

      /* Maintain a list for the last batch of used descriptors.
       * Only available when batching is used for submitting */
      uint16_t next;

      /* Used to preserve the order of fetching available descriptors.
       * Only available for head-descriptor. */
      uint64_t counter;
  } DescStateSplit;

  typedef struct QueueRegionSplit {
      /* The feature flags of this region. Now it's initialized to 0. */
      uint64_t features;

      /* The version of this region. It's 1 currently.
       * Zero value indicates an uninitialized buffer */
      uint16_t version;

      /* The size of DescStateSplit array. It's equal to the virtqueue size.
       * The back-end could get it from queue size field of VhostUserInflight. */
      uint16_t desc_num;

      /* The head of the list that tracks the last batch of used descriptors. */
      uint16_t last_batch_head;

      /* Store the idx value of used ring */
      uint16_t used_idx;

      /* Used to track the state of each descriptor in descriptor table */
      DescStateSplit desc[];
  } QueueRegionSplit;

To track inflight I/O, the queue region should be processed as follows:

When receiving available buffers from the driver:

#. Get the next available head-descriptor index from available ring, ``i``

#. Set ``desc[i].counter`` to the value of global counter

#. Increase global counter by 1

#. Set ``desc[i].inflight`` to 1

When supplying used buffers to the driver:

1. Get corresponding used head-descriptor index, ``i``

2. Set ``desc[i].next`` to ``last_batch_head``

3. Set ``last_batch_head`` to ``i``

#. Steps 1,2,3 may be performed repeatedly if batching is possible

#. Increase the ``idx`` value of used ring by the size of the batch

#. Set the ``inflight`` field of each ``DescStateSplit`` entry in the batch to 0

#. Set ``used_idx`` to the ``idx`` value of used ring

When reconnecting:

#. If the value of ``used_idx`` does not match the ``idx`` value of
   used ring (means the inflight field of ``DescStateSplit`` entries in
   last batch may be incorrect),

   a. Subtract the value of ``used_idx`` from the ``idx`` value of
      used ring to get last batch size of ``DescStateSplit`` entries

   #. Set the ``inflight`` field of each ``DescStateSplit`` entry to 0 in last batch
      list which starts from ``last_batch_head``

   #. Set ``used_idx`` to the ``idx`` value of used ring

#. Resubmit inflight ``DescStateSplit`` entries in order of their
   counter value
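
The split virtqueue steps above can be sketched as follows. This is an
illustration under several assumptions, not a reference implementation:
the helper names and the ``Resubmit`` type are hypothetical, the two
structures are repeated only to keep the example self-contained, the
used ring's ``idx`` is passed in as ``ring_used_idx``, and the memory
barriers a real back-end needs between steps are omitted.

.. code:: c

  #include <stdint.h>
  #include <stdlib.h>

  typedef struct DescStateSplit {
      uint8_t inflight;
      uint8_t padding[5];
      uint16_t next;
      uint64_t counter;
  } DescStateSplit;

  typedef struct QueueRegionSplit {
      uint64_t features;
      uint16_t version;
      uint16_t desc_num;
      uint16_t last_batch_head;
      uint16_t used_idx;
      DescStateSplit desc[];
  } QueueRegionSplit;

  /* Hypothetical record used to order entries for resubmission. */
  typedef struct Resubmit {
      uint64_t counter;
      uint16_t index;
  } Resubmit;

  /* Receiving an available buffer with head-descriptor index i. */
  static void inflight_on_avail(QueueRegionSplit *q, uint16_t i,
                                uint64_t *global_counter)
  {
      q->desc[i].counter = (*global_counter)++;
      q->desc[i].inflight = 1;
  }

  /* Steps 1-3 of supplying a used buffer: link i into the last batch. */
  static void inflight_pre_used(QueueRegionSplit *q, uint16_t i)
  {
      q->desc[i].next = q->last_batch_head;
      q->last_batch_head = i;
  }

  /* Clear inflight on the first n entries of the last-batch list. */
  static void clear_last_batch(QueueRegionSplit *q, uint16_t n)
  {
      uint16_t i = q->last_batch_head;

      while (n-- && i < q->desc_num) {
          q->desc[i].inflight = 0;
          i = q->desc[i].next;
      }
  }

  /* Remaining steps, run after the used ring idx has been increased. */
  static void inflight_post_used(QueueRegionSplit *q, uint16_t batch,
                                 uint16_t ring_used_idx)
  {
      clear_last_batch(q, batch);
      q->used_idx = ring_used_idx;
  }

  /* Reconnect, step 1: repair a batch interrupted by a crash. */
  static void inflight_recover(QueueRegionSplit *q, uint16_t ring_used_idx)
  {
      if (q->used_idx != ring_used_idx) {
          clear_last_batch(q, (uint16_t)(ring_used_idx - q->used_idx));
          q->used_idx = ring_used_idx;
      }
  }

  static int cmp_resubmit(const void *a, const void *b)
  {
      const Resubmit *x = a, *y = b;
      return (x->counter > y->counter) - (x->counter < y->counter);
  }

  /* Reconnect, step 2: collect inflight entries ordered by counter.
   * `out` must have room for q->desc_num entries. */
  static size_t inflight_collect(const QueueRegionSplit *q, Resubmit *out)
  {
      size_t n = 0;

      for (uint16_t i = 0; i < q->desc_num; i++) {
          if (q->desc[i].inflight) {
              out[n++] = (Resubmit){ q->desc[i].counter, i };
          }
      }
      qsort(out, n, sizeof(*out), cmp_resubmit);
      return n;
  }

The sketch does not handle wrap-around of the global counter; a real
implementation would have to decide how to treat that case.
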

For packed virtqueue, queue region can be implemented as:

.. code:: c

  typedef struct DescStatePacked {
      /* Indicate whether this descriptor is inflight or not.
       * Only available for head-descriptor. */
      uint8_t inflight;

      /* Padding */
      uint8_t padding;

      /* Link to the next free entry */
      uint16_t next;

      /* Link to the last entry of descriptor list.
       * Only available for head-descriptor. */
      uint16_t last;

      /* The length of descriptor list.
       * Only available for head-descriptor. */
      uint16_t num;

      /* Used to preserve the order of fetching available descriptors.
       * Only available for head-descriptor. */
      uint64_t counter;

      /* The buffer id */
      uint16_t id;

      /* The descriptor flags */
      uint16_t flags;

      /* The buffer length */
      uint32_t len;

      /* The buffer address */
      uint64_t addr;
  } DescStatePacked;

  typedef struct QueueRegionPacked {
      /* The feature flags of this region. Now it's initialized to 0. */
      uint64_t features;

      /* The version of this region. It's 1 currently.
       * Zero value indicates an uninitialized buffer */
      uint16_t version;

      /* The size of DescStatePacked array. It's equal to the virtqueue size.
       * The back-end could get it from queue size field of VhostUserInflight. */
      uint16_t desc_num;

      /* The head of free DescStatePacked entry list */
      uint16_t free_head;

      /* The old head of free DescStatePacked entry list */
      uint16_t old_free_head;

      /* The used index of descriptor ring */
      uint16_t used_idx;

      /* The old used index of descriptor ring */
      uint16_t old_used_idx;

      /* Device ring wrap counter */
      uint8_t used_wrap_counter;

      /* The old device ring wrap counter */
      uint8_t old_used_wrap_counter;

      /* Padding */
      uint8_t padding[7];

      /* Used to track the state of each descriptor fetched from descriptor ring */
      DescStatePacked desc[];
  } QueueRegionPacked;

To track inflight I/O, the queue region should be processed as follows:

When receiving available buffers from the driver:

#. Get the next available descriptor entry from descriptor ring, ``d``

#. If ``d`` is head descriptor,

   a. Set ``desc[old_free_head].num`` to 0

   #. Set ``desc[old_free_head].counter`` to the value of global counter

   #. Increase global counter by 1

   #. Set ``desc[old_free_head].inflight`` to 1

#. If ``d`` is last descriptor, set ``desc[old_free_head].last`` to
   ``free_head``

#. Increase ``desc[old_free_head].num`` by 1

#. Set ``desc[free_head].addr``, ``desc[free_head].len``,
   ``desc[free_head].flags``, ``desc[free_head].id`` to ``d.addr``,
   ``d.len``, ``d.flags``, ``d.id``

#. Set ``free_head`` to ``desc[free_head].next``

#. If ``d`` is last descriptor, set ``old_free_head`` to ``free_head``

When supplying used buffers to the driver:

1. Get corresponding used head-descriptor entry from descriptor ring,
   ``d``

2. Get corresponding ``DescStatePacked`` entry, ``e``

3. Set ``desc[e.last].next`` to ``free_head``

4. Set ``free_head`` to the index of ``e``

#. Steps 1,2,3,4 may be performed repeatedly if batching is possible

#. Increase ``used_idx`` by the size of the batch and update
   ``used_wrap_counter`` if needed

#. Update ``d.flags``

#. Set the ``inflight`` field of each head ``DescStatePacked`` entry
   in the batch to 0

#. Set ``old_free_head``,  ``old_used_idx``, ``old_used_wrap_counter``
   to ``free_head``, ``used_idx``, ``used_wrap_counter``

When reconnecting:

#. If ``used_idx`` does not match ``old_used_idx`` (means the
   ``inflight`` field of ``DescStatePacked`` entries in last batch may
   be incorrect),

   a. Get the next descriptor ring entry through ``old_used_idx``, ``d``

   #. Use ``old_used_wrap_counter`` to calculate the available flags

   #. If ``d.flags`` is not equal to the calculated flags value (means
      back-end has submitted the buffer to guest driver before crash, so
      it has to commit the in-progress update), set ``old_free_head``,
      ``old_used_idx``, ``old_used_wrap_counter`` to ``free_head``,
      ``used_idx``, ``used_wrap_counter``

#. Set ``free_head``, ``used_idx``, ``used_wrap_counter`` to
   ``old_free_head``, ``old_used_idx``, ``old_used_wrap_counter``
   (roll back any in-progress update)

#. Set the ``inflight`` field of each ``DescStatePacked`` entry in
   free list to 0

#. Resubmit inflight ``DescStatePacked`` entries in order of their
   counter value

In-band notifications
---------------------

In some limited situations (e.g. for simulation) it is desirable to
have the kick, call and error (if used) signals done via in-band
messages instead of asynchronous eventfd notifications. This can be
done by negotiating the ``VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS``
protocol feature.

Note that because too many messages on the sockets can cause the
sending application(s) to block, it is not advised to use this feature
unless absolutely necessary. It is also considered an error to
negotiate this feature without also negotiating
``VHOST_USER_PROTOCOL_F_BACKEND_REQ`` and ``VHOST_USER_PROTOCOL_F_REPLY_ACK``.
The former is necessary for getting a message channel from the back-end
to the front-end, while the latter needs to be used with the in-band
notification messages to block until they are processed, both to avoid
blocking later and for proper processing (at least in the simulation
use case). As it has no other way of signalling this error, the back-end
should close the connection as a response to a
``VHOST_USER_SET_PROTOCOL_FEATURES`` message that sets the in-band
notifications feature flag without the other two.
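
A back-end could check this dependency while handling
``VHOST_USER_SET_PROTOCOL_FEATURES``, roughly as in the sketch below.
The bit numbers are the ones listed under "Protocol features"; the
function name is made up for illustration and does not come from any
existing implementation.

.. code:: c

  #include <stdbool.h>
  #include <stdint.h>

  #define VHOST_USER_PROTOCOL_F_REPLY_ACK             3
  #define VHOST_USER_PROTOCOL_F_BACKEND_REQ           5
  #define VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS 14

  /* Returns false if the back-end should close the connection. */
  static bool inband_features_valid(uint64_t features)
  {
      const uint64_t inband =
          1ULL << VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS;
      const uint64_t needed =
          (1ULL << VHOST_USER_PROTOCOL_F_BACKEND_REQ) |
          (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK);

      if (!(features & inband)) {
          return true;  /* feature not requested, nothing to check */
      }
      return (features & needed) == needed;
  }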

Protocol features
-----------------

.. code:: c

  #define VHOST_USER_PROTOCOL_F_MQ                    0
  #define VHOST_USER_PROTOCOL_F_LOG_SHMFD             1
  #define VHOST_USER_PROTOCOL_F_RARP                  2
  #define VHOST_USER_PROTOCOL_F_REPLY_ACK             3
  #define VHOST_USER_PROTOCOL_F_MTU                   4
  #define VHOST_USER_PROTOCOL_F_BACKEND_REQ           5
  #define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN          6
  #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION        7
  #define VHOST_USER_PROTOCOL_F_PAGEFAULT             8
  #define VHOST_USER_PROTOCOL_F_CONFIG                9
  #define VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD      10
  #define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER        11
  #define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD       12
  #define VHOST_USER_PROTOCOL_F_RESET_DEVICE         13
  #define VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS 14
  #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
  #define VHOST_USER_PROTOCOL_F_STATUS               16
  #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
  #define VHOST_USER_PROTOCOL_F_SHARED_OBJECT        18
  #define VHOST_USER_PROTOCOL_F_DEVICE_STATE         19

Front-end message types
-----------------------

``VHOST_USER_GET_FEATURES``
  :id: 1
  :equivalent ioctl: ``VHOST_GET_FEATURES``
  :request payload: N/A
  :reply payload: ``u64``

  Get the features bitmask from the underlying vhost implementation.
  Feature bit ``VHOST_USER_F_PROTOCOL_FEATURES`` signals back-end support
  for ``VHOST_USER_GET_PROTOCOL_FEATURES`` and
  ``VHOST_USER_SET_PROTOCOL_FEATURES``.

``VHOST_USER_SET_FEATURES``
  :id: 2
  :equivalent ioctl: ``VHOST_SET_FEATURES``
  :request payload: ``u64``
  :reply payload: N/A

  Enable features in the underlying vhost implementation using a
  bitmask.  Feature bit ``VHOST_USER_F_PROTOCOL_FEATURES`` signals
  back-end support for ``VHOST_USER_GET_PROTOCOL_FEATURES`` and
  ``VHOST_USER_SET_PROTOCOL_FEATURES``.

``VHOST_USER_GET_PROTOCOL_FEATURES``
  :id: 15
  :equivalent ioctl: ``VHOST_GET_FEATURES``
  :request payload: N/A
  :reply payload: ``u64``

  Get the protocol feature bitmask from the underlying vhost
  implementation.  Only legal if feature bit
  ``VHOST_USER_F_PROTOCOL_FEATURES`` is present in
  ``VHOST_USER_GET_FEATURES``.  It does not need to be acknowledged by
  ``VHOST_USER_SET_FEATURES``.

.. Note::
   Back-ends that report ``VHOST_USER_F_PROTOCOL_FEATURES`` must
   support this message even before ``VHOST_USER_SET_FEATURES`` was
   called.

``VHOST_USER_SET_PROTOCOL_FEATURES``
  :id: 16
  :equivalent ioctl: ``VHOST_SET_FEATURES``
  :request payload: ``u64``
  :reply payload: N/A

  Enable protocol features in the underlying vhost implementation.

  Only legal if feature bit ``VHOST_USER_F_PROTOCOL_FEATURES`` is present in
  ``VHOST_USER_GET_FEATURES``.  It does not need to be acknowledged by
  ``VHOST_USER_SET_FEATURES``.

.. Note::
   Back-ends that report ``VHOST_USER_F_PROTOCOL_FEATURES`` must support
   this message even before ``VHOST_USER_SET_FEATURES`` was called.
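
Seen from the front-end, the rule for the two protocol feature messages
can be sketched as below. Both helper names are hypothetical. The value
30 for ``VHOST_USER_F_PROTOCOL_FEATURES`` is assumed here from QEMU's
libvhost-user header, and enabling the intersection of what the back-end
offers and what the front-end supports is a common negotiation pattern
rather than a requirement stated in this section.

.. code:: c

  #include <stdbool.h>
  #include <stdint.h>

  #define VHOST_USER_F_PROTOCOL_FEATURES 30  /* assumed bit number */

  /* features: u64 from the VHOST_USER_GET_FEATURES reply. */
  static bool may_use_protocol_features(uint64_t features)
  {
      return (features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES)) != 0;
  }

  /*
   * offered:   u64 from the VHOST_USER_GET_PROTOCOL_FEATURES reply
   * supported: protocol features the front-end implements
   * Returns the mask to send with VHOST_USER_SET_PROTOCOL_FEATURES.
   */
  static uint64_t select_protocol_features(uint64_t offered,
                                           uint64_t supported)
  {
      return offered & supported;
  }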

``VHOST_USER_SET_OWNER``
  :id: 3
  :equivalent ioctl: ``VHOST_SET_OWNER``
  :request payload: N/A
  :reply payload: N/A

  Issued when a new connection is established. It marks the sender
  as the front-end that owns the session. This can be used on the
  *back-end* as a "session start" flag.

``VHOST_USER_RESET_OWNER``
  :id: 4
  :request payload: N/A
  :reply payload: N/A

.. admonition:: Deprecated

   This is no longer used. Used to be sent to request disabling all
   rings, but some back-ends interpreted it to also discard connection
   state (this interpretation would lead to bugs).  It is recommended
   that back-ends either ignore this message, or use it to disable all
   rings.

``VHOST_USER_SET_MEM_TABLE``
  :id: 5
  :equivalent ioctl: ``VHOST_SET_MEM_TABLE``
  :request payload: multiple memory regions description
  :reply payload: (postcopy only) multiple memory regions description

  Sets the memory map regions on the back-end so it can translate the
  vring addresses. In the ancillary data there is an array of file
  descriptors for each memory mapped region. The size and ordering of
  the fds matches the number and ordering of memory regions.

  When ``VHOST_USER_POSTCOPY_LISTEN`` has been received,
  ``SET_MEM_TABLE`` replies with the bases of the memory mapped
  regions to the front-end.  The back-end must have mmap'd the regions but
  not yet accessed them and should not yet generate a userfault
  event.

.. Note::
   ``NEED_REPLY_MASK`` is not set in this case.  QEMU will then
   reply back to the list of mappings with an empty
   ``VHOST_USER_SET_MEM_TABLE`` as an acknowledgement; only upon
   reception of this message may the guest start accessing the memory
   and generating faults.

``VHOST_USER_SET_LOG_BASE``
  :id: 6
  :equivalent ioctl: ``VHOST_SET_LOG_BASE``
  :request payload: ``u64``
  :reply payload: N/A

  Sets logging shared memory space.

  When the back-end has the ``VHOST_USER_PROTOCOL_F_LOG_SHMFD`` protocol
  feature, the log memory fd is provided in the ancillary data of the
  ``VHOST_USER_SET_LOG_BASE`` message, and the size and offset of the
  shared memory area are provided in the message.

``VHOST_USER_SET_LOG_FD``
  :id: 7
  :equivalent ioctl: ``VHOST_SET_LOG_FD``
  :request payload: N/A
  :reply payload: N/A

  Sets the logging file descriptor, which is passed as ancillary data.

``VHOST_USER_SET_VRING_NUM``
  :id: 8
  :equivalent ioctl: ``VHOST_SET_VRING_NUM``
  :request payload: vring state description
  :reply payload: N/A

  Set the size of the queue.

``VHOST_USER_SET_VRING_ADDR``
  :id: 9
  :equivalent ioctl: ``VHOST_SET_VRING_ADDR``
  :request payload: vring address description
  :reply payload: N/A

  Sets the addresses of the different aspects of the vring.

``VHOST_USER_SET_VRING_BASE``
  :id: 10
  :equivalent ioctl: ``VHOST_SET_VRING_BASE``
  :request payload: vring descriptor index/indices
  :reply payload: N/A

  Sets the next index to use for descriptors in this vring:

  * For a split virtqueue, sets only the next descriptor index to
    process in the *Available Ring*.  The device is supposed to read the
    next index in the *Used Ring* from the respective vring structure in
    guest memory.

  * For a packed virtqueue, both indices are supplied, as they are not
    explicitly available in memory.

  Consequently, the payload type is specific to the type of virtqueue
  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
  indices for packed virtqueues*).

``VHOST_USER_GET_VRING_BASE``
  :id: 11
  :equivalent ioctl: ``VHOST_GET_VRING_BASE``
  :request payload: vring state description
  :reply payload: vring descriptor index/indices

  Stops the vring and returns the current descriptor index or indices:

  * For a split virtqueue, returns only the 16-bit next descriptor
    index to process in the *Available Ring*.  Note that this may
    differ from the available ring index in the vring structure in
    memory, which points to where the driver will put new available
    descriptors.  For the *Used Ring*, the device only needs the next
    descriptor index at which to put new descriptors, which is the
    value in the vring structure in memory, so this value is not
    covered by this message.

  * For a packed virtqueue, neither index is explicitly available to
    read from memory, so both indices (as maintained by the device) are
    returned.

  Consequently, the payload type is specific to the type of virtqueue
  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
  indices for packed virtqueues*).

  When and as long as all of a device's vrings are stopped, it is
  *suspended*, see :ref:`Suspended device state
  <suspended_device_state>`.

  The request payload's *num* field is currently reserved and must be
  set to 0.

``VHOST_USER_SET_VRING_KICK``
  :id: 12
  :equivalent ioctl: ``VHOST_SET_VRING_KICK``
  :request payload: ``u64``
  :reply payload: N/A

  Set the event file descriptor for adding buffers to the vring. It is
  passed in the ancillary data.

  Bits (0-7) of the payload contain the vring index. Bit 8 is the
  invalid FD flag. This flag is set when there is no file descriptor
  in the ancillary data. This signals that polling should be used
  instead of waiting for the kick. Note that if the protocol feature
  ``VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS`` has been negotiated,
  this message isn't necessary, as the ring is also started on the
  ``VHOST_USER_VRING_KICK`` message. It may however still be used to
  set an event file descriptor (which will be preferred over the
  message) or to enable polling.
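
  The payload layout described above (also used by
  ``VHOST_USER_SET_VRING_CALL`` and ``VHOST_USER_SET_VRING_ERR``) can be
  decoded as in the following sketch. The macro, struct and function
  names are hypothetical and chosen for this example only.

  .. code:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define VRING_IDX_MASK  0xffULL      /* bits 0-7: vring index */
    #define VRING_NOFD_MASK (1ULL << 8)  /* bit 8: invalid FD flag */

    struct vring_fd_request {
        uint8_t index;   /* vring the message refers to */
        bool    has_fd;  /* false: no fd in the ancillary data */
    };

    static struct vring_fd_request decode_vring_fd_payload(uint64_t payload)
    {
        struct vring_fd_request req = {
            .index  = (uint8_t)(payload & VRING_IDX_MASK),
            .has_fd = !(payload & VRING_NOFD_MASK),
        };
        return req;
    }

  A receiver would only look for a file descriptor in the ancillary
  data when ``has_fd`` is true.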

``VHOST_USER_SET_VRING_CALL``
  :id: 13
  :equivalent ioctl: ``VHOST_SET_VRING_CALL``
  :request payload: ``u64``
  :reply payload: N/A

  Set the event file descriptor to signal when buffers are used. It is
  passed in the ancillary data.

  Bits (0-7) of the payload contain the vring index. Bit 8 is the
  invalid FD flag. This flag is set when there is no file descriptor
  in the ancillary data. This signals that polling will be used
  instead of waiting for the call. Note that if the protocol features
  ``VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS`` and
  ``VHOST_USER_PROTOCOL_F_BACKEND_REQ`` have been negotiated, this message
  isn't necessary, as the ``VHOST_USER_BACKEND_VRING_CALL`` message can be
  used. It may however still be used to set an event file descriptor
  or to enable polling.

``VHOST_USER_SET_VRING_ERR``
  :id: 14
  :equivalent ioctl: ``VHOST_SET_VRING_ERR``
  :request payload: ``u64``
  :reply payload: N/A

  Set the event file descriptor to signal when an error occurs. It is
  passed in the ancillary data.

  Bits (0-7) of the payload contain the vring index. Bit 8 is the
  invalid FD flag. This flag is set when there is no file descriptor
  in the ancillary data. Note that if the protocol features
  ``VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS`` and
  ``VHOST_USER_PROTOCOL_F_BACKEND_REQ`` have been negotiated, this message
  isn't necessary, as the ``VHOST_USER_BACKEND_VRING_ERR`` message can be
  used. It may however still be used to set an event file descriptor
  (which will be preferred over the message).

``VHOST_USER_GET_QUEUE_NUM``
  :id: 17
  :equivalent ioctl: N/A
  :request payload: N/A
  :reply payload: ``u64``

  Query how many queues the back-end supports.

  This request should be sent only when ``VHOST_USER_PROTOCOL_F_MQ``
  is set in queried protocol features by
  ``VHOST_USER_GET_PROTOCOL_FEATURES``.

``VHOST_USER_SET_VRING_ENABLE``
  :id: 18
  :equivalent ioctl: N/A
  :request payload: vring state description
  :reply payload: N/A

  Signal the back-end to enable or disable the corresponding vring.

  This request should be sent only when
  ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated.

``VHOST_USER_SEND_RARP``
  :id: 19
  :equivalent ioctl: N/A
  :request payload: ``u64``
  :reply payload: N/A

  Ask the vhost-user back-end to broadcast a fake RARP to notify that the
  migration is terminated, for guests that do not support GUEST_ANNOUNCE.

  Only legal if feature bit ``VHOST_USER_F_PROTOCOL_FEATURES`` is
  present in ``VHOST_USER_GET_FEATURES`` and protocol feature bit
  ``VHOST_USER_PROTOCOL_F_RARP`` is present in
  ``VHOST_USER_GET_PROTOCOL_FEATURES``.  The first 6 bytes of the
  payload contain the MAC address of the guest to allow the vhost-user
  back-end to construct and broadcast the fake RARP.
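
  A back-end might recover the MAC address from the payload as sketched
  below. The function name is hypothetical, and the sketch assumes that
  "the first 6 bytes" means the first six bytes of the ``u64`` in memory
  order, that is, in the order they were received from the socket.

  .. code:: c

    #include <stdint.h>
    #include <string.h>

    #define MAC_LEN 6

    /* Copy the guest MAC address out of the first 6 payload bytes. */
    static void rarp_payload_to_mac(const uint64_t *payload,
                                    uint8_t mac[MAC_LEN])
    {
        memcpy(mac, payload, MAC_LEN);
    }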

``VHOST_USER_NET_SET_MTU``
  :id: 20
  :equivalent ioctl: N/A
  :request payload: ``u64``
  :reply payload: N/A

  Set host MTU value exposed to the guest.

  This request should be sent only when ``VIRTIO_NET_F_MTU`` feature
  has been successfully negotiated, ``VHOST_USER_F_PROTOCOL_FEATURES``
  is present in ``VHOST_USER_GET_FEATURES`` and protocol feature bit
  ``VHOST_USER_PROTOCOL_F_NET_MTU`` is present in
  ``VHOST_USER_GET_PROTOCOL_FEATURES``.

  If ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` is negotiated, the back-end must
  respond with zero in case the specified MTU is valid, or non-zero
  otherwise.

``VHOST_USER_SET_BACKEND_REQ_FD`` (previous name ``VHOST_USER_SET_SLAVE_REQ_FD``)
  :id: 21
  :equivalent ioctl: N/A
  :request payload: N/A
  :reply payload: N/A

  Set the socket file descriptor for back-end initiated requests. It is passed
  in the ancillary data.

  This request should be sent only when
  ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, and protocol
  feature bit ``VHOST_USER_PROTOCOL_F_BACKEND_REQ`` is present in
  ``VHOST_USER_GET_PROTOCOL_FEATURES``.  If
  ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` is negotiated, the back-end must
  respond with zero for success, non-zero otherwise.

``VHOST_USER_IOTLB_MSG``
  :id: 22
  :equivalent ioctl: N/A (equivalent to ``VHOST_IOTLB_MSG`` message type)
  :request payload: ``struct vhost_iotlb_msg``
  :reply payload: ``u64``

  Send IOTLB messages with ``struct vhost_iotlb_msg`` as payload.

  The front-end sends such requests to update and invalidate entries in the
  device IOTLB. The back-end has to acknowledge the request by sending
  zero as ``u64`` payload for success, non-zero otherwise.

  This request should be sent only when ``VIRTIO_F_IOMMU_PLATFORM``
  feature has been successfully negotiated.

``VHOST_USER_SET_VRING_ENDIAN``
  :id: 23
  :equivalent ioctl: ``VHOST_SET_VRING_ENDIAN``
  :request payload: vring state description
  :reply payload: N/A

  Set the endianness of a VQ for legacy devices. Little-endian is
  indicated with state.num set to 0 and big-endian is indicated with
  state.num set to 1. Other values are invalid.

  This request should be sent only when
  ``VHOST_USER_PROTOCOL_F_CROSS_ENDIAN`` has been negotiated.
  Back-ends that negotiated this feature should handle both
  endiannesses and expect this message once (per VQ) during device
  configuration (i.e. before the front-end starts the VQ).
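
  A minimal sketch of how a back-end might interpret ``state.num`` for
  this message follows. The function name and the out-parameter are
  hypothetical; what a back-end does with an invalid value is not
  specified here, so the sketch only reports it.

  .. code:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Returns false for values the protocol declares invalid. */
    static bool vring_endian_from_num(uint32_t num, bool *big_endian)
    {
        if (num > 1) {
            return false;
        }
        *big_endian = (num == 1);
        return true;
    }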

``VHOST_USER_GET_CONFIG``
  :id: 24
  :equivalent ioctl: N/A
  :request payload: virtio device config space
  :reply payload: virtio device config space

  When ``VHOST_USER_PROTOCOL_F_CONFIG`` is negotiated, this message is
  submitted by the vhost-user front-end to fetch the contents of the
  virtio device configuration space. The vhost-user back-end's payload
  size MUST match the front-end's request; the back-end uses a
  zero-length payload to indicate an error to the front-end. The
  vhost-user front-end may cache the contents to avoid repeated
  ``VHOST_USER_GET_CONFIG`` calls.
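
  The size rule can be checked on the front-end as in the sketch below.
  The enum and function names are hypothetical, and both sizes are
  assumed to be the payload sizes taken from the respective message
  headers.

  .. code:: c

    #include <stdint.h>

    enum config_reply_status {
        CONFIG_REPLY_OK,
        CONFIG_REPLY_BACKEND_ERROR,  /* zero-length payload */
        CONFIG_REPLY_MALFORMED,      /* size differs from the request */
    };

    static enum config_reply_status
    check_config_reply(uint32_t request_size, uint32_t reply_size)
    {
        if (reply_size == 0) {
            return CONFIG_REPLY_BACKEND_ERROR;
        }
        if (reply_size != request_size) {
            return CONFIG_REPLY_MALFORMED;
        }
        return CONFIG_REPLY_OK;
    }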

``VHOST_USER_SET_CONFIG``
  :id: 25
  :equivalent ioctl: N/A
  :request payload: virtio device config space
  :reply payload: N/A

  When ``VHOST_USER_PROTOCOL_F_CONFIG`` is negotiated, this message is
  submitted by the vhost-user front-end when the guest changes the virtio
  device configuration space; it can also be used for live migration
  on the destination host. The vhost-user back-end must check the flags
  field, and back-ends MUST NOT accept SET_CONFIG for read-only
  configuration space fields unless the live migration bit is set.

``VHOST_USER_CREATE_CRYPTO_SESSION``
  :id: 26
  :equivalent ioctl: N/A
  :request payload: crypto session description
  :reply payload: crypto session description

  Create a session for crypto operation. The back-end must return
  the session id, 0 or positive for success, negative for failure.
  This request should be sent only when
  ``VHOST_USER_PROTOCOL_F_CRYPTO_SESSION`` feature has been
  successfully negotiated.  It's a required feature for crypto
  devices.

``VHOST_USER_CLOSE_CRYPTO_SESSION``
  :id: 27
  :equivalent ioctl: N/A
  :request payload: ``u64``
  :reply payload: N/A

  Close a session for crypto operation which was previously
  created by ``VHOST_USER_CREATE_CRYPTO_SESSION``.

  This request should be sent only when
  ``VHOST_USER_PROTOCOL_F_CRYPTO_SESSION`` feature has been
  successfully negotiated.  It's a required feature for crypto
  devices.

``VHOST_USER_POSTCOPY_ADVISE``
  :id: 28
  :request payload: N/A
  :reply payload: userfault fd

  When ``VHOST_USER_PROTOCOL_F_PAGEFAULT`` is supported, the front-end
  advises the back-end that a migration with postcopy enabled is
  underway; the back-end must open a userfaultfd for later use.  Note
  that at this stage the migration is still in precopy mode.

``VHOST_USER_POSTCOPY_LISTEN``
  :id: 29
  :request payload: N/A
  :reply payload: N/A

  The front-end advises the back-end that a transition to postcopy mode
  has happened.  The back-end must ensure that shared memory is registered
  with userfaultfd to cause faulting of non-present pages.

  This is always sent sometime after a ``VHOST_USER_POSTCOPY_ADVISE``,
  and thus only when ``VHOST_USER_PROTOCOL_F_PAGEFAULT`` is supported.

``VHOST_USER_POSTCOPY_END``
  :id: 30
  :request payload: N/A
  :reply payload: ``u64``

  The front-end advises that postcopy migration has now completed.  The back-end
  must disable the userfaultfd. The reply is an acknowledgement
  only.

  When ``VHOST_USER_PROTOCOL_F_PAGEFAULT`` is supported, this message
  is sent at the end of the migration, after
  ``VHOST_USER_POSTCOPY_LISTEN`` was previously sent.

  The value returned is an error indication; 0 is success.

``VHOST_USER_GET_INFLIGHT_FD``
  :id: 31
  :equivalent ioctl: N/A
  :request payload: inflight description
  :reply payload: N/A

  When the ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD`` protocol feature has
  been successfully negotiated, this message is submitted by the front-end to
  get a shared buffer from the back-end. The shared buffer will be used by
  the back-end to track inflight I/O. QEMU should retrieve a new one when
  the VM is reset.

``VHOST_USER_SET_INFLIGHT_FD``
  :id: 32
  :equivalent ioctl: N/A
  :request payload: inflight description
  :reply payload: N/A

  When the ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD`` protocol feature has
  been successfully negotiated, this message is submitted by the front-end to
  send the shared inflight buffer back to the back-end so that the back-end
  can retrieve inflight I/O after a crash or restart.

``VHOST_USER_GPU_SET_SOCKET``
  :id: 33
  :equivalent ioctl: N/A
  :request payload: N/A
  :reply payload: N/A

  Sets the GPU protocol socket file descriptor, which is passed as
  ancillary data. The GPU protocol is used to inform the front-end of
  rendering state and updates. See vhost-user-gpu.rst for details.

``VHOST_USER_RESET_DEVICE``
  :id: 34
  :equivalent ioctl: N/A
  :request payload: N/A
  :reply payload: N/A

  Ask the vhost-user back-end to disable all rings and reset all
  internal device state to the initial state, ready to be
  reinitialized. The back-end retains ownership of the device
  throughout the reset operation.

  Only valid if the ``VHOST_USER_PROTOCOL_F_RESET_DEVICE`` protocol
  feature is set by the back-end.

``VHOST_USER_VRING_KICK``
  :id: 35
  :equivalent ioctl: N/A
  :request payload: vring state description
  :reply payload: N/A

  When the ``VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS`` protocol
  feature has been successfully negotiated, this message may be
  submitted by the front-end to indicate that a buffer was added to
  the vring instead of signalling it using the vring's kick file
  descriptor or having the back-end rely on polling.

  The ``state.num`` field is currently reserved and must be set to 0.
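
  As an illustration, the payload of an in-band kick could be packed as
  follows. This is a minimal sketch, assuming the vring state
  description layout defined earlier in this document (a 32-bit index
  followed by a 32-bit num, in native byte order); the names
  ``VRING_STATE`` and ``pack_vring_kick`` are hypothetical and not part
  of any library.

  .. code:: python

    import struct

    VHOST_USER_VRING_KICK = 35  # message id listed above

    # Assumed layout of a vring state description: u32 index, u32 num.
    VRING_STATE = struct.Struct("=II")

    def pack_vring_kick(queue_index: int) -> bytes:
        """Build the payload of an in-band kick for one virtqueue."""
        # state.num is reserved for this message and must be 0.
        return VRING_STATE.pack(queue_index, 0)

    payload = pack_vring_kick(2)
    index, num = VRING_STATE.unpack(payload)
    print(index, num)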

``VHOST_USER_GET_MAX_MEM_SLOTS``
  :id: 36
  :equivalent ioctl: N/A
  :request payload: N/A
  :reply payload: ``u64``

  When the ``VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS`` protocol
  feature has been successfully negotiated, this message is submitted
  by the front-end to the back-end. The back-end should return the message with a
  ``u64`` payload containing the maximum number of memory slots for
  QEMU to expose to the guest. The value returned by the back-end
  will be capped at the maximum number of RAM slots which can be
  supported by the target platform.

``VHOST_USER_ADD_MEM_REG``
  :id: 37
  :equivalent ioctl: N/A
  :request payload: single memory region description
  :reply payload: (postcopy only) single memory region description

  When the ``VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS`` protocol
  feature has been successfully negotiated, this message is submitted
  by the front-end to the back-end. The message payload contains a memory
  region descriptor struct, describing a region of guest memory which
  the back-end device must map in. Along with the
  ``VHOST_USER_REM_MEM_REG`` message, this message is used to set and
  update the memory tables of the back-end device.

  Exactly one file descriptor from which the memory is mapped is
  passed in the ancillary data.

  In postcopy mode (see ``VHOST_USER_POSTCOPY_LISTEN``), the back-end
  replies to the front-end with the base of the mapped memory region.
  For further details on postcopy, see ``VHOST_USER_SET_MEM_TABLE``.
  They apply to ``VHOST_USER_ADD_MEM_REG`` accordingly.

``VHOST_USER_REM_MEM_REG``
  :id: 38
  :equivalent ioctl: N/A
  :request payload: single memory region description
  :reply payload: N/A

  When the ``VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS`` protocol
  feature has been successfully negotiated, this message is submitted
  by the front-end to the back-end. The message payload contains a memory
  region descriptor struct, describing a region of guest memory which
  the back-end device must unmap. Along with the
  ``VHOST_USER_ADD_MEM_REG`` message, this message is used to set and
  update the memory tables of the back-end device.

  The memory region to be removed is identified by its guest address,
  user address and size. The mmap offset is ignored.

  No file descriptors SHOULD be passed in the ancillary data. For
  compatibility with existing incorrect implementations, the back-end MAY
  accept messages with one file descriptor. If a file descriptor is
  passed, the back-end MUST close it without using it otherwise.
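
  The matching and file descriptor rules above could be handled along
  these lines. This is only a sketch: the ``MemRegion`` class, its field
  names and ``handle_rem_mem_reg`` are hypothetical, and a real back-end
  would also unmap the region before dropping it from its table.

  .. code:: python

    import os
    from dataclasses import dataclass
    from typing import Iterable, List

    @dataclass
    class MemRegion:
        # Illustrative field names mirroring the memory region description.
        guest_addr: int
        size: int
        user_addr: int
        mmap_offset: int

    def handle_rem_mem_reg(table: List[MemRegion], req: MemRegion,
                           fds: Iterable[int] = ()) -> bool:
        """Remove the region matching req; return True if one was removed."""
        # A stray descriptor from a non-conforming front-end is tolerated,
        # but it is closed immediately and never used.
        for fd in fds:
            os.close(fd)
        for i, region in enumerate(table):
            # The mmap offset is deliberately left out of the comparison.
            if (region.guest_addr, region.user_addr, region.size) == \
               (req.guest_addr, req.user_addr, req.size):
                del table[i]
                return True
        return False

    table = [MemRegion(0x1000, 0x2000, 0x7F0000000000, 0)]
    request = MemRegion(0x1000, 0x2000, 0x7F0000000000, 0x40)
    print(handle_rem_mem_reg(table, request), len(table))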

``VHOST_USER_SET_STATUS``
  :id: 39
  :equivalent ioctl: VHOST_VDPA_SET_STATUS
  :request payload: ``u64``
  :reply payload: N/A

  When the ``VHOST_USER_PROTOCOL_F_STATUS`` protocol feature has been
  successfully negotiated, this message is submitted by the front-end to
  notify the back-end of the updated device status as defined in the Virtio
  specification.

``VHOST_USER_GET_STATUS``
  :id: 40
  :equivalent ioctl: VHOST_VDPA_GET_STATUS
  :request payload: N/A
  :reply payload: ``u64``

  When the ``VHOST_USER_PROTOCOL_F_STATUS`` protocol feature has been
  successfully negotiated, this message is submitted by the front-end to
  query the back-end for its device status as defined in the Virtio
  specification.

``VHOST_USER_GET_SHARED_OBJECT``
  :id: 41
  :equivalent ioctl: N/A
  :request payload: ``struct VhostUserShared``
  :reply payload: dmabuf fd

  When the ``VHOST_USER_PROTOCOL_F_SHARED_OBJECT`` protocol
  feature has been successfully negotiated, and the UUID is found
  in the exporters cache, this message is submitted by the front-end
  to retrieve a given dma-buf fd from a given back-end, determined by
  the requested UUID. The back-end replies, passing the fd when the
  operation is successful, or no fd otherwise.

``VHOST_USER_SET_DEVICE_STATE_FD``
  :id: 42
  :equivalent ioctl: N/A
  :request payload: device state transfer parameters
  :reply payload: ``u64``

  Front-end and back-end negotiate a channel over which to transfer the
  back-end's internal state during migration.  Either side (front-end or
  back-end) may create the channel.  The nature of this channel is not
  restricted or defined in this document, but whichever side creates it
  must create a file descriptor that is provided to the other side,
  allowing access to the channel.  This FD must behave as follows:

  * For the writing end, it must allow writing the whole back-end state
    sequentially.  Closing the file descriptor signals the end of
    transfer.

  * For the reading end, it must allow reading the whole back-end state
    sequentially.  The end of file signals the end of the transfer.

  For example, the channel may be a pipe, in which case the two ends of
  the pipe fulfill these requirements respectively.
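
  The pipe case can be pictured with the short sketch below. It runs
  both ends in one process purely for illustration, which only works
  because the example state is far smaller than the pipe buffer; the
  helper names ``write_state`` and ``read_state`` are hypothetical.

  .. code:: python

    import os

    def write_state(fd: int, state: bytes) -> None:
        """Write the whole state sequentially, then signal end of transfer."""
        view = memoryview(state)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        os.close(fd)  # closing the writing end marks the end of transfer

    def read_state(fd: int) -> bytes:
        """Read sequentially until end of file."""
        chunks = []
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:  # an empty read means end of file
                break
            chunks.append(chunk)
        os.close(fd)
        return b"".join(chunks)

    read_fd, write_fd = os.pipe()
    write_state(write_fd, b"example back-end state")
    print(read_state(read_fd))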

  Initially, the front-end creates a channel along with such an FD.  It
  passes the FD to the back-end as ancillary data of a
  ``VHOST_USER_SET_DEVICE_STATE_FD`` message.  The back-end may create a
  different transfer channel, passing the respective FD back to the
  front-end as ancillary data of the reply.  If so, the front-end must
  then discard its channel and use the one provided by the back-end.

  Whether the back-end should decide to use its own channel is decided
  based on efficiency: If the channel is a pipe, both ends will most
  likely need to copy data into and out of it.  Any channel that allows
  for more efficient processing on at least one end, e.g. through
  zero-copy, is considered more efficient and thus preferred.  If the
  back-end can provide such a channel, it should decide to use it.

  The request payload contains parameters for the subsequent data
  transfer, as described in the :ref:`Migrating back-end state
  <migrating_backend_state>` section.

  The value returned is both an indication for success, and whether a
  file descriptor for a back-end-provided channel is returned: Bits 0–7
  are 0 on success, and non-zero on error.  Bit 8 is the invalid FD
  flag; this flag is set when there is no file descriptor returned.
  When this flag is not set, the front-end must use the returned file
  descriptor as its end of the transfer channel.  The back-end must not
  both indicate an error and return a file descriptor.
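
  A front-end could decode the returned value roughly as follows. This
  is a sketch of the bit layout described above, not the QEMU
  implementation; the constant and function names are hypothetical.

  .. code:: python

    from typing import Tuple

    # Hypothetical names for the bit fields described above.
    ERROR_MASK = 0xFF          # bits 0-7: zero on success
    INVALID_FD_FLAG = 1 << 8   # bit 8: set when no fd is returned

    def decode_reply(value: int) -> Tuple[bool, bool]:
        """Return (success, backend_fd_returned) for a reply value."""
        success = (value & ERROR_MASK) == 0
        backend_fd_returned = (value & INVALID_FD_FLAG) == 0
        if not success and backend_fd_returned:
            # The back-end must not report an error and return an fd.
            raise ValueError("reply signals an error but also carries an fd")
        return success, backend_fd_returned

    print(decode_reply(0x100))  # success, keep the front-end's channel
    print(decode_reply(0x000))  # success, switch to the back-end's fd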

  Using this function requires prior negotiation of the
  ``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature.

``VHOST_USER_CHECK_DEVICE_STATE``
  :id: 43
  :equivalent ioctl: N/A
  :request payload: N/A
  :reply payload: ``u64``

  After transferring the back-end's internal state during migration (see
  the :ref:`Migrating back-end state <migrating_backend_state>`
  section), check whether the back-end was able to fully process the
  state.

  The value returned indicates success or error; 0 is success, any
  non-zero value is an error.

  Using this function requires prior negotiation of the
  ``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature.

Back-end message types
----------------------

For this type of message, the request is sent by the back-end and the reply
is sent by the front-end.

``VHOST_USER_BACKEND_IOTLB_MSG`` (previous name ``VHOST_USER_SLAVE_IOTLB_MSG``)
  :id: 1
  :equivalent ioctl: N/A (equivalent to ``VHOST_IOTLB_MSG`` message type)
  :request payload: ``struct vhost_iotlb_msg``
  :reply payload: N/A

  Send IOTLB messages with ``struct vhost_iotlb_msg`` as payload.
  The back-end sends such requests to notify of an IOTLB miss, or an IOTLB
  access failure. If ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` is
  negotiated, and the back-end sets the ``VHOST_USER_NEED_REPLY`` flag, the
  front-end must respond with zero when the operation is successfully
  completed, or non-zero otherwise.  This request should be sent only when
  the ``VIRTIO_F_IOMMU_PLATFORM`` feature has been successfully
  negotiated.

``VHOST_USER_BACKEND_CONFIG_CHANGE_MSG`` (previous name ``VHOST_USER_SLAVE_CONFIG_CHANGE_MSG``)
  :id: 2
  :equivalent ioctl: N/A
  :request payload: N/A
  :reply payload: N/A

  When ``VHOST_USER_PROTOCOL_F_CONFIG`` is negotiated, the vhost-user
  back-end sends such messages to notify that the virtio device's
  configuration space has changed. For host devices that support this
  feature, the host driver can send a ``VHOST_USER_GET_CONFIG`` message
  to the back-end to get the latest content. If
  ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` is negotiated, and the back-end sets the
  ``VHOST_USER_NEED_REPLY`` flag, the front-end must respond with zero when
  the operation is successfully completed, or non-zero otherwise.

``VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG`` (previous name ``VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG``)
  :id: 3
  :equivalent ioctl: N/A
  :request payload: vring area description
  :reply payload: N/A

  Sets host notifier for a specified queue. The queue index is
  contained in the ``u64`` field of the vring area description. The
  host notifier is described by the file descriptor (typically it's a
  VFIO device fd) which is passed as ancillary data and the size
  (which is mmap size and should be the same as host page size) and
  offset (which is mmap offset) carried in the vring area
  description. QEMU can mmap the file descriptor based on the size and
  offset to get a memory range. Registering a host notifier means
  mapping this memory range to the VM as the specified queue's notify
  MMIO region. The back-end sends this request to tell QEMU to de-register
  the existing notifier if any and register the new notifier if the
  request is sent with a file descriptor.

  This request should be sent only when
  ``VHOST_USER_PROTOCOL_F_HOST_NOTIFIER`` protocol feature has been
  successfully negotiated.

``VHOST_USER_BACKEND_VRING_CALL`` (previous name ``VHOST_USER_SLAVE_VRING_CALL``)
  :id: 4
  :equivalent ioctl: N/A
  :request payload: vring state description
  :reply payload: N/A

  When the ``VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS`` protocol
  feature has been successfully negotiated, this message may be
  submitted by the back-end to indicate that a buffer was used from
  the vring instead of signalling this using the vring's call file
  descriptor or having the front-end rely on polling.

  The ``state.num`` field is currently reserved and must be set to 0.

``VHOST_USER_BACKEND_VRING_ERR`` (previous name ``VHOST_USER_SLAVE_VRING_ERR``)
  :id: 5
  :equivalent ioctl: N/A
  :request payload: vring state description
  :reply payload: N/A

  When the ``VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS`` protocol
  feature has been successfully negotiated, this message may be
  submitted by the back-end to indicate that an error occurred on the
  specific vring, instead of signalling the error file descriptor
  set by the front-end via ``VHOST_USER_SET_VRING_ERR``.

  The ``state.num`` field is currently reserved and must be set to 0.

``VHOST_USER_BACKEND_SHARED_OBJECT_ADD``
  :id: 6
  :equivalent ioctl: N/A
  :request payload: ``struct VhostUserShared``
  :reply payload: N/A

  When the ``VHOST_USER_PROTOCOL_F_SHARED_OBJECT`` protocol
  feature has been successfully negotiated, this message can be submitted
  by back-ends to add themselves as exporters to the virtio shared lookup
  table. The back-end device gets associated with a UUID in the shared table.
  The back-end is responsible for keeping its own table with exported dma-buf
  fds. When another back-end tries to import the resource associated with the
  UUID, it will send a message to the front-end, which will act as a proxy to
  the exporter back-end. If ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` is negotiated,
  and the back-end sets the ``VHOST_USER_NEED_REPLY`` flag, the front-end must
  respond with zero when the operation is successfully completed, or non-zero
  otherwise.

``VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE``
  :id: 7
  :equivalent ioctl: N/A
  :request payload: ``struct VhostUserShared``
  :reply payload: N/A

  When the ``VHOST_USER_PROTOCOL_F_SHARED_OBJECT`` protocol
  feature has been successfully negotiated, this message can be submitted
  by a back-end to remove itself from the virtio-dmabuf shared table.
  Only the back-end owning the entry (i.e., the one that first added
  it) has permission to remove it. Otherwise, the message is ignored.
  The shared table will remove the back-end device associated with
  the UUID. If ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` is negotiated, and the
  back-end sets the ``VHOST_USER_NEED_REPLY`` flag, the front-end must respond
  with zero when the operation is successfully completed, or non-zero
  otherwise.

``VHOST_USER_BACKEND_SHARED_OBJECT_LOOKUP``
  :id: 8
  :equivalent ioctl: N/A
  :request payload: ``struct VhostUserShared``
  :reply payload: dmabuf fd and ``u64``

  When the ``VHOST_USER_PROTOCOL_F_SHARED_OBJECT`` protocol
  feature has been successfully negotiated, this message can be submitted
  by back-ends to retrieve a given dma-buf fd from the virtio-dmabuf
  shared table given a UUID. The front-end replies, passing the fd and a
  zero when the operation is successful, or non-zero otherwise. Note that
  if the operation fails, no fd is sent to the back-end.
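
  The add, remove and lookup messages can be pictured as operations on a
  table kept by the front-end. The toy model below is an assumption-laden
  sketch, not QEMU's virtio-dmabuf code: the class and method names are
  hypothetical, back-ends are identified by plain strings, and treating
  a duplicate add as an error is an assumption of this sketch.

  .. code:: python

    import uuid
    from typing import Dict, Optional

    class SharedObjectTable:
        """Toy model of the lookup table kept by the front-end."""

        def __init__(self) -> None:
            self._exporters: Dict[uuid.UUID, str] = {}

        def add(self, key: uuid.UUID, backend: str) -> int:
            # Assumption: a UUID that is already registered is rejected.
            if key in self._exporters:
                return 1
            self._exporters[key] = backend
            return 0

        def remove(self, key: uuid.UUID, backend: str) -> int:
            # Only the back-end that added the entry may remove it;
            # a request from any other back-end leaves the table untouched.
            if self._exporters.get(key) != backend:
                return 1
            del self._exporters[key]
            return 0

        def lookup(self, key: uuid.UUID) -> Optional[str]:
            # The front-end would proxy the request to this exporter.
            return self._exporters.get(key)

    table = SharedObjectTable()
    key = uuid.uuid4()
    table.add(key, "gpu-backend")
    table.remove(key, "video-backend")  # not the owner, so nothing changes
    print(table.lookup(key))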

.. _reply_ack:

VHOST_USER_PROTOCOL_F_REPLY_ACK
-------------------------------

The original vhost-user specification only demands replies for certain
commands. This differs from the vhost protocol implementation where
commands are sent over an ``ioctl()`` call and block until the back-end
has completed.

With this protocol extension negotiated, the sender (QEMU) can set the
``need_reply`` [Bit 3] flag on any command. This indicates that the
back-end MUST respond with a Payload ``VhostUserMsg`` indicating success
or failure. The payload should be set to zero on success or non-zero
on failure, unless the message already has an explicit reply body.

The reply payload gives QEMU a deterministic indication of the result
of the command. Today, QEMU is expected to terminate the main vhost-user
loop upon receiving such errors. In the future, QEMU could be taught to be
more resilient for selective requests.

For the message types that already solicit a reply from the back-end,
the presence of ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` or the ``need_reply``
bit being set brings no behavioural change. (See the Communication_
section for details.)
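
The exchange can be sketched as follows. The example assumes the message
header layout given in the Header section of this document (``u32``
request, ``u32`` flags, ``u32`` size, native byte order, with the version
in the low two flag bits and the reply flag in bit 2); the constant and
function names are chosen for illustration only.

.. code:: python

  import struct

  VERSION = 0x1            # low two bits of the flags field
  REPLY_FLAG = 1 << 2      # set on replies
  NEED_REPLY_FLAG = 1 << 3 # set by the sender to request an ack
  HEADER = struct.Struct("=III")  # request, flags, payload size

  def pack_request(request: int, payload: bytes = b"",
                   need_reply: bool = False) -> bytes:
      flags = VERSION | (NEED_REPLY_FLAG if need_reply else 0)
      return HEADER.pack(request, flags, len(payload)) + payload

  def pack_ack(request: int, status: int) -> bytes:
      # The ack carries a u64: zero on success, non-zero on failure.
      payload = struct.pack("=Q", status)
      return HEADER.pack(request, VERSION | REPLY_FLAG, len(payload)) + payload

  def ack_succeeded(message: bytes) -> bool:
      (status,) = struct.unpack_from("=Q", message, HEADER.size)
      return status == 0

  # VHOST_USER_SET_STATUS (id 39) normally has no reply; ask for an ack.
  request = pack_request(39, struct.pack("=Q", 0), need_reply=True)
  print(hex(HEADER.unpack_from(request)[1]))
  print(ack_succeeded(pack_ack(39, 0)))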

.. _backend_conventions:

Backend program conventions
===========================

vhost-user back-ends can provide various devices & services and may
need to be configured manually depending on the use case. However, it
is a good idea to follow the conventions listed here when
possible. Users, QEMU or libvirt, can then rely on some common
behaviour to avoid heterogeneous configuration and management of the
back-end programs and facilitate interoperability.

Each back-end installed on a host system should come with at least one
JSON file that conforms to the vhost-user.json schema. Each file
informs the management applications about the back-end type, and binary
location. In addition, it defines rules for management apps for
picking the highest priority back-end when multiple match the search
criteria (see ``@VhostUserBackend`` documentation in the schema file).

If the back-end is not capable of enabling a requested feature on the
host (such as 3D acceleration with virgl), or the initialization
failed, the back-end should fail to start early and exit with a status
!= 0. It may also print a message to stderr for further details.

The back-end program must not daemonize itself, but it may be
daemonized by the management layer. It may also have restricted
access to the system.

File descriptors 0, 1 and 2 will exist, and have regular
stdin/stdout/stderr usage (they may have been redirected to /dev/null
by the management layer, or to a log handler).

The back-end program must end (as quickly and cleanly as possible) when
the SIGTERM signal is received. It may eventually receive SIGKILL from
the management layer after a few seconds.
1925ed1be66bSMarc-André Lureau
1926ed1be66bSMarc-André LureauThe following command line options have an expected behaviour. They
1927ed1be66bSMarc-André Lureauare mandatory, unless explicitly said differently:
1928ed1be66bSMarc-André Lureau
1929ed1be66bSMarc-André Lureau--socket-path=PATH
1930ed1be66bSMarc-André Lureau
1931ed1be66bSMarc-André Lureau  This option specify the location of the vhost-user Unix domain socket.
1932ed1be66bSMarc-André Lureau  It is incompatible with --fd.
1933ed1be66bSMarc-André Lureau
1934ed1be66bSMarc-André Lureau--fd=FDNUM
1935ed1be66bSMarc-André Lureau
1936bd59f2a1SPaolo Bonzini  When this argument is given, the back-end program is started with the
1937ed1be66bSMarc-André Lureau  vhost-user socket as file descriptor FDNUM. It is incompatible with
1938ed1be66bSMarc-André Lureau  --socket-path.
1939ed1be66bSMarc-André Lureau
1940ed1be66bSMarc-André Lureau--print-capabilities
1941ed1be66bSMarc-André Lureau
1942bd59f2a1SPaolo Bonzini  Output to stdout the back-end capabilities in JSON format, and then
1943ed1be66bSMarc-André Lureau  exit successfully. Other options and arguments should be ignored, and
1944bd59f2a1SPaolo Bonzini  the back-end program should not perform its normal function.  The
1945ed1be66bSMarc-André Lureau  capabilities can be reported dynamically depending on the host
1946ed1be66bSMarc-André Lureau  capabilities.

The JSON output of ``--print-capabilities`` is described in the
``vhost-user.json`` schema, by ``@VHostUserBackendCapabilities``.  Example:

.. code:: json

  {
    "type": "foo",
    "features": [
      "feature-a",
      "feature-b"
    ]
  }
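
A back-end might produce such output as sketched below. The ``type`` and
feature names are the placeholders from the example above, and
``host_supports_feature_b`` is a hypothetical probe that stands in for
whatever host checks a real back-end performs.

.. code:: python

  import argparse
  import json
  import sys


  def host_supports_feature_b():
      # Hypothetical probe; a real back-end would inspect the host here.
      return True


  def capabilities():
      """Build the capabilities object from placeholder values."""
      features = ["feature-a"]
      # Features may be added at run time depending on the host.
      if host_supports_feature_b():
          features.append("feature-b")
      return {"type": "foo", "features": features}


  def main(argv):
      parser = argparse.ArgumentParser()
      parser.add_argument("--print-capabilities", action="store_true")
      # parse_known_args() collects unrecognised options instead of
      # failing, so other options are ignored in this mode.
      args, _ignored = parser.parse_known_args(argv)
      if args.print_capabilities:
          json.dump(capabilities(), sys.stdout, indent=2)
          sys.stdout.write("\n")
          return 0
      # Normal operation would start here.
      return 0

Calling ``main(["--print-capabilities"])`` writes the JSON object to
stdout and returns 0 without starting normal operation.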

vhost-user-input
----------------

Command line options:

--evdev-path=PATH

  Specify the Linux input device.

  (optional)

--no-grab

  Do not request exclusive access to the input device.

  (optional)

vhost-user-gpu
--------------

Command line options:

--render-node=PATH

  Specify the GPU DRM render node.

  (optional)

--virgl

  Enable virgl rendering support.

  (optional)

vhost-user-blk
--------------

Command line options:

--blk-file=PATH

  Specify the block device or file path.

  (optional)

--read-only

  Enable read-only mode.

  (optional)
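
How these two options interact is not specified here. One plausible
reading, sketched below in Python, is that ``--read-only`` selects the
mode in which the file given by ``--blk-file`` is opened. The helper
names are hypothetical, and a real back-end would presumably also
report the read-only state to the guest.

.. code:: python

  import argparse
  import os


  def parse_blk_options(argv):
      parser = argparse.ArgumentParser()
      parser.add_argument("--blk-file", metavar="PATH")
      parser.add_argument("--read-only", action="store_true")
      return parser.parse_args(argv)


  def open_backing_file(args):
      """Open the backing file; assumes --blk-file was given."""
      # Assumption: --read-only maps directly onto the open mode.
      flags = os.O_RDONLY if args.read_only else os.O_RDWR
      return os.open(args.blk_file, flags)

With ``--read-only``, any attempt to write through the returned file
descriptor fails at the operating system level.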