===================
Vhost-user Protocol
===================
:Copyright: 2014 Virtual Open Systems Sarl.
:Licence: This work is licensed under the terms of the GNU GPL,
          version 2 or later. See the COPYING file in the top-level
          directory.

.. contents:: Table of Contents

Introduction
============

This protocol is aiming to complement the ``ioctl`` interface used to
control the vhost implementation in the Linux kernel. It implements
the control plane needed to establish virtqueue sharing with a user
space process on the same host. It uses communication over a Unix
domain socket to share file descriptors in the ancillary data of the
message.

The protocol defines 2 sides of the communication, *master* and
*slave*. *Master* is the application that shares its virtqueues, in
our case QEMU. *Slave* is the consumer of the virtqueues.

In the current implementation QEMU is the *master*, and the *slave* is
the external process consuming the virtio queues, for example a
software Ethernet switch running in user space, such as Snabbswitch,
or a block device backend processing read & write to a virtual
disk. In order to facilitate interoperability between various backend
implementations, it is recommended to follow the :ref:`Backend program
conventions <backend_conventions>`.

*Master* and *slave* can be either a client (i.e. connecting) or
server (listening) in the socket communication.

Message Specification
=====================

.. Note:: All numbers are in the machine native byte order.

A vhost-user message consists of 3 header fields and a payload.

+---------+-------+------+---------+
| request | flags | size | payload |
+---------+-------+------+---------+

Header
------

:request: 32-bit type of the request

:flags: 32-bit bit field

- Lower 2 bits are the version (currently 0x01)
- Bit 2 is the reply flag - needs to be sent on each reply from the slave
- Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
  details.

:size: 32-bit size of the payload

Payload
-------

Depending on the request type, **payload** can be:

A single 64-bit integer
^^^^^^^^^^^^^^^^^^^^^^^

+-----+
| u64 |
+-----+

:u64: a 64-bit unsigned integer

A vring state description
^^^^^^^^^^^^^^^^^^^^^^^^^

+-------+-----+
| index | num |
+-------+-----+

:index: a 32-bit index

:num: a 32-bit number

A vring address description
^^^^^^^^^^^^^^^^^^^^^^^^^^^

+-------+-------+------+------------+------+-----------+-----+
| index | flags | size | descriptor | used | available | log |
+-------+-------+------+------------+------+-----------+-----+

:index: a 32-bit vring index

:flags: a 32-bit vring flags

:descriptor: a 64-bit ring address of the vring descriptor table

:used: a 64-bit ring address of the vring used ring

:available: a 64-bit ring address of the vring available ring

:log: a 64-bit guest address for logging

Note that a ring address is an IOVA if ``VIRTIO_F_IOMMU_PLATFORM`` has
been negotiated. Otherwise it is a user address.

Memory regions description
^^^^^^^^^^^^^^^^^^^^^^^^^^

+-------------+---------+---------+-----+---------+
| num regions | padding | region0 | ... | region7 |
+-------------+---------+---------+-----+---------+

:num regions: a 32-bit number of regions

:padding: 32-bit

A region is:

+---------------+------+--------------+-------------+
| guest address | size | user address | mmap offset |
+---------------+------+--------------+-------------+

:guest address: a 64-bit guest address of the region

:size: a 64-bit size

:user address: a 64-bit user address

:mmap offset: 64-bit offset where region starts in the mapped memory

Log description
^^^^^^^^^^^^^^^

+----------+------------+
| log size | log offset |
+----------+------------+

:log size: size of area used for logging

:log offset: offset from start of supplied file descriptor where
             logging starts (i.e. where guest address 0 would be
             logged)

An IOTLB message
^^^^^^^^^^^^^^^^

+------+------+--------------+-------------------+------+
| iova | size | user address | permissions flags | type |
+------+------+--------------+-------------------+------+

:iova: a 64-bit I/O virtual address programmed by the guest

:size: a 64-bit size

:user address: a 64-bit user address

:permissions flags: an 8-bit value:
  - 0: No access
  - 1: Read access
  - 2: Write access
  - 3: Read/Write access

:type: an 8-bit IOTLB message type:
  - 1: IOTLB miss
  - 2: IOTLB update
  - 3: IOTLB invalidate
  - 4: IOTLB access fail

Virtio device config space
^^^^^^^^^^^^^^^^^^^^^^^^^^

+--------+------+-------+---------+
| offset | size | flags | payload |
+--------+------+-------+---------+

:offset: a 32-bit offset of virtio device's configuration space

:size: a 32-bit configuration space access size in bytes

:flags: a 32-bit value:
  - 0: Vhost master messages used for writeable fields
  - 1: Vhost master messages used for live migration

:payload: Size bytes array
          holding the contents of the virtio
          device's configuration space

Vring area description
^^^^^^^^^^^^^^^^^^^^^^

+-----+------+--------+
| u64 | size | offset |
+-----+------+--------+

:u64: a 64-bit integer contains vring index and flags

:size: a 64-bit size of this area

:offset: a 64-bit offset of this area from the start of the
         supplied file descriptor

Inflight description
^^^^^^^^^^^^^^^^^^^^

+-----------+-------------+------------+------------+
| mmap size | mmap offset | num queues | queue size |
+-----------+-------------+------------+------------+

:mmap size: a 64-bit size of area to track inflight I/O

:mmap offset: a 64-bit offset of this area from the start
              of the supplied file descriptor

:num queues: a 16-bit number of virtqueues

:queue size: a 16-bit size of virtqueues

C structure
-----------

In QEMU the vhost-user message is implemented with the following struct:

.. code:: c

  typedef struct VhostUserMsg {
      VhostUserRequest request;
      uint32_t flags;
      uint32_t size;
      union {
          uint64_t u64;
          struct vhost_vring_state state;
          struct vhost_vring_addr addr;
          VhostUserMemory memory;
          VhostUserLog log;
          struct vhost_iotlb_msg iotlb;
          VhostUserConfig config;
          VhostUserVringArea area;
          VhostUserInflight inflight;
      };
  } QEMU_PACKED VhostUserMsg;

Communication
=============

The protocol for vhost-user is based on the existing implementation of
vhost for the Linux Kernel. Most messages that can be sent via the
Unix domain socket implementing vhost-user have an equivalent ioctl to
the kernel implementation.

The communication consists of *master* sending message requests and
*slave* sending message replies. Most of the requests don't require
replies.
Here is a list of the ones that do:

* ``VHOST_USER_GET_FEATURES``
* ``VHOST_USER_GET_PROTOCOL_FEATURES``
* ``VHOST_USER_GET_VRING_BASE``
* ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
* ``VHOST_USER_GET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)

.. seealso::

   :ref:`REPLY_ACK <reply_ack>`
       The section on ``REPLY_ACK`` protocol extension.

There are several messages that the master sends with file descriptors passed
in the ancillary data:

* ``VHOST_USER_SET_MEM_TABLE``
* ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
* ``VHOST_USER_SET_LOG_FD``
* ``VHOST_USER_SET_VRING_KICK``
* ``VHOST_USER_SET_VRING_CALL``
* ``VHOST_USER_SET_VRING_ERR``
* ``VHOST_USER_SET_SLAVE_REQ_FD``
* ``VHOST_USER_SET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)

If *master* is unable to send the full message or receives a wrong
reply it will close the connection. An optional reconnection mechanism
can be implemented.

Any protocol extensions are gated by protocol feature bits, which
allows full backwards compatibility on both master and slave. As
older slaves don't support negotiating protocol features, a feature
bit was dedicated for this purpose::

  #define VHOST_USER_F_PROTOCOL_FEATURES 30

Starting and stopping rings
---------------------------

Client must only process each ring when it is started.

Client must only pass data between the ring and the backend, when the
ring is enabled.

If ring is started but disabled, client must process the ring without
talking to the backend.

For example, for a networking device, in the disabled state client
must not supply any new RX packets, but must process and discard any
TX packets.

If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
ring is initialized in an enabled state.

If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is
initialized in a disabled state. Client must not pass data to/from the
backend until ring is enabled by ``VHOST_USER_SET_VRING_ENABLE`` with
parameter 1, or after it has been disabled by
``VHOST_USER_SET_VRING_ENABLE`` with parameter 0.

Each ring is initialized in a stopped state, client must not process
it until ring is started, or after it has been stopped.

Client must start ring upon receiving a kick (that is, detecting that
file descriptor is readable) on the descriptor specified by
``VHOST_USER_SET_VRING_KICK``, and stop ring upon receiving
``VHOST_USER_GET_VRING_BASE``.

While processing the rings (whether they are enabled or not), client
must support changing some configuration aspects on the fly.

Multiple queue support
----------------------

Many devices have a fixed number of virtqueues. In this case the master
already knows the number of available virtqueues without communicating with the
slave.

Some devices do not have a fixed number of virtqueues. Instead the maximum
number of virtqueues is chosen by the slave. The number can depend on host
resource availability or slave implementation details. Such devices are called
multiple queue devices.

Multiple queue support allows the slave to advertise the maximum number of
queues. This is treated as a protocol extension, hence the slave has to
implement protocol features first. The multiple queues feature is supported
only when the protocol feature ``VHOST_USER_PROTOCOL_F_MQ`` (bit 0) is set.

The max number of queues the slave supports can be queried with message
``VHOST_USER_GET_QUEUE_NUM``. Master should stop when the number of requested
queues is bigger than that.

As all queues share one connection, the master uses a unique index for each
queue in the sent message to identify a specified queue.

The master enables queues by sending message ``VHOST_USER_SET_VRING_ENABLE``.
vhost-user-net has historically automatically enabled the first queue pair.

Slaves should always implement the ``VHOST_USER_PROTOCOL_F_MQ`` protocol
feature, even for devices with a fixed number of virtqueues, since it is simple
to implement and offers a degree of introspection.

Masters must not rely on the ``VHOST_USER_PROTOCOL_F_MQ`` protocol feature for
devices with a fixed number of virtqueues. Only true multiqueue devices
require this protocol feature.

Migration
---------

During live migration, the master may need to track the modifications
the slave makes to the memory mapped regions. The client should mark
the dirty pages in a log. Once it complies with this logging, it may
declare the ``VHOST_F_LOG_ALL`` vhost feature.

To start/stop logging of data/used ring writes, server may send
messages ``VHOST_USER_SET_FEATURES`` with ``VHOST_F_LOG_ALL`` and
``VHOST_USER_SET_VRING_ADDR`` with ``VHOST_VRING_F_LOG`` in ring's
flags set to 1/0, respectively.

All the modifications to memory pointed by vring "descriptor" should
be marked. Modifications to "used" vring should be marked if
``VHOST_VRING_F_LOG`` is part of ring's flags.

Dirty pages are of size::

  #define VHOST_LOG_PAGE 0x1000

The log memory fd is provided in the ancillary data of
``VHOST_USER_SET_LOG_BASE`` message when the slave has
``VHOST_USER_PROTOCOL_F_LOG_SHMFD`` protocol feature.

The size of the log is supplied as part of ``VhostUserMsg`` which
should be large enough to cover all known guest addresses. Log starts
at the supplied offset in the supplied file descriptor. The log
covers from address 0 to the maximum of guest regions.
In pseudo-code,
to mark page at ``addr`` as dirty::

  page = addr / VHOST_LOG_PAGE
  log[page / 8] |= 1 << page % 8

Where ``addr`` is the guest physical address.

Use atomic operations, as the log may be concurrently manipulated.

Note that when logging modifications to the used ring (when
``VHOST_VRING_F_LOG`` is set for this ring), ``log_guest_addr`` should
be used to calculate the log offset: the write to first byte of the
used ring is logged at this offset from log start. Also note that this
value might be outside the legal guest physical address range
(i.e. does not have to be covered by the ``VhostUserMemory`` table), but
the bit offset of the last byte of the ring must fall within the size
supplied by ``VhostUserLog``.

``VHOST_USER_SET_LOG_FD`` is an optional message with an eventfd in
ancillary data; it may be used to inform the master that the log has
been modified.

Once the source has finished migration, rings will be stopped by the
source. No further update must be done before rings are restarted.

In postcopy migration the slave is started before all the memory has
been received from the source host, and care must be taken to avoid
accessing pages that have yet to be received. The slave opens a
'userfault'-fd and registers the memory with it; this fd is then
passed back over to the master. The master services requests on the
userfaultfd for pages that are accessed and when the page is available
it performs WAKE ioctls on the userfaultfd to wake the stalled
slave. The client indicates support for this via the
``VHOST_USER_PROTOCOL_F_PAGEFAULT`` feature.

Memory access
-------------

The master sends a list of vhost memory regions to the slave using the
``VHOST_USER_SET_MEM_TABLE`` message. Each region has two base
addresses: a guest address and a user address.
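
As an illustration of how the two base addresses of a region can be used, the
following is a minimal sketch of a lookup a slave might perform. It is not
taken from QEMU: ``ExampleRegion``, ``mmap_base`` and ``example_translate``
are hypothetical names, and the sketch assumes the slave has mapped each
region's file descriptor from file offset 0, so that the region's data starts
``mmap offset`` bytes into that mapping.

.. code:: c

  #include <assert.h>
  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical slave-side view of one memory region. The first four
   * fields mirror the region layout described in this document;
   * mmap_base is an assumption: where the slave mapped the fd locally. */
  typedef struct ExampleRegion {
      uint64_t guest_addr;
      uint64_t size;
      uint64_t user_addr;
      uint64_t mmap_offset;
      uint8_t *mmap_base;
  } ExampleRegion;

  /* Return a local pointer for addr, or NULL if no region contains it.
   * by_guest selects whether addr is matched against the guest address
   * or the user address of each region. */
  static void *example_translate(const ExampleRegion *regions, size_t n,
                                 uint64_t addr, bool by_guest)
  {
      assert(regions != NULL || n == 0);

      for (size_t i = 0; i < n; i++) {
          const ExampleRegion *r = &regions[i];
          uint64_t base = by_guest ? r->guest_addr : r->user_addr;

          /* addr - base < size avoids overflow in base + size */
          if (addr >= base && addr - base < r->size) {
              return r->mmap_base + r->mmap_offset + (addr - base);
          }
      }
      return NULL;
  }

A real implementation would also need to check that the whole access, not
just its first byte, fits inside the region.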

Messages contain guest addresses and/or user addresses to reference locations
within the shared memory. The mapping of these addresses works as follows.

User addresses map to the vhost memory region containing that user address.

When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has not been negotiated:

* Guest addresses map to the vhost memory region containing that guest
  address.

When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has been negotiated:

* Guest addresses are also called I/O virtual addresses (IOVAs). They are
  translated to user addresses via the IOTLB.

* The vhost memory region guest address is not used.

IOMMU support
-------------

When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has been negotiated, the
master sends IOTLB entries update & invalidation by sending
``VHOST_USER_IOTLB_MSG`` requests to the slave with a ``struct
vhost_iotlb_msg`` as payload. For update events, the ``iotlb`` payload
has to be filled with the update message type (2), the I/O virtual
address, the size, the user virtual address, and the permissions
flags. Addresses and size must be within vhost memory regions set via
the ``VHOST_USER_SET_MEM_TABLE`` request. For invalidation events, the
``iotlb`` payload has to be filled with the invalidation message type
(3), the I/O virtual address and the size. On success, the slave is
expected to reply with a zero payload, non-zero otherwise.

The slave relies on the slave communication channel (see :ref:`Slave
communication <slave_communication>` section below) to send IOTLB miss
and access failure events, by sending ``VHOST_USER_SLAVE_IOTLB_MSG``
requests to the master with a ``struct vhost_iotlb_msg`` as
payload. For miss events, the iotlb payload has to be filled with the
miss message type (1), the I/O virtual address and the permissions
flags.
For access failure events, the iotlb payload has to be filled
with the access failure message type (4), the I/O virtual address and
the permissions flags. For synchronization purposes, the slave may
rely on the reply-ack feature, so the master may send a reply when
operation is completed if the reply-ack feature is negotiated and
the slave requests a reply. For miss events, completed operation means
either master sent an update message containing the IOTLB entry
containing requested address and permission, or master sent nothing if
the IOTLB miss message is invalid (invalid IOVA or permission).

The master isn't expected to take the initiative to send IOTLB update
messages, as the slave sends IOTLB miss messages for the guest virtual
memory areas it needs to access.

.. _slave_communication:

Slave communication
-------------------

An optional communication channel is provided if the slave declares
``VHOST_USER_PROTOCOL_F_SLAVE_REQ`` protocol feature, to allow the
slave to make requests to the master.

The fd is provided via ``VHOST_USER_SET_SLAVE_REQ_FD`` ancillary data.

A slave may then send ``VHOST_USER_SLAVE_*`` messages to the master
using this fd communication channel.

If ``VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD`` protocol feature is
negotiated, slave can send file descriptors (at most 8 descriptors in
each message) to master via ancillary data using this fd communication
channel.

Inflight I/O tracking
---------------------

To support reconnecting after restart or crash, slave may need to
resubmit inflight I/Os. If virtqueue is processed in order, we can
easily achieve that by getting the inflight descriptors from
descriptor table (split virtqueue) or descriptor ring (packed
virtqueue).
However, it can't work when we process descriptors
out-of-order because some entries which store the information of
inflight descriptors in available ring (split virtqueue) or descriptor
ring (packed virtqueue) might be overwritten by new entries. To solve
this problem, slave needs to allocate an extra buffer to store this
information of inflight descriptors and share it with master for
persistence. ``VHOST_USER_GET_INFLIGHT_FD`` and
``VHOST_USER_SET_INFLIGHT_FD`` are used to transfer this buffer
between master and slave. And the format of this buffer is described
below:

+---------------+---------------+-----+---------------+
| queue0 region | queue1 region | ... | queueN region |
+---------------+---------------+-----+---------------+

N is the number of available virtqueues. Slave could get it from num
queues field of ``VhostUserInflight``.

For split virtqueue, queue region can be implemented as:

.. code:: c

  typedef struct DescStateSplit {
      /* Indicate whether this descriptor is inflight or not.
       * Only available for head-descriptor. */
      uint8_t inflight;

      /* Padding */
      uint8_t padding[5];

      /* Maintain a list for the last batch of used descriptors.
       * Only available when batching is used for submitting */
      uint16_t next;

      /* Used to preserve the order of fetching available descriptors.
       * Only available for head-descriptor. */
      uint64_t counter;
  } DescStateSplit;

  typedef struct QueueRegionSplit {
      /* The feature flags of this region. Now it's initialized to 0. */
      uint64_t features;

      /* The version of this region. It's 1 currently.
       * Zero value indicates an uninitialized buffer */
      uint16_t version;

      /* The size of DescStateSplit array. It's equal to the virtqueue
       * size. Slave could get it from queue size field of VhostUserInflight.
       */
      uint16_t desc_num;

      /* The head of list that track the last batch of used descriptors. */
      uint16_t last_batch_head;

      /* Store the idx value of used ring */
      uint16_t used_idx;

      /* Used to track the state of each descriptor in descriptor table */
      DescStateSplit desc[0];
  } QueueRegionSplit;

To track inflight I/O, the queue region should be processed as follows:

When receiving available buffers from the driver:

#. Get the next available head-descriptor index from available ring, ``i``

#. Set ``desc[i].counter`` to the value of global counter

#. Increase global counter by 1

#. Set ``desc[i].inflight`` to 1

When supplying used buffers to the driver:

1. Get corresponding used head-descriptor index, ``i``

2. Set ``desc[i].next`` to ``last_batch_head``

3. Set ``last_batch_head`` to ``i``

#. Steps 1,2,3 may be performed repeatedly if batching is possible

#. Increase the ``idx`` value of used ring by the size of the batch

#. Set the ``inflight`` field of each ``DescStateSplit`` entry in the batch to 0

#. Set ``used_idx`` to the ``idx`` value of used ring

When reconnecting:

#. If the value of ``used_idx`` does not match the ``idx`` value of
   used ring (means the inflight field of ``DescStateSplit`` entries in
   last batch may be incorrect),

   a. Subtract the value of ``used_idx`` from the ``idx`` value of
      used ring to get last batch size of ``DescStateSplit`` entries

   #. Set the ``inflight`` field of each ``DescStateSplit`` entry to 0 in last batch
      list which starts from ``last_batch_head``

   #. Set ``used_idx`` to the ``idx`` value of used ring

#. Resubmit inflight ``DescStateSplit`` entries in order of their
   counter value

For packed virtqueue, queue region can be implemented as:

.. code:: c

  typedef struct DescStatePacked {
      /* Indicate whether this descriptor is inflight or not.
       * Only available for head-descriptor. */
      uint8_t inflight;

      /* Padding */
      uint8_t padding;

      /* Link to the next free entry */
      uint16_t next;

      /* Link to the last entry of descriptor list.
       * Only available for head-descriptor. */
      uint16_t last;

      /* The length of descriptor list.
       * Only available for head-descriptor. */
      uint16_t num;

      /* Used to preserve the order of fetching available descriptors.
       * Only available for head-descriptor. */
      uint64_t counter;

      /* The buffer id */
      uint16_t id;

      /* The descriptor flags */
      uint16_t flags;

      /* The buffer length */
      uint32_t len;

      /* The buffer address */
      uint64_t addr;
  } DescStatePacked;

  typedef struct QueueRegionPacked {
      /* The feature flags of this region. Now it's initialized to 0. */
      uint64_t features;

      /* The version of this region. It's 1 currently.
       * Zero value indicates an uninitialized buffer */
      uint16_t version;

      /* The size of DescStatePacked array. It's equal to the virtqueue
       * size. Slave could get it from queue size field of VhostUserInflight.
       */
      uint16_t desc_num;

      /* The head of free DescStatePacked entry list */
      uint16_t free_head;

      /* The old head of free DescStatePacked entry list */
      uint16_t old_free_head;

      /* The used index of descriptor ring */
      uint16_t used_idx;

      /* The old used index of descriptor ring */
      uint16_t old_used_idx;

      /* Device ring wrap counter */
      uint8_t used_wrap_counter;

      /* The old device ring wrap counter */
      uint8_t old_used_wrap_counter;

      /* Padding */
      uint8_t padding[7];

      /* Used to track the state of each descriptor fetched from descriptor ring */
      DescStatePacked desc[0];
  } QueueRegionPacked;

To track inflight I/O, the queue region should be processed as follows:

When receiving available buffers from the driver:

#. Get the next available descriptor entry from descriptor ring, ``d``

#. If ``d`` is head descriptor,

   a. Set ``desc[old_free_head].num`` to 0

   #. Set ``desc[old_free_head].counter`` to the value of global counter

   #. Increase global counter by 1

   #. Set ``desc[old_free_head].inflight`` to 1

#. If ``d`` is last descriptor, set ``desc[old_free_head].last`` to
   ``free_head``

#. Increase ``desc[old_free_head].num`` by 1

#. Set ``desc[free_head].addr``, ``desc[free_head].len``,
   ``desc[free_head].flags``, ``desc[free_head].id`` to ``d.addr``,
   ``d.len``, ``d.flags``, ``d.id``

#. Set ``free_head`` to ``desc[free_head].next``

#. If ``d`` is last descriptor, set ``old_free_head`` to ``free_head``

When supplying used buffers to the driver:

1. Get corresponding used head-descriptor entry from descriptor ring,
   ``d``

2. Get corresponding ``DescStatePacked`` entry, ``e``

3. Set ``desc[e.last].next`` to ``free_head``

4. Set ``free_head`` to the index of ``e``

#. Steps 1,2,3,4 may be performed repeatedly if batching is possible

#. Increase ``used_idx`` by the size of the batch and update
   ``used_wrap_counter`` if needed

#. Update ``d.flags``

#. Set the ``inflight`` field of each head ``DescStatePacked`` entry
   in the batch to 0

#. Set ``old_free_head``, ``old_used_idx``, ``old_used_wrap_counter``
   to ``free_head``, ``used_idx``, ``used_wrap_counter``

When reconnecting:

#. If ``used_idx`` does not match ``old_used_idx`` (means the
   ``inflight`` field of ``DescStatePacked`` entries in last batch may
   be incorrect),

   a. Get the next descriptor ring entry through ``old_used_idx``, ``d``

   #. Use ``old_used_wrap_counter`` to calculate the available flags

   #. If ``d.flags`` is not equal to the calculated flags value (means
      slave has submitted the buffer to guest driver before crash, so
      it has to commit the in-progress update), set ``old_free_head``,
      ``old_used_idx``, ``old_used_wrap_counter`` to ``free_head``,
      ``used_idx``, ``used_wrap_counter``

#. Set ``free_head``, ``used_idx``, ``used_wrap_counter`` to
   ``old_free_head``, ``old_used_idx``, ``old_used_wrap_counter``
   (roll back any in-progress update)

#. Set the ``inflight`` field of each ``DescStatePacked`` entry in
   free list to 0

#. Resubmit inflight ``DescStatePacked`` entries in order of their
   counter value

Protocol features
-----------------

.. code:: c

  #define VHOST_USER_PROTOCOL_F_MQ             0
  #define VHOST_USER_PROTOCOL_F_LOG_SHMFD      1
  #define VHOST_USER_PROTOCOL_F_RARP           2
  #define VHOST_USER_PROTOCOL_F_REPLY_ACK      3
  #define VHOST_USER_PROTOCOL_F_MTU            4
  #define VHOST_USER_PROTOCOL_F_SLAVE_REQ      5
  #define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN   6
  #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
  #define VHOST_USER_PROTOCOL_F_PAGEFAULT      8
  #define VHOST_USER_PROTOCOL_F_CONFIG         9
  #define VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD  10
  #define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER  11
  #define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12

Master message types
--------------------

``VHOST_USER_GET_FEATURES``
  :id: 1
  :equivalent ioctl: ``VHOST_GET_FEATURES``
  :master payload: N/A
  :slave payload: ``u64``

  Get from the underlying vhost implementation the features bitmask.
  Feature bit ``VHOST_USER_F_PROTOCOL_FEATURES`` signals slave support
  for ``VHOST_USER_GET_PROTOCOL_FEATURES`` and
  ``VHOST_USER_SET_PROTOCOL_FEATURES``.

``VHOST_USER_SET_FEATURES``
  :id: 2
  :equivalent ioctl: ``VHOST_SET_FEATURES``
  :master payload: ``u64``

  Enable features in the underlying vhost implementation using a
  bitmask. Feature bit ``VHOST_USER_F_PROTOCOL_FEATURES`` signals
  slave support for ``VHOST_USER_GET_PROTOCOL_FEATURES`` and
  ``VHOST_USER_SET_PROTOCOL_FEATURES``.

``VHOST_USER_GET_PROTOCOL_FEATURES``
  :id: 15
  :equivalent ioctl: ``VHOST_GET_FEATURES``
  :master payload: N/A
  :slave payload: ``u64``

  Get the protocol feature bitmask from the underlying vhost
  implementation.  Only legal if feature bit
  ``VHOST_USER_F_PROTOCOL_FEATURES`` is present in
  ``VHOST_USER_GET_FEATURES``.

.. Note::
   Slave that reported ``VHOST_USER_F_PROTOCOL_FEATURES`` must
   support this message even before ``VHOST_USER_SET_FEATURES`` was
   called.
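
The gating described above can be sketched as follows. This is only an
illustration, not QEMU's implementation: the helper names are hypothetical,
and the sketch assumes the master keeps its own mask of protocol features it
understands and enables the intersection with what the slave reported.

.. code:: c

  #include <stdbool.h>
  #include <stdint.h>

  #define VHOST_USER_F_PROTOCOL_FEATURES 30

  /* Hypothetical helper: test one bit of a 64-bit feature mask
   * (bit must be below 64). */
  static bool example_has_bit(uint64_t mask, unsigned bit)
  {
      return (mask >> bit) & 1ULL;
  }

  /* Hypothetical helper: decide which protocol features to enable.
   * features:     reply to VHOST_USER_GET_FEATURES
   * slave_proto:  reply to VHOST_USER_GET_PROTOCOL_FEATURES
   * master_proto: protocol features the master itself understands
   * Returns 0 when the slave does not offer protocol features at all. */
  static uint64_t example_pick_protocol_features(uint64_t features,
                                                 uint64_t slave_proto,
                                                 uint64_t master_proto)
  {
      if (!example_has_bit(features, VHOST_USER_F_PROTOCOL_FEATURES)) {
          return 0;
      }
      return slave_proto & master_proto;
  }

In such a scheme, ``VHOST_USER_GET_PROTOCOL_FEATURES`` would only be sent
after the first check has passed.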

``VHOST_USER_SET_PROTOCOL_FEATURES``
  :id: 16
  :equivalent ioctl: ``VHOST_SET_FEATURES``
  :master payload: ``u64``

  Enable protocol features in the underlying vhost implementation.

  Only legal if feature bit ``VHOST_USER_F_PROTOCOL_FEATURES`` is present in
  ``VHOST_USER_GET_FEATURES``.

.. Note::
   Slave that reported ``VHOST_USER_F_PROTOCOL_FEATURES`` must support
   this message even before ``VHOST_USER_SET_FEATURES`` was called.

``VHOST_USER_SET_OWNER``
  :id: 3
  :equivalent ioctl: ``VHOST_SET_OWNER``
  :master payload: N/A

  Issued when a new connection is established. It sets the current
  *master* as an owner of the session. This can be used on the *slave*
  as a "session start" flag.

``VHOST_USER_RESET_OWNER``
  :id: 4
  :master payload: N/A

.. admonition:: Deprecated

   This is no longer used. Used to be sent to request disabling all
   rings, but some clients interpreted it to also discard connection
   state (this interpretation would lead to bugs).  It is recommended
   that clients either ignore this message, or use it to disable all
   rings.

``VHOST_USER_SET_MEM_TABLE``
  :id: 5
  :equivalent ioctl: ``VHOST_SET_MEM_TABLE``
  :master payload: memory regions description
  :slave payload: (postcopy only) memory regions description

  Sets the memory map regions on the slave so it can translate the
  vring addresses. In the ancillary data there is an array of file
  descriptors for each memory mapped region. The size and ordering of
  the fds matches the number and ordering of memory regions.

  When ``VHOST_USER_POSTCOPY_LISTEN`` has been received,
  ``SET_MEM_TABLE`` replies with the bases of the memory mapped
  regions to the master.  The slave must have mmap'd the regions but
  not yet accessed them and should not yet generate a userfault
  event.

.. Note::
   ``NEED_REPLY_MASK`` is not set in this case.
   QEMU will then reply back to the list of mappings with an empty
   ``VHOST_USER_SET_MEM_TABLE`` as an acknowledgement; only upon
   reception of this message may the guest start accessing the memory
   and generating faults.

``VHOST_USER_SET_LOG_BASE``
  :id: 6
  :equivalent ioctl: ``VHOST_SET_LOG_BASE``
  :master payload: u64
  :slave payload: N/A

  Sets logging shared memory space.

  When slave has ``VHOST_USER_PROTOCOL_F_LOG_SHMFD`` protocol feature,
  the log memory fd is provided in the ancillary data of
  ``VHOST_USER_SET_LOG_BASE`` message, the size and offset of shared
  memory area provided in the message.

``VHOST_USER_SET_LOG_FD``
  :id: 7
  :equivalent ioctl: ``VHOST_SET_LOG_FD``
  :master payload: N/A

  Sets the logging file descriptor, which is passed as ancillary data.

``VHOST_USER_SET_VRING_NUM``
  :id: 8
  :equivalent ioctl: ``VHOST_SET_VRING_NUM``
  :master payload: vring state description

  Set the size of the queue.

``VHOST_USER_SET_VRING_ADDR``
  :id: 9
  :equivalent ioctl: ``VHOST_SET_VRING_ADDR``
  :master payload: vring address description
  :slave payload: N/A

  Sets the addresses of the different aspects of the vring.

``VHOST_USER_SET_VRING_BASE``
  :id: 10
  :equivalent ioctl: ``VHOST_SET_VRING_BASE``
  :master payload: vring state description

  Sets the base offset in the available vring.

``VHOST_USER_GET_VRING_BASE``
  :id: 11
  :equivalent ioctl: ``VHOST_GET_VRING_BASE``
  :master payload: vring state description
  :slave payload: vring state description

  Get the available vring base offset.

``VHOST_USER_SET_VRING_KICK``
  :id: 12
  :equivalent ioctl: ``VHOST_SET_VRING_KICK``
  :master payload: ``u64``

  Set the event file descriptor for adding buffers to the vring. It is
  passed in the ancillary data.

  Bits (0-7) of the payload contain the vring index.
  Bit 8 is the invalid FD flag. This flag is set when there is no
  file descriptor in the ancillary data. This signals that polling
  should be used instead of waiting for a kick.

``VHOST_USER_SET_VRING_CALL``
  :id: 13
  :equivalent ioctl: ``VHOST_SET_VRING_CALL``
  :master payload: ``u64``

  Set the event file descriptor to signal when buffers are used. It is
  passed in the ancillary data.

  Bits (0-7) of the payload contain the vring index. Bit 8 is the
  invalid FD flag. This flag is set when there is no file descriptor
  in the ancillary data. This signals that polling will be used
  instead of waiting for the call.

``VHOST_USER_SET_VRING_ERR``
  :id: 14
  :equivalent ioctl: ``VHOST_SET_VRING_ERR``
  :master payload: ``u64``

  Set the event file descriptor to signal when an error occurs. It is
  passed in the ancillary data.

  Bits (0-7) of the payload contain the vring index. Bit 8 is the
  invalid FD flag. This flag is set when there is no file descriptor
  in the ancillary data.

``VHOST_USER_GET_QUEUE_NUM``
  :id: 17
  :equivalent ioctl: N/A
  :master payload: N/A
  :slave payload: ``u64``

  Query how many queues the backend supports.

  This request should be sent only when ``VHOST_USER_PROTOCOL_F_MQ``
  is set in queried protocol features by
  ``VHOST_USER_GET_PROTOCOL_FEATURES``.

``VHOST_USER_SET_VRING_ENABLE``
  :id: 18
  :equivalent ioctl: N/A
  :master payload: vring state description

  Signal the slave to enable or disable the corresponding vring.

  This request should be sent only when
  ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated.

``VHOST_USER_SEND_RARP``
  :id: 19
  :equivalent ioctl: N/A
  :master payload: ``u64``

  Ask the vhost-user backend to broadcast a fake RARP to notify that
  the migration has terminated, for guests that do not support
  GUEST_ANNOUNCE.

  Only legal if feature bit ``VHOST_USER_F_PROTOCOL_FEATURES`` is
  present in ``VHOST_USER_GET_FEATURES`` and protocol feature bit
  ``VHOST_USER_PROTOCOL_F_RARP`` is present in
  ``VHOST_USER_GET_PROTOCOL_FEATURES``. The first 6 bytes of the
  payload contain the MAC address of the guest to allow the vhost-user
  backend to construct and broadcast the fake RARP.

``VHOST_USER_NET_SET_MTU``
  :id: 20
  :equivalent ioctl: N/A
  :master payload: ``u64``

  Set host MTU value exposed to the guest.

  This request should be sent only when ``VIRTIO_NET_F_MTU`` feature
  has been successfully negotiated, ``VHOST_USER_F_PROTOCOL_FEATURES``
  is present in ``VHOST_USER_GET_FEATURES`` and protocol feature bit
  ``VHOST_USER_PROTOCOL_F_NET_MTU`` is present in
  ``VHOST_USER_GET_PROTOCOL_FEATURES``.

  If ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` is negotiated, slave must
  respond with zero in case the specified MTU is valid, or non-zero
  otherwise.

``VHOST_USER_SET_SLAVE_REQ_FD``
  :id: 21
  :equivalent ioctl: N/A
  :master payload: N/A

  Set the socket file descriptor for slave initiated requests. It is passed
  in the ancillary data.

  This request should be sent only when
  ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, and protocol
  feature bit ``VHOST_USER_PROTOCOL_F_SLAVE_REQ`` is present in
  ``VHOST_USER_GET_PROTOCOL_FEATURES``. If
  ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` is negotiated, slave must
  respond with zero for success, non-zero otherwise.

``VHOST_USER_IOTLB_MSG``
  :id: 22
  :equivalent ioctl: N/A (equivalent to ``VHOST_IOTLB_MSG`` message type)
  :master payload: ``struct vhost_iotlb_msg``
  :slave payload: ``u64``

  Send IOTLB messages with ``struct vhost_iotlb_msg`` as payload.

  Master sends such requests to update and invalidate entries in the
  device IOTLB.
  The slave has to acknowledge the request by sending zero as
  ``u64`` payload for success, non-zero otherwise.

  This request should be sent only when ``VIRTIO_F_IOMMU_PLATFORM``
  feature has been successfully negotiated.

``VHOST_USER_SET_VRING_ENDIAN``
  :id: 23
  :equivalent ioctl: ``VHOST_SET_VRING_ENDIAN``
  :master payload: vring state description

  Set the endianness of a VQ for legacy devices. Little-endian is
  indicated with state.num set to 0 and big-endian is indicated with
  state.num set to 1. Other values are invalid.

  This request should be sent only when
  ``VHOST_USER_PROTOCOL_F_CROSS_ENDIAN`` has been negotiated.
  Backends that negotiated this feature should handle both
  endiannesses and expect this message once (per VQ) during device
  configuration (i.e. before the master starts the VQ).

``VHOST_USER_GET_CONFIG``
  :id: 24
  :equivalent ioctl: N/A
  :master payload: virtio device config space
  :slave payload: virtio device config space

  When ``VHOST_USER_PROTOCOL_F_CONFIG`` is negotiated, this message is
  submitted by the vhost-user master to fetch the contents of the
  virtio device configuration space. The vhost-user slave's payload
  size MUST match the master's request; the slave uses a zero-length
  payload to indicate an error to the master. The vhost-user
  master may cache the contents to avoid repeated
  ``VHOST_USER_GET_CONFIG`` calls.

``VHOST_USER_SET_CONFIG``
  :id: 25
  :equivalent ioctl: N/A
  :master payload: virtio device config space
  :slave payload: N/A

  When ``VHOST_USER_PROTOCOL_F_CONFIG`` is negotiated, this message is
  submitted by the vhost-user master when the Guest changes the virtio
  device configuration space and also can be used for live migration
  on the destination host.
  The vhost-user slave must check the flags
  field, and slaves MUST NOT accept SET_CONFIG for read-only
  configuration space fields unless the live migration bit is set.

``VHOST_USER_CREATE_CRYPTO_SESSION``
  :id: 26
  :equivalent ioctl: N/A
  :master payload: crypto session description
  :slave payload: crypto session description

  Create a session for crypto operation. The server side must return
  the session id, 0 or positive for success, negative for failure.
  This request should be sent only when
  ``VHOST_USER_PROTOCOL_F_CRYPTO_SESSION`` feature has been
  successfully negotiated. It's a required feature for crypto
  devices.

``VHOST_USER_CLOSE_CRYPTO_SESSION``
  :id: 27
  :equivalent ioctl: N/A
  :master payload: ``u64``

  Close a session for crypto operation which was previously
  created by ``VHOST_USER_CREATE_CRYPTO_SESSION``.

  This request should be sent only when
  ``VHOST_USER_PROTOCOL_F_CRYPTO_SESSION`` feature has been
  successfully negotiated. It's a required feature for crypto
  devices.

``VHOST_USER_POSTCOPY_ADVISE``
  :id: 28
  :master payload: N/A
  :slave payload: userfault fd

  When ``VHOST_USER_PROTOCOL_F_PAGEFAULT`` is supported, the master
  advises slave that a migration with postcopy enabled is underway,
  the slave must open a userfaultfd for later use. Note that at this
  stage the migration is still in precopy mode.

``VHOST_USER_POSTCOPY_LISTEN``
  :id: 29
  :master payload: N/A

  Master advises slave that a transition to postcopy mode has
  happened. The slave must ensure that shared memory is registered
  with userfaultfd to cause faulting of non-present pages.

  This is always sent sometime after a ``VHOST_USER_POSTCOPY_ADVISE``,
  and thus only when ``VHOST_USER_PROTOCOL_F_PAGEFAULT`` is supported.

``VHOST_USER_POSTCOPY_END``
  :id: 30
  :slave payload: ``u64``

  Master advises that postcopy migration has now completed. The slave
  must disable the userfaultfd. The response is an acknowledgement
  only.

  When ``VHOST_USER_PROTOCOL_F_PAGEFAULT`` is supported, this message
  is sent at the end of the migration, after
  ``VHOST_USER_POSTCOPY_LISTEN`` was previously sent.

  The value returned is an error indication; 0 is success.

``VHOST_USER_GET_INFLIGHT_FD``
  :id: 31
  :equivalent ioctl: N/A
  :master payload: inflight description

  When ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD`` protocol feature has
  been successfully negotiated, this message is submitted by master to
  get a shared buffer from slave. The shared buffer will be used to
  track inflight I/O by slave. QEMU should retrieve a new one when the
  VM is reset.

``VHOST_USER_SET_INFLIGHT_FD``
  :id: 32
  :equivalent ioctl: N/A
  :master payload: inflight description

  When ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD`` protocol feature has
  been successfully negotiated, this message is submitted by master to
  send the shared inflight buffer back to slave so that slave could
  get inflight I/O after a crash or restart.

``VHOST_USER_GPU_SET_SOCKET``
  :id: 33
  :equivalent ioctl: N/A
  :master payload: N/A

  Sets the GPU protocol socket file descriptor, which is passed as
  ancillary data. The GPU protocol is used to inform the master of
  rendering state and updates. See vhost-user-gpu.rst for details.

Slave message types
-------------------

``VHOST_USER_SLAVE_IOTLB_MSG``
  :id: 1
  :equivalent ioctl: N/A (equivalent to ``VHOST_IOTLB_MSG`` message type)
  :slave payload: ``struct vhost_iotlb_msg``
  :master payload: N/A

  Send IOTLB messages with ``struct vhost_iotlb_msg`` as payload.
  Slave sends such requests to notify of an IOTLB miss, or an IOTLB
  access failure. If ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` is
  negotiated, and slave set the ``VHOST_USER_NEED_REPLY`` flag, master
  must respond with zero when operation is successfully completed, or
  non-zero otherwise. This request should be sent only when
  ``VIRTIO_F_IOMMU_PLATFORM`` feature has been successfully
  negotiated.

``VHOST_USER_SLAVE_CONFIG_CHANGE_MSG``
  :id: 2
  :equivalent ioctl: N/A
  :slave payload: N/A
  :master payload: N/A

  When ``VHOST_USER_PROTOCOL_F_CONFIG`` is negotiated, vhost-user
  slave sends such messages to notify that the virtio device's
  configuration space has changed. For host devices that support
  this feature, the host driver can send a ``VHOST_USER_GET_CONFIG``
  message to the slave to get the latest content. If
  ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` is negotiated, and slave set the
  ``VHOST_USER_NEED_REPLY`` flag, master must respond with zero when
  operation is successfully completed, or non-zero otherwise.

``VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG``
  :id: 3
  :equivalent ioctl: N/A
  :slave payload: vring area description
  :master payload: N/A

  Sets host notifier for a specified queue. The queue index is
  contained in the ``u64`` field of the vring area description. The
  host notifier is described by the file descriptor (typically it's a
  VFIO device fd) which is passed as ancillary data and the size
  (which is mmap size and should be the same as host page size) and
  offset (which is mmap offset) carried in the vring area
  description. QEMU can mmap the file descriptor based on the size and
  offset to get a memory range. Registering a host notifier means
  mapping this memory range to the VM as the specified queue's notify
  MMIO region.
  Slave sends this request to tell QEMU to de-register
  the existing notifier if any and register the new notifier if the
  request is sent with a file descriptor.

  This request should be sent only when
  ``VHOST_USER_PROTOCOL_F_HOST_NOTIFIER`` protocol feature has been
  successfully negotiated.

.. _reply_ack:

VHOST_USER_PROTOCOL_F_REPLY_ACK
-------------------------------

The original vhost-user specification only demands replies for certain
commands. This differs from the vhost protocol implementation where
commands are sent over an ``ioctl()`` call and block until the client
has completed.

With this protocol extension negotiated, the sender (QEMU) can set the
``need_reply`` [Bit 3] flag to any command. This indicates that the
client MUST respond with a Payload ``VhostUserMsg`` indicating success
or failure. The payload should be set to zero on success or non-zero
on failure, unless the message already has an explicit reply body.

The response payload gives QEMU a deterministic indication of the result
of the command. Today, QEMU is expected to terminate the main vhost-user
loop upon receiving such errors. In future, QEMU could be taught to be more
resilient for selective requests.

For the message types that already solicit a reply from the client,
the presence of ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` or need_reply bit
being set brings no behavioural change. (See the Communication_
section for details.)

.. _backend_conventions:

Backend program conventions
===========================

vhost-user backends can provide various devices & services and may
need to be configured manually depending on the use case. However, it
is a good idea to follow the conventions listed here when
possible.
Users, QEMU or libvirt, can then rely on some common
behaviour to avoid heterogeneous configuration and management of the
backend programs and facilitate interoperability.

Each backend installed on a host system should come with at least one
JSON file that conforms to the vhost-user.json schema. Each file
informs the management applications about the backend type, and binary
location. In addition, it defines rules for management apps for
picking the highest priority backend when multiple match the search
criteria (see ``@VhostUserBackend`` documentation in the schema file).

If the backend is not capable of enabling a requested feature on the
host (such as 3D acceleration with virgl), or the initialization
failed, the backend should fail to start early and exit with a status
!= 0. It may also print a message to stderr for further details.

The backend program must not daemonize itself, but it may be
daemonized by the management layer. It may also have restricted
access to the system.

File descriptors 0, 1 and 2 will exist, and have regular
stdin/stdout/stderr usage (they may have been redirected to /dev/null
by the management layer, or to a log handler).

The backend program must end (as quickly and cleanly as possible) when
the SIGTERM signal is received. Eventually, it may receive SIGKILL by
the management layer after a few seconds.

The following command line options have an expected behaviour. They
are mandatory, unless explicitly said differently:

--socket-path=PATH

  This option specifies the location of the vhost-user Unix domain socket.
  It is incompatible with --fd.

--fd=FDNUM

  When this argument is given, the backend program is started with the
  vhost-user socket as file descriptor FDNUM. It is incompatible with
  --socket-path.

--print-capabilities

  Output to stdout the backend capabilities in JSON format, and then
  exit successfully. Other options and arguments should be ignored, and
  the backend program should not perform its normal function. The
  capabilities can be reported dynamically depending on the host
  capabilities.

The JSON output is described in the ``vhost-user.json`` schema, by
``@VHostUserBackendCapabilities``. Example:

.. code:: json

  {
    "type": "foo",
    "features": [
      "feature-a",
      "feature-b"
    ]
  }

vhost-user-input
----------------

Command line options:

--evdev-path=PATH

  Specify the Linux input device.

  (optional)

--no-grab

  Do not request exclusive access to the input device.

  (optional)

vhost-user-gpu
--------------

Command line options:

--render-node=PATH

  Specify the GPU DRM render node.

  (optional)

--virgl

  Enable virgl rendering support.

  (optional)

vhost-user-blk
--------------

Command line options:

--blk-file=PATH

  Specify block device or file path.

  (optional)

--read-only

  Enable read-only.

  (optional)
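
As an illustration only, the option handling that the conventions above
expect from a block backend could be sketched as follows. This is not
taken from any existing backend: the function names ``capabilities``,
``parse_args`` and ``main``, the program name, and the reported
``type``/``features`` values are assumptions made for the example, and
real output must follow the ``vhost-user.json`` schema.

.. code:: python

  import argparse
  import json
  import sys


  def capabilities():
      # Hypothetical capability report; the values are illustrative and
      # must match the vhost-user.json schema in a real backend.
      return {"type": "block", "features": ["read-only", "blk-file"]}


  def parse_args(argv):
      parser = argparse.ArgumentParser(prog="vhost-user-blk-sketch")
      # --socket-path and --fd are incompatible with each other.
      where = parser.add_mutually_exclusive_group()
      where.add_argument("--socket-path", metavar="PATH")
      where.add_argument("--fd", type=int, metavar="FDNUM")
      parser.add_argument("--print-capabilities", action="store_true")
      parser.add_argument("--blk-file", metavar="PATH")
      parser.add_argument("--read-only", action="store_true")
      return parser.parse_args(argv)


  def main(argv):
      # With --print-capabilities, all other options are ignored and the
      # backend does not perform its normal function.
      if "--print-capabilities" in argv:
          print(json.dumps(capabilities(), indent=2))
          return 0
      args = parse_args(argv)
      if args.socket_path is None and args.fd is None:
          print("one of --socket-path or --fd is required", file=sys.stderr)
          return 1
      # A real backend would now serve the vhost-user protocol.
      return 0

A program entry point would call ``sys.exit(main(sys.argv[1:]))``, so
that a failed start is reported with a non-zero exit status.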