==============
NVMe Emulation
==============

QEMU provides NVMe emulation through the ``nvme``, ``nvme-ns`` and
``nvme-subsys`` devices.

See the following sections for specific information on:

  * `Adding NVMe Devices`_, `additional namespaces`_ and `NVM subsystems`_.
  * Configuration of `Optional Features`_ such as `Controller Memory Buffer`_,
    `Simple Copy`_, `Zoned Namespaces`_, `metadata`_ and `End-to-End Data
    Protection`_.

Adding NVMe Devices
===================

Controller Emulation
--------------------

The QEMU emulated NVMe controller implements version 1.4 of the NVM Express
specification. All mandatory features are implemented, with a couple of
exceptions and limitations:

  * Accounting numbers in the SMART/Health log page are reset when the device
    is power cycled.
  * Interrupt Coalescing is not supported and is disabled by default.

The simplest way to attach an NVMe controller on the QEMU PCI bus is to add the
following parameters:

.. code-block:: console

    -drive file=nvm.img,if=none,id=nvm
    -device nvme,serial=deadbeef,drive=nvm

There are a number of optional general parameters for the ``nvme`` device. Some
are mentioned here, but see ``-device nvme,help`` to list all possible
parameters.

``max_ioqpairs=UINT32`` (default: ``64``)
  Set the maximum number of allowed I/O queue pairs. This replaces the
  deprecated ``num_queues`` parameter.

``msix_qsize=UINT16`` (default: ``65``)
  The number of MSI-X vectors that the device should support.

``mdts=UINT8`` (default: ``7``)
  Set the Maximum Data Transfer Size of the device. The value is specified as
  a power of two (2^n) and is in units of the minimum memory page size
  (CAP.MPSMIN).

``use-intel-id`` (default: ``off``)
  Since QEMU 5.2, the device uses a QEMU allocated "Red Hat" PCI Device and
  Vendor ID. Set this to ``on`` to revert to the unallocated Intel ID
  previously used.
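
As an illustration, several of these parameters can be combined on a single
``nvme`` device. The values below are arbitrary choices for this sketch, not
recommendations. ``msix_qsize`` is set to one more than ``max_ioqpairs``,
mirroring the relationship between the defaults (``65`` and ``64``); treating
the extra vector as the one serving the admin queue is an assumption. With
``mdts=5`` the limit is 2^5 = 32 minimum-size memory pages, which would be
128 KiB if the minimum page size is 4 KiB (also an assumption).

.. code-block:: console

    -drive file=nvm.img,if=none,id=nvm
    -device nvme,serial=deadbeef,drive=nvm,max_ioqpairs=8,msix_qsize=9,mdts=5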

Additional Namespaces
---------------------

In the simplest possible invocation sketched above, the device only supports a
single namespace with the namespace identifier ``1``. To support multiple
namespaces and additional features, the ``nvme-ns`` device must be used.

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1
   -drive file=nvm-2.img,if=none,id=nvm-2
   -device nvme-ns,drive=nvm-2

The namespaces defined by the ``nvme-ns`` device will attach to the most
recently defined ``nvme-bus`` that is created by the ``nvme`` device. Namespace
identifiers are allocated automatically, starting from ``1``.

There are a number of parameters available:

``nsid`` (default: ``0``)
  Explicitly set the namespace identifier. With the default (``0``), the
  identifier is allocated automatically as described above.

``uuid`` (default: *autogenerated*)
  Set the UUID of the namespace. This will be reported as a "Namespace UUID"
  descriptor in the Namespace Identification Descriptor List.

``eui64``
  Set the EUI-64 of the namespace. This will be reported as an "IEEE Extended
  Unique Identifier" descriptor in the Namespace Identification Descriptor List.
  Since machine type 6.1, a non-zero default value is used if the parameter
  is not provided. For earlier machine types, the field defaults to 0.

``bus``
  If more than one ``nvme`` device is defined, this parameter may be used to
  attach the namespace to a specific ``nvme`` device (identified by an ``id``
  parameter on the controller device).
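
A minimal sketch of the ``bus`` and ``nsid`` parameters with two controllers
follows. The second controller's ``id`` and ``serial`` values are hypothetical
names chosen for this example. Each namespace is pinned to one controller by
passing that controller's ``id`` as ``bus``, so the attachment no longer
depends on the order in which the devices are defined.

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -device nvme,id=nvme-ctrl-1,serial=cafebabe
   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1,bus=nvme-ctrl-0,nsid=1
   -drive file=nvm-2.img,if=none,id=nvm-2
   -device nvme-ns,drive=nvm-2,bus=nvme-ctrl-1,nsid=1

Both namespaces can use NSID 1 here because each one is private to a different
controller.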

NVM Subsystems
--------------

Additional features become available if the controller device (``nvme``) is
linked to an NVM Subsystem device (``nvme-subsys``).

The NVM Subsystem emulation allows features such as shared namespaces and
multipath I/O.

.. code-block:: console

   -device nvme-subsys,id=nvme-subsys-0,nqn=subsys0
   -device nvme,serial=a,subsys=nvme-subsys-0
   -device nvme,serial=b,subsys=nvme-subsys-0

This will create an NVM subsystem with two controllers. Having controllers
linked to an ``nvme-subsys`` device allows additional ``nvme-ns`` parameters:

``shared`` (default: ``off``)
  Specifies that the namespace will be attached to all controllers in the
  subsystem. If set to ``off`` (the default), the namespace will remain a
  private namespace and may only be attached to a single controller at a time.

``detached`` (default: ``off``)
  If set to ``on``, the namespace will be available in the subsystem, but
  not attached to any controllers initially.

Thus, adding

.. code-block:: console

   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1,nsid=1,shared=on
   -drive file=nvm-2.img,if=none,id=nvm-2
   -device nvme-ns,drive=nvm-2,nsid=3,detached=on

will cause NSID 1 to be a shared namespace (due to ``shared=on``) that is
initially attached to both controllers. NSID 3 will be a private namespace
(i.e. only attachable to a single controller at a time) and will not be
attached to any controller initially (due to ``detached=on``).
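
A detached namespace such as NSID 3 can later be attached from inside the
guest. The following is only a sketch. It assumes a Linux guest with the
``nvme-cli`` tool installed, that the first controller appears as
``/dev/nvme0``, and that its controller identifier is ``0``. None of these is
specified above, so check the ``cntlid`` value reported by the first command
before attaching.

.. code-block:: console

   nvme id-ctrl /dev/nvme0 | grep cntlid
   nvme attach-ns /dev/nvme0 --namespace-id=3 --controllers=0
   nvme ns-rescan /dev/nvme0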

Optional Features
=================

Controller Memory Buffer
------------------------

``nvme`` device parameters related to the Controller Memory Buffer support:

``cmb_size_mb=UINT32`` (default: ``0``)
  This adds a Controller Memory Buffer of the given size (in MiB) at offset
  zero in BAR 2.

``legacy-cmb`` (default: ``off``)
  By default, the device uses the "v1.4 scheme" for the Controller Memory
  Buffer support (i.e., the CMB is initially disabled and must be explicitly
  enabled by the host). Set this to ``on`` to behave as a v1.3 device with
  respect to the CMB.
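
For example, a controller with a 64 MiB Controller Memory Buffer could be
configured as sketched below. The size is an arbitrary value picked for
illustration. Appending ``legacy-cmb=on`` to the same ``-device`` line would
select the v1.3 behaviour instead.

.. code-block:: console

    -drive file=nvm.img,if=none,id=nvm
    -device nvme,serial=deadbeef,drive=nvm,cmb_size_mb=64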

Simple Copy
-----------

The device includes support for TP 4065 ("Simple Copy Command"). A number of
additional ``nvme-ns`` device parameters may be used to control the Copy
command limits:

``mssrl=UINT16`` (default: ``128``)
  Set the Maximum Single Source Range Length (``MSSRL``). This is the maximum
  number of logical blocks that may be specified in each source range.

``mcl=UINT32`` (default: ``128``)
  Set the Maximum Copy Length (``MCL``). This is the maximum number of logical
  blocks that may be specified in a Copy command (the total for all source
  ranges).

``msrc=UINT8`` (default: ``127``)
  Set the Maximum Source Range Count (``MSRC``). This is the maximum number of
  source ranges that may be used in a Copy command. This is a 0's based value,
  so the default of ``127`` allows 128 source ranges.
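
As a sketch, the Copy limits can be set on a namespace as shown below. The
numbers are arbitrary illustrative values. Because ``msrc`` is a 0's based
value, ``msrc=15`` permits up to 16 source ranges per Copy command, each at
most 64 logical blocks long (``mssrl=64``), with at most 256 logical blocks
in total (``mcl=256``).

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1,mssrl=64,mcl=256,msrc=15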

Zoned Namespaces
----------------

A namespace may be "Zoned" as defined by TP 4053 ("Zoned Namespaces"). Set
``zoned=on`` on an ``nvme-ns`` device to configure it as a zoned namespace.

The namespace may be configured with additional parameters:

``zoned.zone_size=SIZE`` (default: ``128MiB``)
  Define the zone size (``ZSZE``).

``zoned.zone_capacity=SIZE`` (default: ``0``)
  Define the zone capacity (``ZCAP``). If left at the default (``0``), the zone
  capacity will equal the zone size.

``zoned.descr_ext_size=UINT32`` (default: ``0``)
  Set the Zone Descriptor Extension Size (``ZDES``). Must be a multiple of 64
  bytes.

``zoned.cross_read=BOOL`` (default: ``off``)
  Set to ``on`` to allow reads to cross zone boundaries.

``zoned.max_active=UINT32`` (default: ``0``)
  Set the maximum number of active resources (``MAR``). The default (``0``)
  allows all zones to be active.

``zoned.max_open=UINT32`` (default: ``0``)
  Set the maximum number of open resources (``MOR``). The default (``0``)
  allows all zones to be open. If ``zoned.max_active`` is specified, this value
  must be less than or equal to that.

``zoned.zasl=UINT8`` (default: ``0``)
  Set the maximum data transfer size for the Zone Append command. Like
  ``mdts``, the value is specified as a power of two (2^n) and is in units of
  the minimum memory page size (CAP.MPSMIN). With the default value (``0``),
  this property inherits the ``mdts`` value.
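
A minimal sketch of a zoned namespace configuration follows. The image name
``zns.img`` and all sizes and limits are hypothetical values chosen for this
example; the backing image is assumed to be large enough to hold several
64 MiB zones, and the ``SIZE`` values are assumed to accept the usual QEMU
size suffixes such as ``M``. The zone capacity is kept below the zone size,
and ``zoned.max_open`` does not exceed ``zoned.max_active``, as required
above.

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -drive file=zns.img,if=none,id=zns
   -device nvme-ns,drive=zns,zoned=on,zoned.zone_size=64M,zoned.zone_capacity=62M,zoned.max_active=32,zoned.max_open=16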

Metadata
--------

The virtual namespace device supports LBA metadata in the form of separate
metadata (``MPTR``-based) and extended LBAs.

``ms=UINT16`` (default: ``0``)
  Defines the number of metadata bytes per LBA.

``mset=UINT8`` (default: ``0``)
  Set to ``1`` to enable extended LBAs.
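
For instance, a namespace with eight bytes of metadata per LBA, transferred
as part of an extended LBA, might be configured as in this sketch. The
metadata size is an illustrative choice. Leaving out ``mset=1`` would keep
the metadata in a separate (``MPTR``-based) buffer.

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1,ms=8,mset=1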

End-to-End Data Protection
--------------------------

The virtual namespace device supports DIF- and DIX-based protection information
(depending on ``mset``).

``pi=UINT8`` (default: ``0``)
  Enable protection information of the specified type (type ``1``, ``2`` or
  ``3``).

``pil=UINT8`` (default: ``0``)
  Controls the location of the protection information within the metadata. Set
  to ``1`` to transfer protection information as the first eight bytes of
  metadata. Otherwise, the protection information is transferred as the last
  eight bytes.
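
As a sketch, Type 1 protection information could be enabled as shown below.
The example assumes that ``ms`` must be at least ``8`` so that the metadata
can hold the eight bytes of protection information. With ``pil=1`` those
bytes are placed at the start of the metadata, and ``mset=1`` makes the
metadata travel as part of an extended LBA.

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1,ms=8,mset=1,pi=1,pil=1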
238