.. SPDX-License-Identifier: GPL-2.0

.. _stateless_decoder:

**************************************************
Memory-to-memory Stateless Video Decoder Interface
**************************************************

A stateless decoder is a decoder that works without retaining any kind of state
between processed frames. This means that each frame is decoded independently
of any previous and future frames, and that the client is responsible for
maintaining the decoding state and providing it to the decoder with each
decoding request. This is in contrast to the stateful video decoder interface,
where the hardware and driver maintain the decoding state and all the client
has to do is to provide the raw encoded stream and dequeue decoded frames in
display order.

This section describes how user-space ("the client") is expected to communicate
with stateless decoders in order to successfully decode an encoded stream.
Compared to stateful codecs, the decoder/client sequence is simpler, but the
cost of this simplicity is extra complexity in the client, which is responsible
for maintaining a consistent decoding state.

Stateless decoders make use of the :ref:`media-request-api`. A stateless
decoder must expose the ``V4L2_BUF_CAP_SUPPORTS_REQUESTS`` capability on its
``OUTPUT`` queue when :c:func:`VIDIOC_REQBUFS` or :c:func:`VIDIOC_CREATE_BUFS`
are invoked.

Depending on the encoded formats supported by the decoder, a single decoded
frame may be the result of several decode requests (for instance, H.264 streams
with multiple slices per frame). Decoders that support such formats must also
expose the ``V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF`` capability on their
``OUTPUT`` queue.
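
The two capability checks above could be sketched as follows. This is only an
illustration: the helper names ``stateless_caps_level()`` and
``probe_output_caps()`` are hypothetical, and the probe assumes
``V4L2_MEMORY_MMAP`` and kernel headers recent enough to define both
capability flags. Calling :c:func:`VIDIOC_REQBUFS` with a ``count`` of zero is
used here to read the capabilities with minimal side effects.

.. code-block:: c

    #include <string.h>
    #include <linux/videodev2.h>
    #include <sys/ioctl.h>

    /*
     * Hypothetical helper: classify the capabilities reported for the
     * OUTPUT queue. Returns 0 if requests are not supported, 1 if requests
     * are supported, and 2 if CAPTURE buffers can be held as well.
     */
    static int stateless_caps_level(__u32 capabilities)
    {
        if (!(capabilities & V4L2_BUF_CAP_SUPPORTS_REQUESTS))
            return 0;
        if (capabilities & V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF)
            return 2;
        return 1;
    }

    /*
     * Hypothetical helper: query the capabilities of the queue of the given
     * buffer type. Returns -1 if the ioctl fails.
     */
    static int probe_output_caps(int video_fd, __u32 type)
    {
        struct v4l2_requestbuffers rb;

        memset(&rb, 0, sizeof(rb));
        rb.count = 0;                   /* query only, allocate nothing */
        rb.type = type;
        rb.memory = V4L2_MEMORY_MMAP;   /* assumption: MMAP buffers */
        if (ioctl(video_fd, VIDIOC_REQBUFS, &rb) < 0)
            return -1;
        return stateless_caps_level(rb.capabilities);
    }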

Querying capabilities
=====================

1. To enumerate the set of coded formats supported by the decoder, the client
   calls :c:func:`VIDIOC_ENUM_FMT` on the ``OUTPUT`` queue.

   * The driver must always return the full set of supported ``OUTPUT`` formats,
     irrespective of the format currently set on the ``CAPTURE`` queue.

   * Simultaneously, the driver must restrict the set of values returned by
     codec-specific capability controls (such as H.264 profiles) to the set
     actually supported by the hardware.

2. To enumerate the set of supported raw formats, the client calls
   :c:func:`VIDIOC_ENUM_FMT` on the ``CAPTURE`` queue.

   * The driver must return only the formats supported for the format currently
     active on the ``OUTPUT`` queue.

   * Depending on the currently set ``OUTPUT`` format, the set of supported raw
     formats may depend on the value of some codec-dependent controls.
     The client is responsible for making sure that these controls are set
     before querying the ``CAPTURE`` queue. Failure to do so will result in the
     default values for these controls being used, and a returned set of formats
     that may not be usable for the media the client is trying to decode.

3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
   resolutions for a given format, passing the desired pixel format in
   :c:type:`v4l2_frmsizeenum`'s ``pixel_format``.

4. Supported profiles and levels for the current ``OUTPUT`` format, if
   applicable, may be queried using their respective controls via
   :c:func:`VIDIOC_QUERYCTRL`.
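
The format enumeration in steps 1 and 2 could be written roughly as below.
The helper name ``enum_pixel_formats()`` is hypothetical; the sketch relies on
:c:func:`VIDIOC_ENUM_FMT` failing with ``EINVAL`` once ``index`` runs past the
last supported format.

.. code-block:: c

    #include <errno.h>
    #include <stddef.h>
    #include <string.h>
    #include <linux/videodev2.h>
    #include <sys/ioctl.h>

    /*
     * Hypothetical helper: store up to max pixel formats enumerated on the
     * queue of the given buffer type. Returns the number of formats stored,
     * or -1 if the ioctl fails for a reason other than end of enumeration.
     */
    static int enum_pixel_formats(int video_fd, __u32 type,
                                  __u32 *formats, size_t max)
    {
        size_t n = 0;

        while (n < max) {
            struct v4l2_fmtdesc desc;

            memset(&desc, 0, sizeof(desc));
            desc.index = (__u32)n;
            desc.type = type;
            if (ioctl(video_fd, VIDIOC_ENUM_FMT, &desc) < 0) {
                if (errno == EINVAL)
                    break;          /* index is past the last format */
                return -1;
            }
            formats[n++] = desc.pixelformat;
        }
        return (int)n;
    }

A client would typically call this once with the ``OUTPUT`` buffer type, and
again with the ``CAPTURE`` buffer type only after the ``OUTPUT`` format and the
relevant codec controls have been set.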

Initialization
==============

1. Set the coded format on the ``OUTPUT`` queue via :c:func:`VIDIOC_S_FMT`.

   * **Required fields:**

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``.

     ``pixelformat``
         a coded pixel format.

     ``width``, ``height``
         coded width and height parsed from the stream.

     other fields
         follow standard semantics.

   .. note::

      Changing the ``OUTPUT`` format may change the currently set ``CAPTURE``
      format. The driver will derive a new ``CAPTURE`` format from the
      ``OUTPUT`` format being set, including resolution, colorimetry
      parameters, etc. If the client needs a specific ``CAPTURE`` format,
      it must adjust it afterwards.

2. Call :c:func:`VIDIOC_S_EXT_CTRLS` to set all the controls (parsed headers,
   etc.) required by the ``OUTPUT`` format to enumerate the ``CAPTURE`` formats.

3. Call :c:func:`VIDIOC_G_FMT` for the ``CAPTURE`` queue to get the format for
   the destination buffers parsed/decoded from the bytestream.

   * **Required fields:**

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

   * **Returned fields:**

     ``width``, ``height``
         frame buffer resolution for the decoded frames.

     ``pixelformat``
         pixel format for decoded frames.

     ``num_planes`` (for _MPLANE ``type`` only)
         number of planes for pixelformat.

     ``sizeimage``, ``bytesperline``
         as per standard semantics; matching frame buffer format.

   .. note::

      The value of ``pixelformat`` may be any pixel format supported for the
      ``OUTPUT`` format, based on the hardware capabilities. It is suggested
      that the driver chooses the preferred/optimal format for the current
      configuration. For example, a YUV format may be preferred over an RGB
      format, if an additional conversion step would be required for RGB.

4. *[optional]* Enumerate ``CAPTURE`` formats via :c:func:`VIDIOC_ENUM_FMT` on
   the ``CAPTURE`` queue. The client may use this ioctl to discover which
   alternative raw formats are supported for the current ``OUTPUT`` format and
   select one of them via :c:func:`VIDIOC_S_FMT`.

   .. note::

      The driver will return only formats supported for the currently selected
      ``OUTPUT`` format and currently set controls, even if more formats may be
      supported by the decoder in general.

      For example, a decoder may support YUV and RGB formats for
      resolutions 1920x1088 and lower, but only YUV for higher resolutions (due
      to hardware limitations). After setting a resolution of 1920x1088 or lower
      as the ``OUTPUT`` format, :c:func:`VIDIOC_ENUM_FMT` may return a set of
      YUV and RGB pixel formats, but after setting a resolution higher than
      1920x1088, the driver will not return RGB pixel formats, since they are
      unsupported for this resolution.

5. *[optional]* Choose a ``CAPTURE`` format different from the suggested one
   via :c:func:`VIDIOC_S_FMT` on the ``CAPTURE`` queue. It is possible for the
   client to choose a different format than selected/suggested by the driver in
   :c:func:`VIDIOC_G_FMT`.

   * **Required fields:**

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

     ``pixelformat``
         a raw pixel format.

     ``width``, ``height``
         frame buffer resolution of the decoded stream; typically unchanged from
         what was returned with :c:func:`VIDIOC_G_FMT`, but it may be different
         if the hardware supports composition and/or scaling.

   After performing this step, the client must perform step 3 again in order
   to obtain up-to-date information about the buffers size and layout.

6. Allocate source (bytestream) buffers via :c:func:`VIDIOC_REQBUFS` on the
   ``OUTPUT`` queue.

   * **Required fields:**

     ``count``
         requested number of buffers to allocate; greater than zero.

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``.

     ``memory``
         follows standard semantics.

   * **Returned fields:**

     ``count``
         actual number of buffers allocated.

   * If required, the driver will adjust ``count`` to be equal to or greater
     than the minimum of the required number of ``OUTPUT`` buffers for the
     given format and the requested count. The client must check this value
     after the ioctl returns to get the actual number of buffers allocated.

7. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` on the
   ``CAPTURE`` queue.

   * **Required fields:**

     ``count``
         requested number of buffers to allocate; greater than zero. The client
         is responsible for deducing the minimum number of buffers required
         for the stream to be properly decoded (taking e.g. reference frames
         into account) and passing an equal or greater number.

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

     ``memory``
         follows standard semantics. ``V4L2_MEMORY_USERPTR`` is not supported
         for ``CAPTURE`` buffers.

   * **Returned fields:**

     ``count``
         adjusted to allocated number of buffers, in case the codec requires
         more buffers than requested.

   * The driver must adjust ``count`` to the minimum of the required number of
     ``CAPTURE`` buffers for the current format, stream configuration and
     requested count. The client must check this value after the ioctl
     returns to get the number of buffers allocated.

8. Allocate requests (likely one per ``OUTPUT`` buffer) via
   :c:func:`MEDIA_IOC_REQUEST_ALLOC` on the media device.

9. Start streaming on both ``OUTPUT`` and ``CAPTURE`` queues via
   :c:func:`VIDIOC_STREAMON`.
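
The initialization sequence above could be broken into small helpers such as
the ones below. This is a sketch under several assumptions: all helper names
are hypothetical, the single-planar buffer types and ``V4L2_MEMORY_MMAP`` are
used for brevity, and step 2 is omitted because the controls to set are
codec-specific.

.. code-block:: c

    #include <string.h>
    #include <linux/media.h>
    #include <linux/videodev2.h>
    #include <sys/ioctl.h>

    /* Step 1: set the coded format and coded size on the OUTPUT queue. */
    static int set_coded_format(int video_fd, __u32 pixelformat,
                                __u32 width, __u32 height)
    {
        struct v4l2_format fmt;

        memset(&fmt, 0, sizeof(fmt));
        fmt.type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
        fmt.fmt.pix.pixelformat = pixelformat;
        fmt.fmt.pix.width = width;
        fmt.fmt.pix.height = height;
        return ioctl(video_fd, VIDIOC_S_FMT, &fmt);
    }

    /* Step 3: read back the CAPTURE format derived by the driver. */
    static int get_raw_format(int video_fd, struct v4l2_format *fmt)
    {
        memset(fmt, 0, sizeof(*fmt));
        fmt->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        return ioctl(video_fd, VIDIOC_G_FMT, fmt);
    }

    /* Steps 6 and 7: allocate buffers; returns the granted count or -1. */
    static int alloc_buffers(int video_fd, __u32 type, __u32 count)
    {
        struct v4l2_requestbuffers rb;

        memset(&rb, 0, sizeof(rb));
        rb.count = count;
        rb.type = type;
        rb.memory = V4L2_MEMORY_MMAP;   /* assumption: MMAP buffers */
        if (ioctl(video_fd, VIDIOC_REQBUFS, &rb) < 0)
            return -1;
        return (int)rb.count;           /* may differ from the request */
    }

    /* Step 8: allocate one request; returns its file descriptor or -1. */
    static int alloc_request(int media_fd)
    {
        int request_fd = -1;

        if (ioctl(media_fd, MEDIA_IOC_REQUEST_ALLOC, &request_fd) < 0)
            return -1;
        return request_fd;
    }

    /* Step 9: start streaming on one queue. */
    static int stream_on(int video_fd, __u32 type)
    {
        int t = (int)type;

        return ioctl(video_fd, VIDIOC_STREAMON, &t);
    }

Note that requests are allocated on the media device node, while all the other
calls operate on the video device node.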

Decoding
========

For each frame, the client is responsible for submitting at least one request to
which the following is attached:

* The amount of encoded data expected by the codec for its current
  configuration, as a buffer submitted to the ``OUTPUT`` queue. Typically, this
  corresponds to one frame worth of encoded data, but some formats may allow (or
  require) different amounts per unit.
* All the metadata needed to decode the submitted encoded data, in the form of
  controls relevant to the format being decoded.

The amount of data and contents of the source ``OUTPUT`` buffer, as well as the
controls that must be set on the request, depend on the active coded pixel
format and might be affected by codec-specific extended controls, as stated in
documentation of each format.

If there is a possibility that the decoded frame will require one or more
decode requests after the current one in order to be produced, then the client
must set the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag on the ``OUTPUT``
buffer. This will result in the (potentially partially) decoded ``CAPTURE``
buffer not being made available for dequeueing, and reused for the next decode
request if the timestamp of the next ``OUTPUT`` buffer has not changed.

A typical frame would thus be decoded using the following sequence:

1. Queue an ``OUTPUT`` buffer containing one unit of encoded bytestream data for
   the decoding request, using :c:func:`VIDIOC_QBUF`.

   * **Required fields:**

     ``index``
         index of the buffer being queued.

     ``type``
         type of the buffer.

     ``bytesused``
         number of bytes taken by the encoded data frame in the buffer.

     ``flags``
         the ``V4L2_BUF_FLAG_REQUEST_FD`` flag must be set. Additionally, if
         the client is not sure that the current decode request is the last
         one needed to produce a fully decoded frame, then
         ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` must also be set.

     ``request_fd``
         must be set to the file descriptor of the decoding request.

     ``timestamp``
         must be set to a unique value per frame. This value will be propagated
         into the decoded frame's buffer and can also be used to reference this
         frame from another one. If using multiple decode requests per
         frame, then the timestamps of all the ``OUTPUT`` buffers for a given
         frame must be identical. If the timestamp changes, then the currently
         held ``CAPTURE`` buffer will be made available for dequeuing and the
         current request will work on a new ``CAPTURE`` buffer.

2. Set the codec-specific controls for the decoding request, using
   :c:func:`VIDIOC_S_EXT_CTRLS`.

   * **Required fields:**

     ``which``
         must be ``V4L2_CTRL_WHICH_REQUEST_VAL``.

     ``request_fd``
         must be set to the file descriptor of the decoding request.

     other fields
         other fields are set as usual when setting controls. The ``controls``
         array must contain all the codec-specific controls required to decode
         a frame.

   .. note::

      It is possible to specify the controls in different invocations of
      :c:func:`VIDIOC_S_EXT_CTRLS`, or to overwrite a previously set control, as
      long as ``request_fd`` and ``which`` are properly set. The controls state
      at the moment of request submission is the one that will be considered.

   .. note::

      The order in which steps 1 and 2 take place is interchangeable.

3. Submit the request by invoking :c:func:`MEDIA_REQUEST_IOC_QUEUE` on the
   request FD.

   If the request is submitted without an ``OUTPUT`` buffer, or if some of the
   required controls are missing from the request, then
   :c:func:`MEDIA_REQUEST_IOC_QUEUE` will return ``-ENOENT``. If more than one
   ``OUTPUT`` buffer is queued, then it will return ``-EINVAL``.
   :c:func:`MEDIA_REQUEST_IOC_QUEUE` returning non-zero means that no
   ``CAPTURE`` buffer will be produced for this request.
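
The three steps above could be sketched as follows. All helper names are
hypothetical, and the sketch assumes a single-planar ``OUTPUT`` queue using
``V4L2_MEMORY_MMAP`` buffers. The contents of the ``controls`` array depend
on the codec and are left to the caller.

.. code-block:: c

    #include <stdbool.h>
    #include <string.h>
    #include <sys/time.h>
    #include <linux/media.h>
    #include <linux/videodev2.h>
    #include <sys/ioctl.h>

    /* Step 1 (preparation): describe an OUTPUT buffer tied to a request. */
    static void fill_output_buffer(struct v4l2_buffer *buf, __u32 index,
                                   __u32 bytesused, int request_fd,
                                   struct timeval frame_ts, bool more_units)
    {
        memset(buf, 0, sizeof(*buf));
        buf->index = index;
        buf->type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
        buf->memory = V4L2_MEMORY_MMAP;
        buf->bytesused = bytesused;
        buf->flags = V4L2_BUF_FLAG_REQUEST_FD;
        if (more_units)     /* frame may need further decode requests */
            buf->flags |= V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
        buf->request_fd = request_fd;
        buf->timestamp = frame_ts;  /* identical for all units of a frame */
    }

    /* Step 2: attach the codec-specific controls to the request. */
    static int set_request_controls(int video_fd, int request_fd,
                                    struct v4l2_ext_control *ctrls,
                                    __u32 count)
    {
        struct v4l2_ext_controls ext;

        memset(&ext, 0, sizeof(ext));
        ext.which = V4L2_CTRL_WHICH_REQUEST_VAL;
        ext.request_fd = request_fd;
        ext.count = count;
        ext.controls = ctrls;
        return ioctl(video_fd, VIDIOC_S_EXT_CTRLS, &ext);
    }

    /* Steps 1 to 3 for one unit of encoded data; returns 0 or -1. */
    static int submit_decode_request(int video_fd, int request_fd,
                                     struct v4l2_buffer *buf,
                                     struct v4l2_ext_control *ctrls,
                                     __u32 count)
    {
        if (ioctl(video_fd, VIDIOC_QBUF, buf) < 0)
            return -1;
        if (set_request_controls(video_fd, request_fd, ctrls, count) < 0)
            return -1;
        return ioctl(request_fd, MEDIA_REQUEST_IOC_QUEUE);
    }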

``CAPTURE`` buffers must not be part of the request, and are queued
independently. They are returned in decode order (i.e. the same order as coded
frames were submitted to the ``OUTPUT`` queue).

Runtime decoding errors are signaled by the dequeued ``CAPTURE`` buffers
carrying the ``V4L2_BUF_FLAG_ERROR`` flag. If a decoded reference frame has an
error, then all following decoded frames that refer to it also have the
``V4L2_BUF_FLAG_ERROR`` flag set, although the decoder will still try to
produce (likely corrupted) frames.
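
Checking for a runtime decoding error when dequeuing a ``CAPTURE`` buffer
might look like the sketch below. The helper names are hypothetical, and a
single-planar ``CAPTURE`` queue with ``V4L2_MEMORY_MMAP`` buffers is assumed.

.. code-block:: c

    #include <string.h>
    #include <linux/videodev2.h>
    #include <sys/ioctl.h>

    /* Returns 1 if the dequeued buffer carries the error flag, else 0. */
    static int capture_has_error(const struct v4l2_buffer *buf)
    {
        return (buf->flags & V4L2_BUF_FLAG_ERROR) ? 1 : 0;
    }

    /*
     * Dequeue the next decoded frame. Returns 1 if it decoded cleanly,
     * 0 if it is flagged as erroneous, or -1 if the ioctl itself failed.
     */
    static int dequeue_capture(int video_fd, struct v4l2_buffer *buf)
    {
        memset(buf, 0, sizeof(*buf));
        buf->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        buf->memory = V4L2_MEMORY_MMAP;
        if (ioctl(video_fd, VIDIOC_DQBUF, buf) < 0)
            return -1;
        return capture_has_error(buf) ? 0 : 1;
    }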

Buffer management while decoding
================================

Contrary to stateful decoders, a stateless decoder does not perform any kind of
buffer management: it only guarantees that dequeued ``CAPTURE`` buffers can be
used by the client for as long as they are not queued again. "Used" here
encompasses using the buffer for compositing or display.

A dequeued capture buffer can also be used as the reference frame of another
buffer.

A frame is specified as reference by converting its timestamp into nanoseconds,
and storing it into the relevant member of a codec-dependent control structure.
The :c:func:`v4l2_timeval_to_ns` function must be used to perform that
conversion. The timestamp of a frame can be used to reference it as soon as all
its units of encoded data are successfully submitted to the ``OUTPUT`` queue.
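
The conversion amounts to the arithmetic shown below. The function
``frame_timestamp_ns()`` is a hypothetical stand-in written only to make the
computation explicit; clients should call :c:func:`v4l2_timeval_to_ns` itself.

.. code-block:: c

    #include <stdint.h>
    #include <sys/time.h>

    /* Illustration only: seconds and microseconds folded into nanoseconds. */
    static uint64_t frame_timestamp_ns(const struct timeval *tv)
    {
        return (uint64_t)tv->tv_sec * 1000000000ULL +
               (uint64_t)tv->tv_usec * 1000ULL;
    }

For example, a timestamp of 1 second and 500 microseconds yields a reference
value of 1000500000.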

A decoded buffer containing a reference frame must not be reused as a decoding
target until all the frames referencing it have been decoded. The safest way to
achieve this is to refrain from queueing a reference buffer until all the
decoded frames referencing it have been dequeued. However, if the driver can
guarantee that buffers queued to the ``CAPTURE`` queue are processed in queued
order, then user-space can take advantage of this guarantee and queue a
reference buffer when the following conditions are met:

1. All the requests for frames affected by the reference frame have been
   queued, and

2. A sufficient number of ``CAPTURE`` buffers to cover all the decoded
   referencing frames have been queued.

When queuing a decoding request, the driver will increase the reference count of
all the resources associated with reference frames. This means that the client
can e.g. close the DMABUF file descriptors of reference frame buffers if it
won't need them afterwards.

Seeking
=======

In order to seek, the client just needs to submit requests using input buffers
corresponding to the new stream position. It must, however, be aware that the
resolution may have changed and follow the dynamic resolution change sequence in
that case. Also, depending on the codec used, picture parameters (e.g. SPS/PPS
for H.264) may have changed, and the client is responsible for making sure that
a valid state is sent to the decoder.

The client is then free to ignore any returned ``CAPTURE`` buffer that comes
from the pre-seek position.

Pausing
=======

In order to pause, the client can just cease queuing buffers onto the ``OUTPUT``
queue. Without source bytestream data, there is no data to process and the codec
will remain idle.

Dynamic resolution change
=========================

If the client detects a resolution change in the stream, it will need to perform
the initialization sequence again with the new resolution:

1. If the last submitted request resulted in a ``CAPTURE`` buffer being
   held by the use of the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag, then the
   last frame is not available on the ``CAPTURE`` queue. In this case, a
   ``V4L2_DEC_CMD_FLUSH`` command shall be sent. This will make the driver
   dequeue the held ``CAPTURE`` buffer.

2. Wait until all submitted requests have completed and dequeue the
   corresponding output buffers.

3. Call :c:func:`VIDIOC_STREAMOFF` on both the ``OUTPUT`` and ``CAPTURE``
   queues.

4. Free all ``CAPTURE`` buffers by calling :c:func:`VIDIOC_REQBUFS` on the
   ``CAPTURE`` queue with a buffer count of zero.

5. Perform the initialization sequence again (minus the allocation of
   ``OUTPUT`` buffers), with the new resolution set on the ``OUTPUT`` queue.
   Note that due to resolution constraints, a different format may need to be
   picked on the ``CAPTURE`` queue.
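
Steps 1, 3 and 4 of this sequence could be sketched as below. The helper names
are hypothetical, and single-planar queues with ``V4L2_MEMORY_MMAP`` buffers
are assumed. Waiting for outstanding requests (step 2) and redoing the
initialization (step 5) are left out.

.. code-block:: c

    #include <string.h>
    #include <linux/videodev2.h>
    #include <sys/ioctl.h>

    /* Step 1: ask the driver to release a held CAPTURE buffer. */
    static int flush_held_capture(int video_fd)
    {
        struct v4l2_decoder_cmd cmd;

        memset(&cmd, 0, sizeof(cmd));
        cmd.cmd = V4L2_DEC_CMD_FLUSH;
        return ioctl(video_fd, VIDIOC_DECODER_CMD, &cmd);
    }

    /* Steps 3 and 4: stop both queues, then free the CAPTURE buffers. */
    static int stop_and_free_capture(int video_fd)
    {
        int out = V4L2_BUF_TYPE_VIDEO_OUTPUT;
        int cap = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        struct v4l2_requestbuffers rb;

        if (ioctl(video_fd, VIDIOC_STREAMOFF, &out) < 0)
            return -1;
        if (ioctl(video_fd, VIDIOC_STREAMOFF, &cap) < 0)
            return -1;

        memset(&rb, 0, sizeof(rb));
        rb.count = 0;                   /* zero frees all buffers */
        rb.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        rb.memory = V4L2_MEMORY_MMAP;   /* assumption: MMAP buffers */
        return ioctl(video_fd, VIDIOC_REQBUFS, &rb);
    }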

Drain
=====

If the last submitted request resulted in a ``CAPTURE`` buffer being
held by the use of the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag, then the
last frame is not available on the ``CAPTURE`` queue. In this case, a
``V4L2_DEC_CMD_FLUSH`` command shall be sent. This will make the driver
dequeue the held ``CAPTURE`` buffer.

After that, in order to drain the stream on a stateless decoder, the client
just needs to wait until all the submitted requests are completed.
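
One way to wait for a submitted request is to poll its file descriptor, since
the request API signals completion as ``POLLPRI`` on the request FD. The
helper below is a hypothetical sketch of that wait for a single request; a
client draining a stream would repeat it for every outstanding request.

.. code-block:: c

    #include <poll.h>

    /*
     * Wait for one request to complete. Returns 1 if the request completed,
     * 0 if the timeout expired first, or -1 on error.
     */
    static int wait_request_done(int request_fd, int timeout_ms)
    {
        struct pollfd pfd = { .fd = request_fd, .events = POLLPRI };
        int ret = poll(&pfd, 1, timeout_ms);

        if (ret < 0)
            return -1;
        if (ret == 0)
            return 0;               /* timed out, request still pending */
        return (pfd.revents & POLLPRI) ? 1 : -1;
    }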
425