Message ID: 20180724140621.59624-2-tfiga@chromium.org (mailing list archive)
State: New, archived
Series: Document memory-to-memory video codec interfaces
Hi Tomasz, Many, many thanks for working on this! It's a great document and when done it will be very useful indeed. Review comments follow... On 24/07/18 16:06, Tomasz Figa wrote: > Due to complexity of the video decoding process, the V4L2 drivers of > stateful decoder hardware require specific sequences of V4L2 API calls > to be followed. These include capability enumeration, initialization, > decoding, seek, pause, dynamic resolution change, drain and end of > stream. > > Specifics of the above have been discussed during Media Workshops at > LinuxCon Europe 2012 in Barcelona and then later Embedded Linux > Conference Europe 2014 in Düsseldorf. The de facto Codec API that > originated at those events was later implemented by the drivers we already > have merged in mainline, such as s5p-mfc or coda. > > The only thing missing was the real specification included as a part of > Linux Media documentation. Fix it now and document the decoder part of > the Codec API. > > Signed-off-by: Tomasz Figa <tfiga@chromium.org> > --- > Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++ > Documentation/media/uapi/v4l/devices.rst | 1 + > Documentation/media/uapi/v4l/v4l2.rst | 10 +- > 3 files changed, 882 insertions(+), 1 deletion(-) > create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst > > diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst b/Documentation/media/uapi/v4l/dev-decoder.rst > new file mode 100644 > index 000000000000..f55d34d2f860 > --- /dev/null > +++ b/Documentation/media/uapi/v4l/dev-decoder.rst > @@ -0,0 +1,872 @@ > +.. -*- coding: utf-8; mode: rst -*- > + > +.. _decoder: > + > +**************************************** > +Memory-to-memory Video Decoder Interface > +**************************************** > + > +Input data to a video decoder are buffers containing unprocessed video > +stream (e.g. Annex-B H.264/HEVC stream, raw VP8/9 stream). 
The driver is > +expected not to require any additional information from the client to > +process these buffers. Output data are raw video frames returned in display > +order. > + > +Performing software parsing, processing etc. of the stream in the driver > +in order to support this interface is strongly discouraged. In case such > +operations are needed, use of Stateless Video Decoder Interface (in > +development) is strongly advised. > + > +Conventions and notation used in this document > +============================================== > + > +1. The general V4L2 API rules apply if not specified in this document > + otherwise. > + > +2. The meaning of words “must”, “may”, “should”, etc. is as per RFC > + 2119. > + > +3. All steps not marked “optional” are required. > + > +4. :c:func:`VIDIOC_G_EXT_CTRLS`, :c:func:`VIDIOC_S_EXT_CTRLS` may be used > + interchangeably with :c:func:`VIDIOC_G_CTRL`, :c:func:`VIDIOC_S_CTRL`, > + unless specified otherwise. > + > +5. Single-plane API (see spec) and applicable structures may be used > + interchangeably with Multi-plane API, unless specified otherwise, > + depending on driver capabilities and following the general V4L2 > + guidelines. > + > +6. i = [a..b]: sequence of integers from a to b, inclusive, i.e. i = > + [0..2]: i = 0, 1, 2. > + > +7. For ``OUTPUT`` buffer A, A’ represents a buffer on the ``CAPTURE`` queue > + containing data (decoded frame/stream) that resulted from processing > + buffer A. > + > +Glossary > +======== > + > +CAPTURE > + the destination buffer queue; the queue of buffers containing decoded > + frames; ``V4L2_BUF_TYPE_VIDEO_CAPTURE`` or > + ``V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE``; data are captured from the > + hardware into ``CAPTURE`` buffers > + > +client > + application client communicating with the driver implementing this API > + > +coded format > + encoded/compressed video bitstream format (e.g.
H.264, VP8, etc.); see > + also: raw format > + > +coded height > + height for given coded resolution > + > +coded resolution > + stream resolution in pixels aligned to codec and hardware requirements; > + typically visible resolution rounded up to full macroblocks; > + see also: visible resolution > + > +coded width > + width for given coded resolution > + > +decode order > + the order in which frames are decoded; may differ from display order if > + coded format includes a feature of frame reordering; ``OUTPUT`` buffers > + must be queued by the client in decode order > + > +destination > + data resulting from the decode process; ``CAPTURE`` > + > +display order > + the order in which frames must be displayed; ``CAPTURE`` buffers must be > + returned by the driver in display order > + > +DPB > + Decoded Picture Buffer; a H.264 term for a buffer that stores a picture a H.264 -> an H.264 > + that is encoded or decoded and available for reference in further > + decode/encode steps. > + > +EOS > + end of stream > + > +IDR > + a type of a keyframe in H.264-encoded stream, which clears the list of > + earlier reference frames (DPBs) You do not actually say what IDR stands for. Can you add that? > + > +keyframe > + an encoded frame that does not reference frames decoded earlier, i.e. > + can be decoded fully on its own. > + > +OUTPUT > + the source buffer queue; the queue of buffers containing encoded > + bitstream; ``V4L2_BUF_TYPE_VIDEO_OUTPUT`` or > + ``V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE``; the hardware is fed with data > + from ``OUTPUT`` buffers > + > +PPS > + Picture Parameter Set; a type of metadata entity in H.264 bitstream > + > +raw format > + uncompressed format containing raw pixel data (e.g. 
YUV, RGB formats) > + > +resume point > + a point in the bitstream from which decoding may start/continue, without > + any previous state/data present, e.g.: a keyframe (VP8/VP9) or > + SPS/PPS/IDR sequence (H.264); a resume point is required to start decode > + of a new stream, or to resume decoding after a seek > + > +source > + data fed to the decoder; ``OUTPUT`` > + > +SPS > + Sequence Parameter Set; a type of metadata entity in H.264 bitstream > + > +visible height > + height for given visible resolution; display height > + > +visible resolution > + stream resolution of the visible picture, in pixels, to be used for > + display purposes; must be smaller or equal to coded resolution; > + display resolution > + > +visible width > + width for given visible resolution; display width > + > +Querying capabilities > +===================== > + > +1. To enumerate the set of coded formats supported by the driver, the > + client may call :c:func:`VIDIOC_ENUM_FMT` on ``OUTPUT``. > + > + * The driver must always return the full set of supported formats, > + irrespective of the format set on the ``CAPTURE``. > + > +2. To enumerate the set of supported raw formats, the client may call > + :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE``. > + > + * The driver must return only the formats supported for the format > + currently active on ``OUTPUT``. > + > + * In order to enumerate raw formats supported by a given coded format, > + the client must first set that coded format on ``OUTPUT`` and then > + enumerate the ``CAPTURE`` queue. > + > +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported > + resolutions for a given format, passing desired pixel format in > + :c:type:`v4l2_frmsizeenum` ``pixel_format``. > + > + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``OUTPUT`` > + must include all possible coded resolutions supported by the decoder > + for given coded pixel format. This is confusing. 
Since VIDIOC_ENUM_FRAMESIZES does not have a buffer type argument you cannot say 'on OUTPUT'. I would remove 'on OUTPUT' entirely. > + > + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``CAPTURE`` Ditto for 'on CAPTURE' > + must include all possible frame buffer resolutions supported by the > + decoder for given raw pixel format and coded format currently set on > + ``OUTPUT``. > + > + .. note:: > + > + The client may derive the supported resolution range for a > + combination of coded and raw format by setting width and height of > + ``OUTPUT`` format to 0 and calculating the intersection of > + resolutions returned from calls to :c:func:`VIDIOC_ENUM_FRAMESIZES` > + for the given coded and raw formats. So if the output format is set to 1280x720, then ENUM_FRAMESIZES would just return 1280x720 as the resolution. If the output format is set to 0x0, then it returns the full range it is capable of. Correct? If so, then I think this needs to be a bit more explicit. I had to think about it a bit. Note that the v4l2_pix_format/S_FMT documentation needs to be updated as well since we never allowed 0x0 before. What if you set the format to 0x0 but the stream does not have metadata with the resolution? How does userspace know if 0x0 is allowed or not? If this is specific to the chosen coded pixel format, should we add a new flag for those formats indicating that the coded data contains resolution information? That way userspace knows if 0x0 can be used, and the driver can reject 0x0 for formats that do not support it. > + > +4. Supported profiles and levels for given format, if applicable, may be > + queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`. > + > +Initialization > +============== > + > +1. *[optional]* Enumerate supported ``OUTPUT`` formats and resolutions. See > + capability enumeration. capability enumeration. -> 'Querying capabilities' above. > + > +2.
Set the coded format on ``OUTPUT`` via :c:func:`VIDIOC_S_FMT` > + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > + > + ``pixelformat`` > + a coded pixel format > + > + ``width``, ``height`` > + required only if cannot be parsed from the stream for the given > + coded format; optional otherwise - set to zero to ignore > + > + other fields > + follow standard semantics > + > + * For coded formats including stream resolution information, if width > + and height are set to non-zero values, the driver will propagate the > + resolution to ``CAPTURE`` and signal a source change event > + instantly. However, after the decoder is done parsing the > + information embedded in the stream, it will update ``CAPTURE`` > + format with new values and signal a source change event again, if > + the values do not match. > + > + .. note:: > + > + Changing ``OUTPUT`` format may change currently set ``CAPTURE`` change -> change the > + format. The driver will derive a new ``CAPTURE`` format from from -> from the > + ``OUTPUT`` format being set, including resolution, colorimetry > + parameters, etc. If the client needs a specific ``CAPTURE`` format, > + it must adjust it afterwards. > + > +3. *[optional]* Get minimum number of buffers required for ``OUTPUT`` > + queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to client -> the client > + use more buffers than minimum required by hardware/format. than -> than the > + > + * **Required fields:** > + > + ``id`` > + set to ``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT`` > + > + * **Return fields:** > + > + ``value`` > + required number of ``OUTPUT`` buffers for the currently set > + format > + > +4. Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on > + ``OUTPUT``. 
> + > + * **Required fields:** > + > + ``count`` > + requested number of buffers to allocate; greater than zero > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > + > + ``memory`` > + follows standard semantics > + > + ``sizeimage`` > + follows standard semantics; the client is free to choose any > + suitable size, however, it may be subject to change by the > + driver > + > + * **Return fields:** > + > + ``count`` > + actual number of buffers allocated > + > + * The driver must adjust count to minimum of required number of > + ``OUTPUT`` buffers for given format and count passed. The client must > + check this value after the ioctl returns to get the number of > + buffers allocated. > + > + .. note:: > + > + To allocate more than minimum number of buffers (for pipeline > + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to > + get minimum number of buffers required by the driver/format, > + and pass the obtained value plus the number of additional > + buffers needed in count to :c:func:`VIDIOC_REQBUFS`. > + > +5. Start streaming on ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`. > + > +6. This step only applies to coded formats that contain resolution > + information in the stream. Continue queuing/dequeuing bitstream > + buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and > + :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning > + each buffer to the client until required metadata to configure the > + ``CAPTURE`` queue are found. This is indicated by the driver sending > + a ``V4L2_EVENT_SOURCE_CHANGE`` event with > + ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no > + requirement to pass enough data for this to occur in the first buffer > + and the driver must be able to process any number. > + > + * If data in a buffer that triggers the event is required to decode > + the first frame, the driver must not return it to the client, > + but must retain it for further decoding. 
> + > + * If the client set width and height of ``OUTPUT`` format to 0, calling > + :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM, > + until the driver configures ``CAPTURE`` format according to stream > + metadata. > + > + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and > + the event is signaled, the decoding process will not continue until > + it is acknowledged by either (re-)starting streaming on ``CAPTURE``, > + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` > + command. > + > + .. note:: > + > + No decoded frames are produced during this phase. > + > +7. This step only applies to coded formats that contain resolution > + information in the stream. > + Receive and handle ``V4L2_EVENT_SOURCE_CHANGE`` from the driver > + via :c:func:`VIDIOC_DQEVENT`. The driver must send this event once > + enough data is obtained from the stream to allocate ``CAPTURE`` > + buffers and to begin producing decoded frames. > + > + * **Required fields:** > + > + ``type`` > + set to ``V4L2_EVENT_SOURCE_CHANGE`` > + > + * **Return fields:** > + > + ``u.src_change.changes`` > + set to ``V4L2_EVENT_SRC_CH_RESOLUTION`` > + > + * Any client query issued after the driver queues the event must return > + values applying to the just parsed stream, including queue formats, > + selection rectangles and controls. > + > +8. Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for the > + destination buffers parsed/decoded from the bitstream. > + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > + > + * **Return fields:** > + > + ``width``, ``height`` > + frame buffer resolution for the decoded frames > + > + ``pixelformat`` > + pixel format for decoded frames > + > + ``num_planes`` (for _MPLANE ``type`` only) > + number of planes for pixelformat > + > + ``sizeimage``, ``bytesperline`` > + as per standard semantics; matching frame buffer format > + > + .. 
note:: > + > + The value of ``pixelformat`` may be any pixel format supported and > + must be supported for current stream, based on the information > + parsed from the stream and hardware capabilities. It is suggested > + that driver chooses the preferred/optimal format for given > + configuration. For example, a YUV format may be preferred over an > + RGB format, if additional conversion step would be required. > + > +9. *[optional]* Enumerate ``CAPTURE`` formats via > + :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE`` queue. Once the stream > + information is parsed and known, the client may use this ioctl to > + discover which raw formats are supported for given stream and select one > + of them via :c:func:`VIDIOC_S_FMT`. > + > + .. note:: > + > + The driver will return only formats supported for the current stream > + parsed in this initialization sequence, even if more formats may be > + supported by the driver in general. > + > + For example, a driver/hardware may support YUV and RGB formats for > + resolutions 1920x1088 and lower, but only YUV for higher > + resolutions (due to hardware limitations). After parsing > + a resolution of 1920x1088 or lower, :c:func:`VIDIOC_ENUM_FMT` may > + return a set of YUV and RGB pixel formats, but after parsing > + resolution higher than 1920x1088, the driver will not return RGB, > + unsupported for this resolution. > + > + However, subsequent resolution change event triggered after > + discovering a resolution change within the same stream may switch > + the stream into a lower resolution and :c:func:`VIDIOC_ENUM_FMT` > + would return RGB formats again in that case. > + > +10. *[optional]* Choose a different ``CAPTURE`` format than suggested via > + :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for the > + client to choose a different format than selected/suggested by the > + driver in :c:func:`VIDIOC_G_FMT`.
> + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > + > + ``pixelformat`` > + a raw pixel format > + > + .. note:: > + > + Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently available > + formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful to > + find out a set of allowed formats for given configuration, but not > + required, if the client can accept the defaults. > + > +11. *[optional]* Acquire visible resolution via > + :c:func:`VIDIOC_G_SELECTION`. > + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > + > + ``target`` > + set to ``V4L2_SEL_TGT_COMPOSE`` > + > + * **Return fields:** > + > + ``r.left``, ``r.top``, ``r.width``, ``r.height`` > + visible rectangle; this must fit within frame buffer resolution > + returned by :c:func:`VIDIOC_G_FMT`. > + > + * The driver must expose following selection targets on ``CAPTURE``: > + > + ``V4L2_SEL_TGT_CROP_BOUNDS`` > + corresponds to coded resolution of the stream > + > + ``V4L2_SEL_TGT_CROP_DEFAULT`` > + a rectangle covering the part of the frame buffer that contains > + meaningful picture data (visible area); width and height will be > + equal to visible resolution of the stream > + > + ``V4L2_SEL_TGT_CROP`` > + rectangle within coded resolution to be output to ``CAPTURE``; > + defaults to ``V4L2_SEL_TGT_CROP_DEFAULT``; read-only on hardware > + without additional compose/scaling capabilities > + > + ``V4L2_SEL_TGT_COMPOSE_BOUNDS`` > + maximum rectangle within ``CAPTURE`` buffer, which the cropped > + frame can be output into; equal to ``V4L2_SEL_TGT_CROP``, if the > + hardware does not support compose/scaling > + > + ``V4L2_SEL_TGT_COMPOSE_DEFAULT`` > + equal to ``V4L2_SEL_TGT_CROP`` > + > + ``V4L2_SEL_TGT_COMPOSE`` > + rectangle inside ``OUTPUT`` buffer into which the cropped frame > + is output; defaults to ``V4L2_SEL_TGT_COMPOSE_DEFAULT``; > + read-only on hardware without additional 
compose/scaling > + capabilities > + > + ``V4L2_SEL_TGT_COMPOSE_PADDED`` > + rectangle inside ``OUTPUT`` buffer which is overwritten by the > + hardware; equal to ``V4L2_SEL_TGT_COMPOSE``, if the hardware > + does not write padding pixels > + > +12. *[optional]* Get minimum number of buffers required for ``CAPTURE`` > + queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to > + use more buffers than minimum required by hardware/format. > + > + * **Required fields:** > + > + ``id`` > + set to ``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE`` > + > + * **Return fields:** > + > + ``value`` > + minimum number of buffers required to decode the stream parsed in > + this initialization sequence. > + > + .. note:: > + > + Note that the minimum number of buffers must be at least the number > + required to successfully decode the current stream. This may for > + example be the required DPB size for an H.264 stream given the > + parsed stream configuration (resolution, level). > + > +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` > + on the ``CAPTURE`` queue. > + > + * **Required fields:** > + > + ``count`` > + requested number of buffers to allocate; greater than zero > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > + > + ``memory`` > + follows standard semantics > + > + * **Return fields:** > + > + ``count`` > + adjusted to allocated number of buffers > + > + * The driver must adjust count to minimum of required number of > + destination buffers for given format and stream configuration and the > + count passed. The client must check this value after the ioctl > + returns to get the number of buffers allocated. > + > + .. 
note:: > + > + To allocate more than minimum number of buffers (for pipeline > + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to > + get minimum number of buffers required, and pass the obtained value > + plus the number of additional buffers needed in count to > + :c:func:`VIDIOC_REQBUFS`. I think we should mention here the option of using VIDIOC_CREATE_BUFS in order to allocate buffers larger than the current CAPTURE format in order to accommodate future resolution changes. > + > +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. > + > +Decoding > +======== > + > +This state is reached after a successful initialization sequence. In this > +state, client queues and dequeues buffers to both queues via > +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard > +semantics. > + > +Both queues operate independently, following standard behavior of V4L2 > +buffer queues and memory-to-memory devices. In addition, the order of > +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of > +queuing coded frames to ``OUTPUT`` queue, due to properties of selected > +coded format, e.g. frame reordering. The client must not assume any direct > +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than > +reported by :c:type:`v4l2_buffer` ``timestamp`` field. Is there a relationship between capture and output buffers w.r.t. the timestamp field? I am not aware that there is one. > + > +The contents of source ``OUTPUT`` buffers depend on active coded pixel > +format and might be affected by codec-specific extended controls, as stated > +in documentation of each format individually. in -> in the each format individually -> each format > + > +The client must not assume any direct relationship between ``CAPTURE`` > +and ``OUTPUT`` buffers and any specific timing of buffers becoming > +available to dequeue. 
Specifically: > + > +* a buffer queued to ``OUTPUT`` may result in no buffers being produced > + on ``CAPTURE`` (e.g. if it does not contain encoded data, or if only > + metadata syntax structures are present in it), > + > +* a buffer queued to ``OUTPUT`` may result in more than 1 buffer produced > + on ``CAPTURE`` (if the encoded data contained more than one frame, or if > + returning a decoded frame allowed the driver to return a frame that > + preceded it in decode, but succeeded it in display order), > + > +* a buffer queued to ``OUTPUT`` may result in a buffer being produced on > + ``CAPTURE`` later into decode process, and/or after processing further > + ``OUTPUT`` buffers, or be returned out of order, e.g. if display > + reordering is used, > + > +* buffers may become available on the ``CAPTURE`` queue without additional > + buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of > + ``OUTPUT`` buffers being queued in the past and decoding result of which > + being available only at later time, due to specifics of the decoding > + process. > + > +Seek > +==== > + > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected. > + > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via > + :c:func:`VIDIOC_STREAMOFF`. > + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > + > + * The driver must drop all the pending ``OUTPUT`` buffers and they are > + treated as returned to the client (following standard semantics). > + > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON` > + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > + > + * The driver must be put in a state after seek and be ready to "put in a state"??? > + accept new source bitstream buffers. > + > +3. 
Start queuing buffers to ``OUTPUT`` queue containing stream data after > + the seek until a suitable resume point is found. > + > + .. note:: > + > + There is no requirement to begin queuing stream starting exactly from > + a resume point (e.g. SPS or a keyframe). The driver must handle any > + data queued and must keep processing the queued buffers until it > + finds a suitable resume point. While looking for a resume point, the > + driver processes ``OUTPUT`` buffers and returns them to the client > + without producing any decoded frames. > + > + For hardware known to be mishandling seeks to a non-resume point, > + e.g. by returning corrupted decoded frames, the driver must be able > + to handle such seeks without a crash or any fatal decode error. > + > +4. After a resume point is found, the driver will start returning > + ``CAPTURE`` buffers with decoded frames. > + > + * There is no precise specification for ``CAPTURE`` queue of when it > + will start producing buffers containing decoded data from buffers > + queued after the seek, as it operates independently > + from ``OUTPUT`` queue. > + > + * The driver is allowed to and may return a number of remaining I'd drop 'is allowed to and'. > + ``CAPTURE`` buffers containing decoded frames from before the seek > + after the seek sequence (STREAMOFF-STREAMON) is performed. > + > + * The driver is also allowed to and may not return all decoded frames Ditto. > + queued but not decode before the seek sequence was initiated. For Very confusing sentence. I think you mean this: The driver may not return all decoded frames that were ready for dequeueing from before the seek sequence was initiated. Is this really true? Once decoded frames are marked as buffer_done by the driver there is no reason for them to be removed. Or you mean something else here, e.g. the frames are decoded, but the buffers not yet given back to vb2.
> + example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B), > + STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the > + following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’, > + H’}, {A’, G’, H’}, {G’, H’}. > + > + .. note:: > + > + To achieve instantaneous seek, the client may restart streaming on > + ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers. > + > +Pause > +===== > + > +In order to pause, the client should just cease queuing buffers onto the > +``OUTPUT`` queue. This is different from the general V4L2 API definition of > +pause, which involves calling :c:func:`VIDIOC_STREAMOFF` on the queue. > +Without source bitstream data, there is no data to process and the hardware > +remains idle. > + > +Conversely, using :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue indicates > +a seek, which > + > +1. drops all ``OUTPUT`` buffers in flight and > +2. after a subsequent :c:func:`VIDIOC_STREAMON`, will look for and only > + continue from a resume point. > + > +This is usually undesirable for pause. The STREAMOFF-STREAMON sequence is > +intended for seeking. > + > +Similarly, ``CAPTURE`` queue should remain streaming as well, as the the ``CAPTURE`` queue (add 'the') > +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer > +sets. 'changing buffer sets': not clear what is meant by this. It's certainly not 'solely' since it can also be used to achieve an instantaneous seek. > + > +Dynamic resolution change > +========================= > + > +A video decoder implementing this interface must support dynamic resolution > +change, for streams, which include resolution metadata in the bitstream. I think the commas can be removed from this sentence. I would also replace 'which' by 'that'. > +When the decoder encounters a resolution change in the stream, the dynamic > +resolution change sequence is started. > + > +1. 
After encountering a resolution change in the stream, the driver must > + first process and decode all remaining buffers from before the > + resolution change point. > + > +2. After all buffers containing decoded frames from before the resolution > + change point are ready to be dequeued on the ``CAPTURE`` queue, the > + driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change > + type ``V4L2_EVENT_SRC_CH_RESOLUTION``. > + > + * The last buffer from before the change must be marked with > + :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the spurious 'as'? > + drain sequence. The last buffer might be empty (with > + :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by the > + client, since it does not contain any decoded frame. any -> a > + > + * Any client query issued after the driver queues the event must return > + values applying to the stream after the resolution change, including > + queue formats, selection rectangles and controls. > + > + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and > + the event is signaled, the decoding process will not continue until > + it is acknowledged by either (re-)starting streaming on ``CAPTURE``, > + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` > + command. With (re-)starting streaming you mean a STREAMOFF/ON pair on the CAPTURE queue, right? > + > + .. note:: > + > + Any attempts to dequeue more buffers beyond the buffer marked > + with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from > + :c:func:`VIDIOC_DQBUF`. > + > +3. The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the new > + format information. This is identical to calling :c:func:`VIDIOC_G_FMT` > + after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in the initialization sequence > + and should be handled similarly. > + > + .. 
note:: > + > + It is allowed for the driver not to support the same pixel format as > + previously used (before the resolution change) for the new > + resolution. The driver must select a default supported pixel format, > + return it, if queried using :c:func:`VIDIOC_G_FMT`, and the client > + must take note of it. > + > +4. The client acquires visible resolution as in initialization sequence. > + > +5. *[optional]* The client is allowed to enumerate available formats and > + select a different one than currently chosen (returned via > + :c:func:`VIDIOC_G_FMT`). This is identical to a corresponding step in > + the initialization sequence. > + > +6. *[optional]* The client acquires minimum number of buffers as in > + initialization sequence. It's an optional step, but what might happen if you ignore it or if the control does not exist? You also should mention that this is the min number of CAPTURE buffers. I wonder if we should make these min buffer controls required. It might be easier that way. > +7. If all the following conditions are met, the client may resume the > + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with > + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain > + sequence: > + > + * ``sizeimage`` of new format is less than or equal to the size of > + currently allocated buffers, > + > + * the number of buffers currently allocated is greater than or equal to > + the minimum number of buffers acquired in step 6. You might want to mention that if there are insufficient buffers, then VIDIOC_CREATE_BUFS can be used to add more buffers. > + > + In such case, the remaining steps do not apply. > + > + However, if the client intends to change the buffer set, to lower > + memory usage or for any other reasons, it may be achieved by following > + the steps below. > + > +8. After dequeuing all remaining buffers from the ``CAPTURE`` queue, the > + client must call :c:func:`VIDIOC_STREAMOFF` on the ``CAPTURE`` queue.
> + The ``OUTPUT`` queue must remain streaming (calling STREAMOFF on it > + would trigger a seek). > + > +9. The client frees the buffers on the ``CAPTURE`` queue using > + :c:func:`VIDIOC_REQBUFS`. > + > + * **Required fields:** > + > + ``count`` > + set to 0 > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > + > + ``memory`` > + follows standard semantics > + > +10. The client allocates a new set of buffers for the ``CAPTURE`` queue via > + :c:func:`VIDIOC_REQBUFS`. This is identical to a corresponding step in > + the initialization sequence. > + > +11. The client resumes decoding by issuing :c:func:`VIDIOC_STREAMON` on the > + ``CAPTURE`` queue. > + > +During the resolution change sequence, the ``OUTPUT`` queue must remain > +streaming. Calling :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue would > +initiate a seek. > + > +The ``OUTPUT`` queue operates separately from the ``CAPTURE`` queue for the > +duration of the entire resolution change sequence. It is allowed (and > +recommended for best performance and simplicity) for the client to keep > +queuing/dequeuing buffers from/to ``OUTPUT`` queue even while processing > +this sequence. > + > +.. note:: > + > + It is also possible for this sequence to be triggered without a change > + in coded resolution, if a different number of ``CAPTURE`` buffers is > + required in order to continue decoding the stream or the visible > + resolution changes. > + > +Drain > +===== > + > +To ensure that all queued ``OUTPUT`` buffers have been processed and > +related ``CAPTURE`` buffers output to the client, the following drain > +sequence may be followed. After the drain sequence is complete, the client > +has received all decoded frames for all ``OUTPUT`` buffers queued before > +the sequence was started. > + > +1. Begin drain by issuing :c:func:`VIDIOC_DECODER_CMD`. 
> + > + * **Required fields:** > + > + ``cmd`` > + set to ``V4L2_DEC_CMD_STOP`` > + > + ``flags`` > + set to 0 > + > + ``pts`` > + set to 0 > + > +2. The driver must process and decode as normal all ``OUTPUT`` buffers > + queued by the client before the :c:func:`VIDIOC_DECODER_CMD` was issued. > + Any operations triggered as a result of processing these buffers > + (including the initialization and resolution change sequences) must be > + processed as normal by both the driver and the client before proceeding > + with the drain sequence. > + > +3. Once all ``OUTPUT`` buffers queued before ``V4L2_DEC_CMD_STOP`` are > + processed: > + > + * If the ``CAPTURE`` queue is streaming, once all decoded frames (if > + any) are ready to be dequeued on the ``CAPTURE`` queue, the driver > + must send a ``V4L2_EVENT_EOS``. The driver must also set > + ``V4L2_BUF_FLAG_LAST`` in :c:type:`v4l2_buffer` ``flags`` field on the > + buffer on the ``CAPTURE`` queue containing the last frame (if any) > + produced as a result of processing the ``OUTPUT`` buffers queued > + before ``V4L2_DEC_CMD_STOP``. If no more frames are left to be > + returned at the point of handling ``V4L2_DEC_CMD_STOP``, the driver > + must return an empty buffer (with :c:type:`v4l2_buffer` > + ``bytesused`` = 0) as the last buffer with ``V4L2_BUF_FLAG_LAST`` set > + instead. Any attempts to dequeue more buffers beyond the buffer marked > + with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from > + :c:func:`VIDIOC_DQBUF`. > + > + * If the ``CAPTURE`` queue is NOT streaming, no action is necessary for > + ``CAPTURE`` queue and the driver must send a ``V4L2_EVENT_EOS`` > + immediately after all ``OUTPUT`` buffers in question have been > + processed. > + > +4. At this point, decoding is paused and the driver will accept, but not > + process any newly queued ``OUTPUT`` buffers until the client issues > + ``V4L2_DEC_CMD_START`` or restarts streaming on any queue. 
> + > +* Once the drain sequence is initiated, the client needs to drive it to > + completion, as described by the above steps, unless it aborts the process > + by issuing :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue. The client > + is not allowed to issue ``V4L2_DEC_CMD_START`` or ``V4L2_DEC_CMD_STOP`` > + again while the drain sequence is in progress and they will fail with > + -EBUSY error code if attempted. > + > +* Restarting streaming on ``OUTPUT`` queue will implicitly end the paused > + state and reinitialize the decoder (similarly to the seek sequence). > + Restarting ``CAPTURE`` queue will not affect an in-progress drain > + sequence. > + > +* The drivers must also implement :c:func:`VIDIOC_TRY_DECODER_CMD`, as a > + way to let the client query the availability of decoder commands. > + > +End of stream > +============= > + > +If the decoder encounters an end of stream marking in the stream, the > +driver must send a ``V4L2_EVENT_EOS`` event to the client after all frames > +are decoded and ready to be dequeued on the ``CAPTURE`` queue, with the > +:c:type:`v4l2_buffer` ``flags`` set to ``V4L2_BUF_FLAG_LAST``. This > +behavior is identical to the drain sequence triggered by the client via > +``V4L2_DEC_CMD_STOP``. > + > +Commit points > +============= > + > +Setting formats and allocating buffers triggers changes in the behavior > +of the driver. > + > +1. Setting format on ``OUTPUT`` queue may change the set of formats Setting -> Setting the > + supported/advertised on the ``CAPTURE`` queue. In particular, it also > + means that ``CAPTURE`` format may be reset and the client must not that -> that the > + rely on the previously set format being preserved. > + > +2. Enumerating formats on ``CAPTURE`` queue must only return formats > + supported for the ``OUTPUT`` format currently set. > + > +3. Setting/changing format on ``CAPTURE`` queue does not change formats format -> the format > + available on ``OUTPUT`` queue. 
An attempt to set ``CAPTURE`` format that set -> set a > + is not supported for the currently selected ``OUTPUT`` format must > + result in the driver adjusting the requested format to an acceptable > + one. > + > +4. Enumerating formats on ``OUTPUT`` queue always returns the full set of > + supported coded formats, irrespective of the current ``CAPTURE`` > + format. > + > +5. After allocating buffers on the ``OUTPUT`` queue, it is not possible to > + change format on it. format -> the format > + > +To summarize, setting formats and allocation must always start with the > +``OUTPUT`` queue and the ``OUTPUT`` queue is the master that governs the > +set of supported formats for the ``CAPTURE`` queue. > diff --git a/Documentation/media/uapi/v4l/devices.rst b/Documentation/media/uapi/v4l/devices.rst > index fb7f8c26cf09..12d43fe711cf 100644 > --- a/Documentation/media/uapi/v4l/devices.rst > +++ b/Documentation/media/uapi/v4l/devices.rst > @@ -15,6 +15,7 @@ Interfaces > dev-output > dev-osd > dev-codec > + dev-decoder > dev-effect > dev-raw-vbi > dev-sliced-vbi > diff --git a/Documentation/media/uapi/v4l/v4l2.rst b/Documentation/media/uapi/v4l/v4l2.rst > index b89e5621ae69..65dc096199ad 100644 > --- a/Documentation/media/uapi/v4l/v4l2.rst > +++ b/Documentation/media/uapi/v4l/v4l2.rst > @@ -53,6 +53,10 @@ Authors, in alphabetical order: > > - Original author of the V4L2 API and documentation. > > +- Figa, Tomasz <tfiga@chromium.org> > + > + - Documented the memory-to-memory decoder interface. > + > - H Schimek, Michael <mschimek@gmx.at> > > - Original author of the V4L2 API and documentation. > @@ -61,6 +65,10 @@ Authors, in alphabetical order: > > - Documented the Digital Video timings API. > > +- Osciak, Pawel <posciak@chromium.org> > + > + - Documented the memory-to-memory decoder interface. > + > - Osciak, Pawel <pawel@osciak.com> > > - Designed and documented the multi-planar API. 
> @@ -85,7 +93,7 @@ Authors, in alphabetical order: > > - Designed and documented the VIDIOC_LOG_STATUS ioctl, the extended control ioctls, major parts of the sliced VBI API, the MPEG encoder and decoder APIs and the DV Timings API. > > -**Copyright** |copy| 1999-2016: Bill Dirks, Michael H. Schimek, Hans Verkuil, Martin Rubli, Andy Walls, Muralidharan Karicheri, Mauro Carvalho Chehab, Pawel Osciak, Sakari Ailus & Antti Palosaari. > +**Copyright** |copy| 1999-2018: Bill Dirks, Michael H. Schimek, Hans Verkuil, Martin Rubli, Andy Walls, Muralidharan Karicheri, Mauro Carvalho Chehab, Pawel Osciak, Sakari Ailus & Antti Palosaari, Tomasz Figa > > Except when explicitly stated as GPL, programming examples within this > part can be used and distributed without restrictions. > Regards, Hans
Hi Hans, On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > > Hi Tomasz, > > Many, many thanks for working on this! It's a great document and when done > it will be very useful indeed. > > Review comments follow... Thanks for review! > > On 24/07/18 16:06, Tomasz Figa wrote: [snip] > > +DPB > > + Decoded Picture Buffer; a H.264 term for a buffer that stores a picture > > a H.264 -> an H.264 > Ack. > > + that is encoded or decoded and available for reference in further > > + decode/encode steps. > > + > > +EOS > > + end of stream > > + > > +IDR > > + a type of a keyframe in H.264-encoded stream, which clears the list of > > + earlier reference frames (DPBs) > > You do not actually say what IDR stands for. Can you add that? > Ack. [snip] > > +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported > > + resolutions for a given format, passing desired pixel format in > > + :c:type:`v4l2_frmsizeenum` ``pixel_format``. > > + > > + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``OUTPUT`` > > + must include all possible coded resolutions supported by the decoder > > + for given coded pixel format. > > This is confusing. Since VIDIOC_ENUM_FRAMESIZES does not have a buffer type > argument you cannot say 'on OUTPUT'. I would remove 'on OUTPUT' entirely. > > > + > > + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``CAPTURE`` > > Ditto for 'on CAPTURE' > You're right. I didn't notice that the "type" field in v4l2_frmsizeenum was not buffer type, but type of the range. Thanks for spotting this. > > + must include all possible frame buffer resolutions supported by the > > + decoder for given raw pixel format and coded format currently set on > > + ``OUTPUT``. > > + > > + .. 
note:: > > + > > + The client may derive the supported resolution range for a > > + combination of coded and raw format by setting width and height of > > + ``OUTPUT`` format to 0 and calculating the intersection of > > + resolutions returned from calls to :c:func:`VIDIOC_ENUM_FRAMESIZES` > > + for the given coded and raw formats. > > So if the output format is set to 1280x720, then ENUM_FRAMESIZES would just > return 1280x720 as the resolution. If the output format is set to 0x0, then > it returns the full range it is capable of. > > Correct? > > If so, then I think this needs to be a bit more explicit. I had to think about > it a bit. > > Note that the v4l2_pix_format/S_FMT documentation needs to be updated as well > since we never allowed 0x0 before. Is there any text that disallows this? I couldn't spot any. Generally there are already drivers which return 0x0 for coded formats (s5p-mfc) and it's not even strange, because in such a case, the buffer contains just a sequence of bytes, not a 2D picture. > What if you set the format to 0x0 but the stream does not have meta data with > the resolution? How does userspace know if 0x0 is allowed or not? If this is > specific to the chosen coded pixel format, should we add a new flag for those > formats indicating that the coded data contains resolution information? Yes, this would definitely be on a per-format basis. Not sure what you mean by a flag, though? E.g. if the format is set to H264, then it's bound to include resolution information. If the format doesn't include it, then userspace is already aware of this fact, because it needs to get this from some other source (e.g. container). > > That way userspace knows if 0x0 can be used, and the driver can reject 0x0 > for formats that do not support it. As above, but I might be misunderstanding your suggestion. > > > + > > +4. 
Supported profiles and levels for given format, if applicable, may be > > + queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`. > > + > > +Initialization > > +============== > > + > > +1. *[optional]* Enumerate supported ``OUTPUT`` formats and resolutions. See > > + capability enumeration. > > capability enumeration. -> 'Querying capabilities' above. > Ack. > > + > > +2. Set the coded format on ``OUTPUT`` via :c:func:`VIDIOC_S_FMT` > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + ``pixelformat`` > > + a coded pixel format > > + > > + ``width``, ``height`` > > + required only if cannot be parsed from the stream for the given > > + coded format; optional otherwise - set to zero to ignore > > + > > + other fields > > + follow standard semantics > > + > > + * For coded formats including stream resolution information, if width > > + and height are set to non-zero values, the driver will propagate the > > + resolution to ``CAPTURE`` and signal a source change event > > + instantly. However, after the decoder is done parsing the > > + information embedded in the stream, it will update ``CAPTURE`` > > + format with new values and signal a source change event again, if > > + the values do not match. > > + > > + .. note:: > > + > > + Changing ``OUTPUT`` format may change currently set ``CAPTURE`` > > change -> change the Ack. > > > + format. The driver will derive a new ``CAPTURE`` format from > > from -> from the Ack. > > > + ``OUTPUT`` format being set, including resolution, colorimetry > > + parameters, etc. If the client needs a specific ``CAPTURE`` format, > > + it must adjust it afterwards. > > + > > +3. *[optional]* Get minimum number of buffers required for ``OUTPUT`` > > + queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to > > client -> the client Ack. > > > + use more buffers than minimum required by hardware/format. > > than -> than the Ack. 
[snip] > > +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` > > + on the ``CAPTURE`` queue. > > + > > + * **Required fields:** > > + > > + ``count`` > > + requested number of buffers to allocate; greater than zero > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > + > > + ``memory`` > > + follows standard semantics > > + > > + * **Return fields:** > > + > > + ``count`` > > + adjusted to allocated number of buffers > > + > > + * The driver must adjust count to minimum of required number of > > + destination buffers for given format and stream configuration and the > > + count passed. The client must check this value after the ioctl > > + returns to get the number of buffers allocated. > > + > > + .. note:: > > + > > + To allocate more than minimum number of buffers (for pipeline > > + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to > > + get minimum number of buffers required, and pass the obtained value > > + plus the number of additional buffers needed in count to > > + :c:func:`VIDIOC_REQBUFS`. > > > I think we should mention here the option of using VIDIOC_CREATE_BUFS in order > to allocate buffers larger than the current CAPTURE format in order to accommodate > future resolution changes. Ack. > > > + > > +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. > > + > > +Decoding > > +======== > > + > > +This state is reached after a successful initialization sequence. In this > > +state, client queues and dequeues buffers to both queues via > > +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard > > +semantics. > > + > > +Both queues operate independently, following standard behavior of V4L2 > > +buffer queues and memory-to-memory devices. In addition, the order of > > +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of > > +queuing coded frames to ``OUTPUT`` queue, due to properties of selected > > +coded format, e.g. frame reordering. 
The client must not assume any direct > > +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than > > +reported by :c:type:`v4l2_buffer` ``timestamp`` field. > > Is there a relationship between capture and output buffers w.r.t. the timestamp > field? I am not aware that there is one. I believe the decoder was expected to copy the timestamp of the matching OUTPUT buffer to the respective CAPTURE buffer. Both s5p-mfc and coda seem to be implementing it this way. I guess it might be a good idea to specify this more explicitly. > > > + > > +The contents of source ``OUTPUT`` buffers depend on active coded pixel > > +format and might be affected by codec-specific extended controls, as stated > > +in documentation of each format individually. > > in -> in the > each format individually -> each format > Ack. [snip] > > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON` > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + * The driver must be put in a state after seek and be ready to > > "put in a state"??? > I'm not sure what this was supposed to be. I guess just "The driver must start accepting new source bitstream buffers after the call returns." would be enough. > > + accept new source bitstream buffers. > > + > > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after > > + the seek until a suitable resume point is found. > > + > > + .. note:: > > + > > + There is no requirement to begin queuing stream starting exactly from > > + a resume point (e.g. SPS or a keyframe). The driver must handle any > > + data queued and must keep processing the queued buffers until it > > + finds a suitable resume point. While looking for a resume point, the > > + driver processes ``OUTPUT`` buffers and returns them to the client > > + without producing any decoded frames. > > + > > + For hardware known to be mishandling seeks to a non-resume point, > > + e.g. 
by returning corrupted decoded frames, the driver must be able > > + to handle such seeks without a crash or any fatal decode error. > > + > > +4. After a resume point is found, the driver will start returning > > + ``CAPTURE`` buffers with decoded frames. > > + > > + * There is no precise specification for ``CAPTURE`` queue of when it > > + will start producing buffers containing decoded data from buffers > > + queued after the seek, as it operates independently > > + from ``OUTPUT`` queue. > > + > > + * The driver is allowed to and may return a number of remaining > > I'd drop 'is allowed to and'. > Ack. > > + ``CAPTURE`` buffers containing decoded frames from before the seek > > + after the seek sequence (STREAMOFF-STREAMON) is performed. > > + > > + * The driver is also allowed to and may not return all decoded frames > > Ditto. Ack. > > > + queued but not decode before the seek sequence was initiated. For > > Very confusing sentence. I think you mean this: > > The driver may not return all decoded frames that where ready for > dequeueing from before the seek sequence was initiated. > > Is this really true? Once decoded frames are marked as buffer_done by the > driver there is no reason for them to be removed. Or you mean something else > here, e.g. the frames are decoded, but the buffers not yet given back to vb2. > Exactly "the frames are decoded, but the buffers not yet given back to vb2", for example, if reordering takes place. However, if one stops streaming before dequeuing all buffers, they are implicitly returned (reset to the state after REQBUFS) and can't be dequeued anymore, so the frames are lost, even if the driver returned them. I guess the sentence was really unfortunate indeed. > > + example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B), > > + STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the > > + following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’, > > + H’}, {A’, G’, H’}, {G’, H’}. > > + > > + .. 
note:: > > + > > + To achieve instantaneous seek, the client may restart streaming on > > + ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers. > > + > > +Pause > > +===== > > + > > +In order to pause, the client should just cease queuing buffers onto the > > +``OUTPUT`` queue. This is different from the general V4L2 API definition of > > +pause, which involves calling :c:func:`VIDIOC_STREAMOFF` on the queue. > > +Without source bitstream data, there is no data to process and the hardware > > +remains idle. > > + > > +Conversely, using :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue indicates > > +a seek, which > > + > > +1. drops all ``OUTPUT`` buffers in flight and > > +2. after a subsequent :c:func:`VIDIOC_STREAMON`, will look for and only > > + continue from a resume point. > > + > > +This is usually undesirable for pause. The STREAMOFF-STREAMON sequence is > > +intended for seeking. > > + > > +Similarly, ``CAPTURE`` queue should remain streaming as well, as the > > the ``CAPTURE`` queue > > (add 'the') > Ack. > > +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer > > +sets. > > 'changing buffer sets': not clear what is meant by this. It's certainly not > 'solely' since it can also be used to achieve an instantaneous seek. > To be honest, I'm not sure whether there is even a need to include this whole section. It's obvious that if you stop feeding a mem2mem device, it will pause. Moreover, other sections imply various behaviors triggered by STREAMOFF/STREAMON/DECODER_CMD/etc., so it should be quite clear that they are different from a simple pause. What do you think? > > + > > +Dynamic resolution change > > +========================= > > + > > +A video decoder implementing this interface must support dynamic resolution > > +change, for streams, which include resolution metadata in the bitstream. > > I think the commas can be removed from this sentence. I would also replace > 'which' by 'that'. > Ack. 
> > +When the decoder encounters a resolution change in the stream, the dynamic > > +resolution change sequence is started. > > + > > +1. After encountering a resolution change in the stream, the driver must > > + first process and decode all remaining buffers from before the > > + resolution change point. > > + > > +2. After all buffers containing decoded frames from before the resolution > > + change point are ready to be dequeued on the ``CAPTURE`` queue, the > > + driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change > > + type ``V4L2_EVENT_SRC_CH_RESOLUTION``. > > + > > + * The last buffer from before the change must be marked with > > + :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the > > spurious 'as'? > It should be: * The last buffer from before the change must be marked with the ``V4L2_BUF_FLAG_LAST`` flag in :c:type:`v4l2_buffer` ``flags`` field, similarly to the > > + drain sequence. The last buffer might be empty (with > > + :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by the > > + client, since it does not contain any decoded frame. > > any -> a > Ack. > > + > > + * Any client query issued after the driver queues the event must return > > + values applying to the stream after the resolution change, including > > + queue formats, selection rectangles and controls. > > + > > + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and > > + the event is signaled, the decoding process will not continue until > > + it is acknowledged by either (re-)starting streaming on ``CAPTURE``, > > + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` > > + command. > > With (re-)starting streaming you mean a STREAMOFF/ON pair on the CAPTURE queue, > right? > Right. I guess it might be better to just state that explicitly. > > + > > + .. 
note:: > > + > > + Any attempts to dequeue more buffers beyond the buffer marked > > + with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from > > + :c:func:`VIDIOC_DQBUF`. > > + > > +3. The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the new > > + format information. This is identical to calling :c:func:`VIDIOC_G_FMT` > > + after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in the initialization sequence > > + and should be handled similarly. > > + > > + .. note:: > > + > > + It is allowed for the driver not to support the same pixel format as > > + previously used (before the resolution change) for the new > > + resolution. The driver must select a default supported pixel format, > > + return it, if queried using :c:func:`VIDIOC_G_FMT`, and the client > > + must take note of it. > > + > > +4. The client acquires visible resolution as in initialization sequence. > > + > > +5. *[optional]* The client is allowed to enumerate available formats and > > + select a different one than currently chosen (returned via > > + :c:func:`VIDIOC_G_FMT)`. This is identical to a corresponding step in > > + the initialization sequence. > > + > > +6. *[optional]* The client acquires minimum number of buffers as in > > + initialization sequence. > > It's an optional step, but what might happen if you ignore it or if the control > does not exist? REQBUFS is supposed to clamp the requested number of buffers to the [min, max] range anyway. > > You also should mention that this is the min number of CAPTURE buffers. > > I wonder if we should make these min buffer controls required. It might be easier > that way. Agreed. Although userspace is still free to ignore it, because REQBUFS would do the right thing anyway. > > > +7. 
If all the following conditions are met, the client may resume the > > + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with > > + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain > > + sequence: > > + > > + * ``sizeimage`` of new format is less than or equal to the size of > > + currently allocated buffers, > > + > > + * the number of buffers currently allocated is greater than or equal to > > + the minimum number of buffers acquired in step 6. > > You might want to mention that if there are insufficient buffers, then > VIDIOC_CREATE_BUFS can be used to add more buffers. > This might be a bit tricky, since at least s5p-mfc and coda can only work on a fixed buffer set and one would need to fully reinitialize the decoding to add one more buffer, which would effectively be the full resolution change sequence, as below, just with REQBUFS(0), REQBUFS(N) replaced with CREATE_BUFS. We should mention CREATE_BUFS as an alternative to steps 9 and 10, though. > > + > > + In such case, the remaining steps do not apply. > > + > > + However, if the client intends to change the buffer set, to lower > > + memory usage or for any other reasons, it may be achieved by following > > + the steps below. > > + > > +8. After dequeuing all remaining buffers from the ``CAPTURE`` queue, the > > + client must call :c:func:`VIDIOC_STREAMOFF` on the ``CAPTURE`` queue. > > + The ``OUTPUT`` queue must remain streaming (calling STREAMOFF on it > > + would trigger a seek). > > + > > +9. The client frees the buffers on the ``CAPTURE`` queue using > > + :c:func:`VIDIOC_REQBUFS`. > > + > > + * **Required fields:** > > + > > + ``count`` > > + set to 0 > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > + > > + ``memory`` > > + follows standard semantics > > + > > +10. The client allocates a new set of buffers for the ``CAPTURE`` queue via > > + :c:func:`VIDIOC_REQBUFS`. 
This is identical to a corresponding step in > > + the initialization sequence. [snip] > > + > > +Commit points > > +============= > > + > > +Setting formats and allocating buffers triggers changes in the behavior > > +of the driver. > > + > > +1. Setting format on ``OUTPUT`` queue may change the set of formats > > Setting -> Setting the > Ack. > > + supported/advertised on the ``CAPTURE`` queue. In particular, it also > > + means that ``CAPTURE`` format may be reset and the client must not > > that -> that the > Ack. > > + rely on the previously set format being preserved. > > + > > +2. Enumerating formats on ``CAPTURE`` queue must only return formats > > + supported for the ``OUTPUT`` format currently set. > > + > > +3. Setting/changing format on ``CAPTURE`` queue does not change formats > > format -> the format > Ack. > > + available on ``OUTPUT`` queue. An attempt to set ``CAPTURE`` format that > > set -> set a > Ack. > > + is not supported for the currently selected ``OUTPUT`` format must > > + result in the driver adjusting the requested format to an acceptable > > + one. > > + > > +4. Enumerating formats on ``OUTPUT`` queue always returns the full set of > > + supported coded formats, irrespective of the current ``CAPTURE`` > > + format. > > + > > +5. After allocating buffers on the ``OUTPUT`` queue, it is not possible to > > + change format on it. > > format -> the format > Ack. Best regards, Tomasz
On Thu, 2018-07-26 at 19:20 +0900, Tomasz Figa wrote: [...] > > You might want to mention that if there are insufficient buffers, then > > VIDIOC_CREATE_BUFS can be used to add more buffers. > > > > This might be a bit tricky, since at least s5p-mfc and coda can only > work on a fixed buffer set and one would need to fully reinitialize > the decoding to add one more buffer, which would effectively be the > full resolution change sequence, as below, just with REQBUFS(0), > REQBUFS(N) replaced with CREATE_BUFS. The coda driver supports CREATE_BUFS on the decoder CAPTURE queue. The firmware indeed needs a fixed frame buffer set, but these buffers are internal only and in a coda specific tiling format. The content of finished internal buffers is copied / detiled into the external CAPTURE buffers, so those can be added at will. regards Philipp
On 26/07/18 12:20, Tomasz Figa wrote: > Hi Hans, > > On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: >> >> Hi Tomasz, >> >> Many, many thanks for working on this! It's a great document and when done >> it will be very useful indeed. >> >> Review comments follow... > > Thanks for review! > >> >> On 24/07/18 16:06, Tomasz Figa wrote: > [snip] >> Note that the v4l2_pix_format/S_FMT documentation needs to be updated as well >> since we never allowed 0x0 before. > > Is there any text that disallows this? I couldn't spot any. Generally > there are already drivers which return 0x0 for coded formats (s5p-mfc) > and it's not even strange, because in such case, the buffer contains > just a sequence of bytes, not a 2D picture. All non-m2m devices will always have non-zero width/height values. Only with m2m devices do we see this. This was probably never documented since before m2m appeared it was 'obvious'. This definitely needs to be documented, though. > >> What if you set the format to 0x0 but the stream does not have meta data with >> the resolution? How does userspace know if 0x0 is allowed or not? If this is >> specific to the chosen coded pixel format, should be add a new flag for those >> formats indicating that the coded data contains resolution information? > > Yes, this would definitely be on a per-format basis. Not sure what you > mean by a flag, though? E.g. if the format is set to H264, then it's > bound to include resolution information. If the format doesn't include > it, then userspace is already aware of this fact, because it needs to > get this from some other source (e.g. container). > >> >> That way userspace knows if 0x0 can be used, and the driver can reject 0x0 >> for formats that do not support it. > > As above, but I might be misunderstanding your suggestion. So my question is: is this tied to the pixel format, or should we make it explicit with a flag like V4L2_FMT_FLAG_CAN_DECODE_WXH. 
The advantage of a flag is that you don't need a switch on the format to know whether or not 0x0 is allowed. And the flag can just be set in v4l2-ioctls.c. >>> +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer >>> +sets. >> >> 'changing buffer sets': not clear what is meant by this. It's certainly not >> 'solely' since it can also be used to achieve an instantaneous seek. >> > > To be honest, I'm not sure whether there is even a need to include > this whole section. It's obvious that if you stop feeding a mem2mem > device, it will pause. Moreover, other sections imply various > behaviors triggered by STREAMOFF/STREAMON/DECODER_CMD/etc., so it > should be quite clear that they are different from a simple pause. > What do you think? Yes, I'd drop this last sentence ('Similarly...sets'). >>> +2. After all buffers containing decoded frames from before the resolution >>> + change point are ready to be dequeued on the ``CAPTURE`` queue, the >>> + driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change >>> + type ``V4L2_EVENT_SRC_CH_RESOLUTION``. >>> + >>> + * The last buffer from before the change must be marked with >>> + :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the >> >> spurious 'as'? >> > > It should be: > > * The last buffer from before the change must be marked with > the ``V4L2_BUF_FLAG_LAST`` flag in :c:type:`v4l2_buffer` ``flags`` field, > similarly to the Ah, OK. Now I get it. >> I wonder if we should make these min buffer controls required. It might be easier >> that way. > > Agreed. Although userspace is still free to ignore it, because REQBUFS > would do the right thing anyway. It's never been entirely clear to me what the purpose of those min buffers controls is. REQBUFS ensures that the number of buffers is at least the minimum needed to make the HW work. So why would you need these controls? It only makes sense if they return something different from REQBUFS. > >> >>> +7. 
If all the following conditions are met, the client may resume the >>> + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with >>> + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain >>> + sequence: >>> + >>> + * ``sizeimage`` of new format is less than or equal to the size of >>> + currently allocated buffers, >>> + >>> + * the number of buffers currently allocated is greater than or equal to >>> + the minimum number of buffers acquired in step 6. >> >> You might want to mention that if there are insufficient buffers, then >> VIDIOC_CREATE_BUFS can be used to add more buffers. >> > > This might be a bit tricky, since at least s5p-mfc and coda can only > work on a fixed buffer set and one would need to fully reinitialize > the decoding to add one more buffer, which would effectively be the > full resolution change sequence, as below, just with REQBUFS(0), > REQBUFS(N) replaced with CREATE_BUFS. What happens today in those drivers if you try to call CREATE_BUFS? Regards, Hans
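To make the flag idea concrete, here is a userspace-side sketch. Note that V4L2_FMT_FLAG_CAN_DECODE_WXH is only being proposed above, not an existing UAPI flag, and the helper name is illustrative; the fourcc values are the real ones from linux/videodev2.h, redefined here only so the snippet stands alone:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard V4L2 fourcc packing, as defined in linux/videodev2.h. */
#define v4l2_fourcc(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define V4L2_PIX_FMT_H264 v4l2_fourcc('H', '2', '6', '4')
#define V4L2_PIX_FMT_HEVC v4l2_fourcc('H', 'E', 'V', 'C')
#define V4L2_PIX_FMT_VP8  v4l2_fourcc('V', 'P', '8', '0')
#define V4L2_PIX_FMT_VP9  v4l2_fourcc('V', 'P', '9', '0')
#define V4L2_PIX_FMT_NV12 v4l2_fourcc('N', 'V', '1', '2')

/*
 * Hypothetical equivalent of the proposed V4L2_FMT_FLAG_CAN_DECODE_WXH:
 * true when the coded format itself carries resolution metadata (SPS
 * for H.264/HEVC, frame headers for VP8/VP9), so an OUTPUT format of
 * 0x0 would be acceptable. This is exactly the per-format switch the
 * flag would let userspace (and drivers) avoid.
 */
static bool format_can_decode_wxh(uint32_t pixelformat)
{
	switch (pixelformat) {
	case V4L2_PIX_FMT_H264:
	case V4L2_PIX_FMT_HEVC:
	case V4L2_PIX_FMT_VP8:
	case V4L2_PIX_FMT_VP9:
		return true;
	default:
		return false;
	}
}
```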
On 07/24/2018 04:06 PM, Tomasz Figa wrote: > Due to complexity of the video decoding process, the V4L2 drivers of > stateful decoder hardware require specific sequences of V4L2 API calls > to be followed. These include capability enumeration, initialization, > decoding, seek, pause, dynamic resolution change, drain and end of > stream. > > Specifics of the above have been discussed during Media Workshops at > LinuxCon Europe 2012 in Barcelona and then later Embedded Linux > Conference Europe 2014 in Düsseldorf. The de facto Codec API that > originated at those events was later implemented by the drivers we already > have merged in mainline, such as s5p-mfc or coda. > > The only thing missing was the real specification included as a part of > Linux Media documentation. Fix it now and document the decoder part of > the Codec API. > > Signed-off-by: Tomasz Figa <tfiga@chromium.org> > --- > Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++ > Documentation/media/uapi/v4l/devices.rst | 1 + > Documentation/media/uapi/v4l/v4l2.rst | 10 +- > 3 files changed, 882 insertions(+), 1 deletion(-) > create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst > > diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst b/Documentation/media/uapi/v4l/dev-decoder.rst > new file mode 100644 > index 000000000000..f55d34d2f860 > --- /dev/null > +++ b/Documentation/media/uapi/v4l/dev-decoder.rst > @@ -0,0 +1,872 @@ <snip> > +6. This step only applies to coded formats that contain resolution > + information in the stream. Continue queuing/dequeuing bitstream > + buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and > + :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning > + each buffer to the client until required metadata to configure the > + ``CAPTURE`` queue are found. This is indicated by the driver sending > + a ``V4L2_EVENT_SOURCE_CHANGE`` event with > + ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. 
There is no > + requirement to pass enough data for this to occur in the first buffer > + and the driver must be able to process any number. > + > + * If data in a buffer that triggers the event is required to decode > + the first frame, the driver must not return it to the client, > + but must retain it for further decoding. > + > + * If the client set width and height of ``OUTPUT`` format to 0, calling > + :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM, > + until the driver configures ``CAPTURE`` format according to stream > + metadata. What about calling TRY/S_FMT on the capture queue: will this also return -EPERM? I assume so. Regards, Hans
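The 0x0 semantics under discussion boil down to a small decision function. The sketch below models the behaviour as proposed, including the assumption raised above that TRY_FMT/S_FMT on the CAPTURE queue would mirror G_FMT; it is not taken from any merged driver:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Client-visible result of a format ioctl (G_FMT, and, per the question
 * above, presumably TRY_FMT/S_FMT too) on the CAPTURE queue:
 *
 *   - if the OUTPUT format was set to 0x0 and the driver has not yet
 *     parsed resolution metadata (i.e. no V4L2_EVENT_SOURCE_CHANGE with
 *     V4L2_EVENT_SRC_CH_RESOLUTION has been delivered), the ioctl
 *     fails with EPERM;
 *   - otherwise the driver-configured CAPTURE format is available.
 */
static int capture_fmt_ioctl_result(bool output_fmt_was_0x0,
				    bool src_change_event_seen)
{
	if (output_fmt_was_0x0 && !src_change_event_seen)
		return -EPERM;	/* CAPTURE format not known yet */
	return 0;		/* format derived from stream metadata */
}
```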
On Thu, Jul 26, 2018 at 7:36 PM Philipp Zabel <p.zabel@pengutronix.de> wrote: > > On Thu, 2018-07-26 at 19:20 +0900, Tomasz Figa wrote: > [...] > > > You might want to mention that if there are insufficient buffers, then > > > VIDIOC_CREATE_BUFS can be used to add more buffers. > > > > > > > This might be a bit tricky, since at least s5p-mfc and coda can only > > work on a fixed buffer set and one would need to fully reinitialize > > the decoding to add one more buffer, which would effectively be the > > full resolution change sequence, as below, just with REQBUFS(0), > > REQBUFS(N) replaced with CREATE_BUFS. > > The coda driver supports CREATE_BUFS on the decoder CAPTURE queue. > > The firmware indeed needs a fixed frame buffer set, but these buffers > are internal only and in a coda specific tiling format. The content of > finished internal buffers is copied / detiled into the external CAPTURE > buffers, so those can be added at will. Thanks for clarifying. I forgot about that internal copy indeed. Best regards, Tomasz
On Thu, Jul 26, 2018 at 7:57 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > > On 26/07/18 12:20, Tomasz Figa wrote: > > Hi Hans, > > > > On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > >> > >> Hi Tomasz, > >> > >> Many, many thanks for working on this! It's a great document and when done > >> it will be very useful indeed. > >> > >> Review comments follow... > > > > Thanks for review! > > > >> > >> On 24/07/18 16:06, Tomasz Figa wrote: > > > [snip] > > >> Note that the v4l2_pix_format/S_FMT documentation needs to be updated as well > >> since we never allowed 0x0 before. > > > > Is there any text that disallows this? I couldn't spot any. Generally > > there are already drivers which return 0x0 for coded formats (s5p-mfc) > > and it's not even strange, because in such case, the buffer contains > > just a sequence of bytes, not a 2D picture. > > All non-m2m devices will always have non-zero width/height values. Only with > m2m devices do we see this. > > This was probably never documented since before m2m appeared it was 'obvious'. > > This definitely needs to be documented, though. > Fair enough. Let me try to add a note there. > > > >> What if you set the format to 0x0 but the stream does not have meta data with > >> the resolution? How does userspace know if 0x0 is allowed or not? If this is > >> specific to the chosen coded pixel format, should be add a new flag for those > >> formats indicating that the coded data contains resolution information? > > > > Yes, this would definitely be on a per-format basis. Not sure what you > > mean by a flag, though? E.g. if the format is set to H264, then it's > > bound to include resolution information. If the format doesn't include > > it, then userspace is already aware of this fact, because it needs to > > get this from some other source (e.g. container). > > > >> > >> That way userspace knows if 0x0 can be used, and the driver can reject 0x0 > >> for formats that do not support it. 
> > > > As above, but I might be misunderstanding your suggestion. > > So my question is: is this tied to the pixel format, or should we make it > explicit with a flag like V4L2_FMT_FLAG_CAN_DECODE_WXH. > > The advantage of a flag is that you don't need a switch on the format to > know whether or not 0x0 is allowed. And the flag can just be set in > v4l2-ioctls.c. As far as my understanding goes, what data is included in the stream is definitely specified by the format. For example, an H264 elementary stream will always include those data as a part of the SPS. However, having such a flag internally, not exposed to userspace, could indeed be useful to avoid all drivers having such a switch. That wouldn't belong to this documentation, though, since it would be just kernel API. > > >>> +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer > >>> +sets. > >> > >> 'changing buffer sets': not clear what is meant by this. It's certainly not > >> 'solely' since it can also be used to achieve an instantaneous seek. > >> > > > > To be honest, I'm not sure whether there is even a need to include > > this whole section. It's obvious that if you stop feeding a mem2mem > > device, it will pause. Moreover, other sections imply various > > behaviors triggered by STREAMOFF/STREAMON/DECODER_CMD/etc., so it > > should be quite clear that they are different from a simple pause. > > What do you think? > > Yes, I'd drop this last sentence ('Similarly...sets'). > Ack. > >>> +2. After all buffers containing decoded frames from before the resolution > >>> + change point are ready to be dequeued on the ``CAPTURE`` queue, the > >>> + driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change > >>> + type ``V4L2_EVENT_SRC_CH_RESOLUTION``. > >>> + > >>> + * The last buffer from before the change must be marked with > >>> + :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the > >> > >> spurious 'as'?
> >> > > > > It should be: > > > > * The last buffer from before the change must be marked with > > the ``V4L2_BUF_FLAG_LAST`` flag in :c:type:`v4l2_buffer` ``flags`` field, > > similarly to the > > Ah, OK. Now I get it. > > >> I wonder if we should make these min buffer controls required. It might be easier > >> that way. > > > > Agreed. Although userspace is still free to ignore it, because REQBUFS > > would do the right thing anyway. > > It's never been entirely clear to me what the purpose of those min buffers controls > is. REQBUFS ensures that the number of buffers is at least the minimum needed to > make the HW work. So why would you need these controls? It only makes sense if they > return something different from REQBUFS. > The purpose of those controls is to let the client allocate a number of buffers bigger than minimum, without the need to allocate the minimum number of buffers first (to just learn the number), free them and then allocate a bigger number again. > > > >> > >>> +7. If all the following conditions are met, the client may resume the > >>> + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with > >>> + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain > >>> + sequence: > >>> + > >>> + * ``sizeimage`` of new format is less than or equal to the size of > >>> + currently allocated buffers, > >>> + > >>> + * the number of buffers currently allocated is greater than or equal to > >>> + the minimum number of buffers acquired in step 6. > >> > >> You might want to mention that if there are insufficient buffers, then > >> VIDIOC_CREATE_BUFS can be used to add more buffers. > >> > > > > This might be a bit tricky, since at least s5p-mfc and coda can only > > work on a fixed buffer set and one would need to fully reinitialize > > the decoding to add one more buffer, which would effectively be the > > full resolution change sequence, as below, just with REQBUFS(0), > > REQBUFS(N) replaced with CREATE_BUFS. 
> > What happens today in those drivers if you try to call CREATE_BUFS? s5p-mfc doesn't set the .vidioc_create_bufs pointer in its v4l2_ioctl_ops, so I suppose that would be -ENOTTY? Best regards, Tomasz
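The client-side consequence of this split (coda accepting VIDIOC_CREATE_BUFS, s5p-mfc returning ENOTTY) can be sketched as a fallback policy. The enum and helper names are illustrative, not existing API:

```c
#include <assert.h>
#include <errno.h>

enum grow_action {
	GROW_DONE,		/* CREATE_BUFS succeeded; keep streaming */
	GROW_FULL_REINIT,	/* no CREATE_BUFS support: STREAMOFF,
				 * REQBUFS(0), REQBUFS(N + 1), STREAMON,
				 * i.e. the full reinitialization sequence
				 * mentioned above */
	GROW_FATAL,		/* unexpected error */
};

/*
 * Hypothetical client policy for adding one more CAPTURE buffer:
 * try VIDIOC_CREATE_BUFS first and fall back to full reinitialization
 * only when the driver does not implement it at all.
 */
static enum grow_action grow_capture_queue(int create_bufs_errno)
{
	if (create_bufs_errno == 0)
		return GROW_DONE;
	if (create_bufs_errno == ENOTTY)
		return GROW_FULL_REINIT;
	return GROW_FATAL;
}
```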
On Mon, Jul 30, 2018 at 9:52 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > > On 07/24/2018 04:06 PM, Tomasz Figa wrote: > > Due to complexity of the video decoding process, the V4L2 drivers of > > stateful decoder hardware require specific sequences of V4L2 API calls > > to be followed. These include capability enumeration, initialization, > > decoding, seek, pause, dynamic resolution change, drain and end of > > stream. > > > > Specifics of the above have been discussed during Media Workshops at > > LinuxCon Europe 2012 in Barcelona and then later Embedded Linux > > Conference Europe 2014 in Düsseldorf. The de facto Codec API that > > originated at those events was later implemented by the drivers we already > > have merged in mainline, such as s5p-mfc or coda. > > > > The only thing missing was the real specification included as a part of > > Linux Media documentation. Fix it now and document the decoder part of > > the Codec API. > > > > Signed-off-by: Tomasz Figa <tfiga@chromium.org> > > --- > > Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++ > > Documentation/media/uapi/v4l/devices.rst | 1 + > > Documentation/media/uapi/v4l/v4l2.rst | 10 +- > > 3 files changed, 882 insertions(+), 1 deletion(-) > > create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst > > > > diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst b/Documentation/media/uapi/v4l/dev-decoder.rst > > new file mode 100644 > > index 000000000000..f55d34d2f860 > > --- /dev/null > > +++ b/Documentation/media/uapi/v4l/dev-decoder.rst > > @@ -0,0 +1,872 @@ > > <snip> > > > +6. This step only applies to coded formats that contain resolution > > + information in the stream. Continue queuing/dequeuing bitstream > > + buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and > > + :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning > > + each buffer to the client until required metadata to configure the > > + ``CAPTURE`` queue are found. 
This is indicated by the driver sending > > + a ``V4L2_EVENT_SOURCE_CHANGE`` event with > > + ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no > > + requirement to pass enough data for this to occur in the first buffer > > + and the driver must be able to process any number. > > + > > + * If data in a buffer that triggers the event is required to decode > > + the first frame, the driver must not return it to the client, > > + but must retain it for further decoding. > > + > > + * If the client set width and height of ``OUTPUT`` format to 0, calling > > + :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM, > > + until the driver configures ``CAPTURE`` format according to stream > > + metadata. > > What about calling TRY/S_FMT on the capture queue: will this also return -EPERM? > I assume so. We should make it so indeed, to make things consistent. On another note, I don't really like this -EPERM here, as one could just see that the format is 0x0 and know that it's not valid. This is only needed for legacy userspace that doesn't handle the source change event in initial stream parsing and just checks whether G_FMT returns an error instead. Nicolas, for more insight here. Best regards, Tomasz
On 07/26/2018 12:20 PM, Tomasz Figa wrote: > Hi Hans, > > On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: >>> + >>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. >>> + >>> +Decoding >>> +======== >>> + >>> +This state is reached after a successful initialization sequence. In this >>> +state, client queues and dequeues buffers to both queues via >>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard >>> +semantics. >>> + >>> +Both queues operate independently, following standard behavior of V4L2 >>> +buffer queues and memory-to-memory devices. In addition, the order of >>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of >>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected >>> +coded format, e.g. frame reordering. The client must not assume any direct >>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than >>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field. >> >> Is there a relationship between capture and output buffers w.r.t. the timestamp >> field? I am not aware that there is one. > > I believe the decoder was expected to copy the timestamp of matching > OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem > to be implementing it this way. I guess it might be a good idea to > specify this more explicitly. What about an output buffer producing multiple capture buffers? Or the case where the encoded bitstream of a frame starts at one output buffer and ends at another? What happens if you have B frames and the order of the capture buffers is different from the output buffers? In other words, for codecs there is no clear 1-to-1 relationship between an output buffer and a capture buffer. And we never defined what the 'copy timestamp' behavior should be in that case or if it even makes sense. Regards, Hans
On 08/07/2018 09:05 AM, Tomasz Figa wrote: > On Thu, Jul 26, 2018 at 7:57 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: >>>> What if you set the format to 0x0 but the stream does not have meta data with >>>> the resolution? How does userspace know if 0x0 is allowed or not? If this is >>>> specific to the chosen coded pixel format, should be add a new flag for those >>>> formats indicating that the coded data contains resolution information? >>> >>> Yes, this would definitely be on a per-format basis. Not sure what you >>> mean by a flag, though? E.g. if the format is set to H264, then it's >>> bound to include resolution information. If the format doesn't include >>> it, then userspace is already aware of this fact, because it needs to >>> get this from some other source (e.g. container). >>> >>>> >>>> That way userspace knows if 0x0 can be used, and the driver can reject 0x0 >>>> for formats that do not support it. >>> >>> As above, but I might be misunderstanding your suggestion. >> >> So my question is: is this tied to the pixel format, or should we make it >> explicit with a flag like V4L2_FMT_FLAG_CAN_DECODE_WXH. >> >> The advantage of a flag is that you don't need a switch on the format to >> know whether or not 0x0 is allowed. And the flag can just be set in >> v4l2-ioctls.c. > > As far as my understanding goes, what data is included in the stream > is definitely specified by format. For example, a H264 elementary > stream will always include those data as a part of SPS. > > However, having such flag internally, not exposed to userspace, could > indeed be useful to avoid all drivers have such switch. That wouldn't > belong to this documentation, though, since it would be just kernel > API. Why would you keep this internally only? >>>> I wonder if we should make these min buffer controls required. It might be easier >>>> that way. >>> >>> Agreed. Although userspace is still free to ignore it, because REQBUFS >>> would do the right thing anyway. 
>> >> It's never been entirely clear to me what the purpose of those min buffers controls >> is. REQBUFS ensures that the number of buffers is at least the minimum needed to >> make the HW work. So why would you need these controls? It only makes sense if they >> return something different from REQBUFS. >> > > The purpose of those controls is to let the client allocate a number > of buffers bigger than minimum, without the need to allocate the > minimum number of buffers first (to just learn the number), free them > and then allocate a bigger number again. I don't feel this is particularly useful. One problem with the minimum number of buffers as used in the kernel is that it is often the minimum number of buffers required to make the hardware work, but it may not be optimal. E.g. quite a few capture drivers set the minimum to 2, which is enough for the hardware, but it will likely lead to dropped frames. You really need 3 (one is being DMAed, one is queued and linked into the DMA engine and one is being processed by userspace). I would actually prefer this to be the recommended minimum number of buffers, which is >= the minimum REQBUFS uses. I.e., if you use this number and you have no special requirements, then you'll get good performance. > >>> >>>> >>>>> +7. If all the following conditions are met, the client may resume the >>>>> + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with >>>>> + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain >>>>> + sequence: >>>>> + >>>>> + * ``sizeimage`` of new format is less than or equal to the size of >>>>> + currently allocated buffers, >>>>> + >>>>> + * the number of buffers currently allocated is greater than or equal to >>>>> + the minimum number of buffers acquired in step 6. >>>> >>>> You might want to mention that if there are insufficient buffers, then >>>> VIDIOC_CREATE_BUFS can be used to add more buffers. 
>>>> >>> >>> This might be a bit tricky, since at least s5p-mfc and coda can only >>> work on a fixed buffer set and one would need to fully reinitialize >>> the decoding to add one more buffer, which would effectively be the >>> full resolution change sequence, as below, just with REQBUFS(0), >>> REQBUFS(N) replaced with CREATE_BUFS. >> >> What happens today in those drivers if you try to call CREATE_BUFS? > > s5p-mfc doesn't set the .vidioc_create_bufs pointer in its > v4l2_ioctl_ops, so I suppose that would be -ENOTTY? Correct for s5p-mfc. Regards, Hans
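The "REQBUFS does the right thing anyway" point can be shown with a sketch of the clamping REQBUFS is described as performing (allocating at least the hardware minimum); this is a model of the discussed behaviour, not code from any driver:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of REQBUFS count handling as discussed above: a request for
 * fewer buffers than the hardware minimum is clamped up, and
 * REQBUFS(0) frees the queue. The min-buffers controls exist so the
 * client can read the (per Hans's proposal, *recommended*) minimum up
 * front, add its own pipeline headroom, and issue a single REQBUFS
 * instead of an allocate/read-back/free/re-allocate cycle.
 */
static uint32_t reqbufs_allocated(uint32_t requested, uint32_t hw_min)
{
	if (requested == 0)
		return 0;	/* REQBUFS(0) releases all buffers */
	return requested >= hw_min ? requested : hw_min;
}
```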
2018-08-07 9:13 GMT+02:00 Hans Verkuil <hverkuil@xs4all.nl>: > On 07/26/2018 12:20 PM, Tomasz Figa wrote: >> Hi Hans, >> >> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: >>>> + >>>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. >>>> + >>>> +Decoding >>>> +======== >>>> + >>>> +This state is reached after a successful initialization sequence. In this >>>> +state, client queues and dequeues buffers to both queues via >>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard >>>> +semantics. >>>> + >>>> +Both queues operate independently, following standard behavior of V4L2 >>>> +buffer queues and memory-to-memory devices. In addition, the order of >>>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of >>>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected >>>> +coded format, e.g. frame reordering. The client must not assume any direct >>>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than >>>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field. >>> >>> Is there a relationship between capture and output buffers w.r.t. the timestamp >>> field? I am not aware that there is one. >> >> I believe the decoder was expected to copy the timestamp of matching >> OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem >> to be implementing it this way. I guess it might be a good idea to >> specify this more explicitly. > > What about an output buffer producing multiple capture buffers? Or the case > where the encoded bitstream of a frame starts at one output buffer and ends > at another? What happens if you have B frames and the order of the capture > buffers is different from the output buffers? > > In other words, for codecs there is no clear 1-to-1 relationship between an > output buffer and a capture buffer. And we never defined what the 'copy timestamp' > behavior should be in that case or if it even makes sense. 
> > Regards, > > Hans As it is done right now in userspace (FFmpeg, GStreamer) and most (if not all?) drivers, it's a 1:1 mapping between OUTPUT and CAPTURE buffers. The only thing that changes is the ordering, since OUTPUT buffers are in decoding order while CAPTURE buffers are in presentation order. This almost always implies some timestamping kung-fu to match the OUTPUT timestamps with the corresponding CAPTURE timestamps. It's often done indirectly by the firmware on some platforms (rpi comes to mind iirc). The current construction also implies one video packet per OUTPUT buffer. If a video packet is too big to fit in a buffer, FFmpeg will truncate that packet to the maximum buffer size and will discard the remaining packet data. GStreamer will abort the decoding. This is unfortunately one of the shortcomings of having fixed-size buffers. And if they were to split the packet into multiple buffers, then some drivers in their current state wouldn't be able to handle the timestamping issues and/or x:1 OUTPUT:CAPTURE buffer numbers. Maxime
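The timestamp matching described here can be sketched as a small pending-frame table on the client side. This assumes the 1:1 copy-timestamp behaviour implemented by s5p-mfc and coda; the caveats Hans raises (1:N mappings, frames split across buffers) are exactly where this scheme breaks down:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Minimal sketch of the "timestamping kung-fu": the client tags each
 * OUTPUT (bitstream) buffer with a unique timestamp and remembers
 * per-frame metadata keyed by it; when a CAPTURE buffer comes back
 * with the copied timestamp -- possibly in a different, presentation
 * order -- the metadata is matched up again.
 */
#define PENDING_MAX 32

struct pending_frame {
	uint64_t ts_ns;		/* v4l2_buffer timestamp, as nanoseconds */
	void *user_data;	/* e.g. container PTS/DTS, stream offset */
	int used;
};

static struct pending_frame pending[PENDING_MAX];

/* Called when queuing an OUTPUT buffer; returns -1 if the table is full. */
static int remember_output_ts(uint64_t ts_ns, void *user_data)
{
	for (size_t i = 0; i < PENDING_MAX; i++) {
		if (!pending[i].used) {
			pending[i] = (struct pending_frame){ ts_ns, user_data, 1 };
			return 0;
		}
	}
	return -1;	/* more frames in flight than expected */
}

/* Called when dequeuing a CAPTURE buffer; NULL means no match (see caveats). */
static void *match_capture_ts(uint64_t ts_ns)
{
	for (size_t i = 0; i < PENDING_MAX; i++) {
		if (pending[i].used && pending[i].ts_ns == ts_ns) {
			pending[i].used = 0;
			return pending[i].user_data;
		}
	}
	return NULL;
}
```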
Hi Maxime, On Tue, Aug 7, 2018 at 5:32 AM Maxime Jourdan <maxi.jourdan@wanadoo.fr> wrote: > > Hi Tomasz, > > Sorry for sending this email only to you, I subscribed to linux-media > after you posted this and I'm not sure how to respond to everybody. > No worries. Let me reply with other recipients added back. Thanks for your comments. > I'm currently developing a V4L2 M2M decoder driver for Amlogic SoCs so > my comments are somewhat biased towards it > (https://github.com/Elyotna/linux) > > > +Seek > > +==== > > + > > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of > > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected. > > + > > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via > > + :c:func:`VIDIOC_STREAMOFF`. > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + * The driver must drop all the pending ``OUTPUT`` buffers and they are > > + treated as returned to the client (following standard semantics). > > + > > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON` > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + * The driver must be put in a state after seek and be ready to > > + accept new source bitstream buffers. > > + > > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after > > + the seek until a suitable resume point is found. > > + > > + .. note:: > > + > > + There is no requirement to begin queuing stream starting exactly from > > + a resume point (e.g. SPS or a keyframe). The driver must handle any > > + data queued and must keep processing the queued buffers until it > > + finds a suitable resume point. While looking for a resume point, the > > + driver processes ``OUTPUT`` buffers and returns them to the client > > + without producing any decoded frames. 
> > + > > + For hardware known to be mishandling seeks to a non-resume point, > > + e.g. by returning corrupted decoded frames, the driver must be able > > + to handle such seeks without a crash or any fatal decode error. > > This is unfortunately my case, apart from parsing the bitstream > manually - which is a no-no -, there is no way to know when I'll be > writing in an IDR frame to the HW bitstream parser. I think it would > be much preferable that the client starts sending in an IDR frame for > sure. Most of the hardware, which have upstream drivers, deal with this correctly and there is existing user space that relies on this, so we cannot simply add such requirement. However, when sending your driver upstream, feel free to include a patch that adds a read-only control that tells the user space that it needs to do seeks to resume points. Obviously this will work only with user space aware of this requirement, but I don't think we can do anything better here. > > > +4. After a resume point is found, the driver will start returning > > + ``CAPTURE`` buffers with decoded frames. > > + > > + * There is no precise specification for ``CAPTURE`` queue of when it > > + will start producing buffers containing decoded data from buffers > > + queued after the seek, as it operates independently > > + from ``OUTPUT`` queue. > > + > > + * The driver is allowed to and may return a number of remaining > > + ``CAPTURE`` buffers containing decoded frames from before the seek > > + after the seek sequence (STREAMOFF-STREAMON) is performed. > > + > > + * The driver is also allowed to and may not return all decoded frames > > + queued but not decode before the seek sequence was initiated. For > > + example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B), > > + STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the > > + following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’, > > + H’}, {A’, G’, H’}, {G’, H’}. > > + > > + .. 
note:: > > + > > + To achieve instantaneous seek, the client may restart streaming on > > + ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers. > > Overall, I think Drain followed by V4L2_DEC_CMD_START is a more > applicable scenario for seeking. > Heck, simply starting to queue buffers at the seek - starting with an > IDR - without doing any kind of streamon/off or cmd_start(stop) will > do the trick. Why do you think so? For a seek, as expected by a typical device user, the result should be discarding anything already queued and just start decoding new frames as soon as possible. Actually, this section doesn't describe any specific sequence, just possible ways to do a seek using existing primitives. Best regards, Tomasz
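For hardware like Maxime's, where userspace must start the post-seek stream at a resume point itself, a client can scan for one before queuing. A sketch for H.264 Annex-B, where nal_unit_type sits in the low 5 bits of the byte following the start code (5 = IDR slice, 7 = SPS); the surrounding STREAMOFF(OUTPUT)/STREAMON(OUTPUT) sequence is as described in the quoted text:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if this H.264 NAL header byte begins a reasonable resume point
 * for a seek: an IDR slice (type 5) or an SPS (type 7) that precedes
 * one. Only needed for drivers/hardware that cannot themselves skip to
 * a resume point, as discussed above.
 */
static bool h264_nal_is_resume_point(uint8_t nal_header)
{
	uint8_t type = nal_header & 0x1f;	/* nal_unit_type */

	return type == 5 || type == 7;
}
```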
On Tue, Aug 7, 2018 at 4:37 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > > On 08/07/2018 09:05 AM, Tomasz Figa wrote: > > On Thu, Jul 26, 2018 at 7:57 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > >>>> What if you set the format to 0x0 but the stream does not have meta data with > >>>> the resolution? How does userspace know if 0x0 is allowed or not? If this is > >>>> specific to the chosen coded pixel format, should be add a new flag for those > >>>> formats indicating that the coded data contains resolution information? > >>> > >>> Yes, this would definitely be on a per-format basis. Not sure what you > >>> mean by a flag, though? E.g. if the format is set to H264, then it's > >>> bound to include resolution information. If the format doesn't include > >>> it, then userspace is already aware of this fact, because it needs to > >>> get this from some other source (e.g. container). > >>> > >>>> > >>>> That way userspace knows if 0x0 can be used, and the driver can reject 0x0 > >>>> for formats that do not support it. > >>> > >>> As above, but I might be misunderstanding your suggestion. > >> > >> So my question is: is this tied to the pixel format, or should we make it > >> explicit with a flag like V4L2_FMT_FLAG_CAN_DECODE_WXH. > >> > >> The advantage of a flag is that you don't need a switch on the format to > >> know whether or not 0x0 is allowed. And the flag can just be set in > >> v4l2-ioctls.c. > > > > As far as my understanding goes, what data is included in the stream > > is definitely specified by format. For example, a H264 elementary > > stream will always include those data as a part of SPS. > > > > However, having such flag internally, not exposed to userspace, could > > indeed be useful to avoid all drivers have such switch. That wouldn't > > belong to this documentation, though, since it would be just kernel > > API. > > Why would you keep this internally only? 
> Well, either keep it internal or make it read-only for the user space, since the behavior is already defined by selected pixel format. > >>>> I wonder if we should make these min buffer controls required. It might be easier > >>>> that way. > >>> > >>> Agreed. Although userspace is still free to ignore it, because REQBUFS > >>> would do the right thing anyway. > >> > >> It's never been entirely clear to me what the purpose of those min buffers controls > >> is. REQBUFS ensures that the number of buffers is at least the minimum needed to > >> make the HW work. So why would you need these controls? It only makes sense if they > >> return something different from REQBUFS. > >> > > > > The purpose of those controls is to let the client allocate a number > > of buffers bigger than minimum, without the need to allocate the > > minimum number of buffers first (to just learn the number), free them > > and then allocate a bigger number again. > > I don't feel this is particularly useful. One problem with the minimum number > of buffers as used in the kernel is that it is often the minimum number of > buffers required to make the hardware work, but it may not be optimal. E.g. > quite a few capture drivers set the minimum to 2, which is enough for the > hardware, but it will likely lead to dropped frames. You really need 3 > (one is being DMAed, one is queued and linked into the DMA engine and one is > being processed by userspace). > > I would actually prefer this to be the recommended minimum number of buffers, > which is >= the minimum REQBUFS uses. > > I.e., if you use this number and you have no special requirements, then you'll > get good performance. I guess we could make it so. It would make existing user space request more buffers than it used to with the original meaning, but I guess it shouldn't be a big problem. > > > > >>> > >>>> > >>>>> +7. 
If all the following conditions are met, the client may resume the > >>>>> + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with > >>>>> + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain > >>>>> + sequence: > >>>>> + > >>>>> + * ``sizeimage`` of new format is less than or equal to the size of > >>>>> + currently allocated buffers, > >>>>> + > >>>>> + * the number of buffers currently allocated is greater than or equal to > >>>>> + the minimum number of buffers acquired in step 6. > >>>> > >>>> You might want to mention that if there are insufficient buffers, then > >>>> VIDIOC_CREATE_BUFS can be used to add more buffers. > >>>> > >>> > >>> This might be a bit tricky, since at least s5p-mfc and coda can only > >>> work on a fixed buffer set and one would need to fully reinitialize > >>> the decoding to add one more buffer, which would effectively be the > >>> full resolution change sequence, as below, just with REQBUFS(0), > >>> REQBUFS(N) replaced with CREATE_BUFS. > >> > >> What happens today in those drivers if you try to call CREATE_BUFS? > > > > s5p-mfc doesn't set the .vidioc_create_bufs pointer in its > > v4l2_ioctl_ops, so I suppose that would be -ENOTTY? > > Correct for s5p-mfc. As Philipp clarified, coda supports adding buffers on the fly. I briefly looked at venus and mtk-vcodec and they seem to use m2m implementation of CREATE_BUFS. Not sure if anyone tested that, though. So the only hardware I know for sure cannot support this is s5p-mfc. Best regards, Tomasz
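The two conditions from step 7 quoted above reduce to a simple client-side check before choosing between V4L2_DEC_CMD_START and the full CAPTURE reallocation sequence; a sketch (helper name illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Instant resume after a resolution change is possible when the new
 * format still fits in the already-allocated CAPTURE buffers and
 * enough of them are allocated; otherwise the buffers must be
 * reallocated (via REQBUFS, or CREATE_BUFS where supported).
 */
static bool can_resume_instantly(uint32_t new_sizeimage,
				 uint32_t allocated_buffer_size,
				 uint32_t allocated_count,
				 uint32_t min_needed)
{
	return new_sizeimage <= allocated_buffer_size &&
	       allocated_count >= min_needed;
}
```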
On Wed, Aug 8, 2018 at 4:11 AM Maxime Jourdan <maxi.jourdan@wanadoo.fr> wrote: > > 2018-08-07 9:13 GMT+02:00 Hans Verkuil <hverkuil@xs4all.nl>: > > On 07/26/2018 12:20 PM, Tomasz Figa wrote: > >> Hi Hans, > >> > >> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > >>>> + > >>>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. > >>>> + > >>>> +Decoding > >>>> +======== > >>>> + > >>>> +This state is reached after a successful initialization sequence. In this > >>>> +state, client queues and dequeues buffers to both queues via > >>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard > >>>> +semantics. > >>>> + > >>>> +Both queues operate independently, following standard behavior of V4L2 > >>>> +buffer queues and memory-to-memory devices. In addition, the order of > >>>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of > >>>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected > >>>> +coded format, e.g. frame reordering. The client must not assume any direct > >>>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than > >>>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field. > >>> > >>> Is there a relationship between capture and output buffers w.r.t. the timestamp > >>> field? I am not aware that there is one. > >> > >> I believe the decoder was expected to copy the timestamp of matching > >> OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem > >> to be implementing it this way. I guess it might be a good idea to > >> specify this more explicitly. > > > > What about an output buffer producing multiple capture buffers? Or the case > > where the encoded bitstream of a frame starts at one output buffer and ends > > at another? What happens if you have B frames and the order of the capture > > buffers is different from the output buffers? 
> > > > In other words, for codecs there is no clear 1-to-1 relationship between an > > output buffer and a capture buffer. And we never defined what the 'copy timestamp' > > behavior should be in that case or if it even makes sense. > > > > Regards, > > > > Hans > > As it is done right now in userspace (FFmpeg, GStreamer) and most (if > not all?) drivers, it's a 1:1 between OUTPUT and CAPTURE. The only > thing that changes is the ordering since OUTPUT buffers are in > decoding order while CAPTURE buffers are in presentation order. If I understood it correctly, there is a feature in VP9 that lets one frame repeat several times, which would make one OUTPUT buffer produce multiple CAPTURE buffers. Moreover, V4L2_PIX_FMT_H264 is actually defined to be a byte stream, without any need for framing, and yes, there are drivers that follow this definition correctly (s5p-mfc and, AFAIR, coda). In that case, one OUTPUT buffer can contain an arbitrary amount of bitstream and lead to multiple CAPTURE frames being produced. > > This almost always implies some timestamping kung-fu to match the > OUTPUT timestamps with the corresponding CAPTURE timestamps. It's > often done indirectly by the firmware on some platforms (rpi comes to > mind iirc). I don't think there is an upstream driver for it, is there? (If not, are you aware of any work towards it?) > > The current constructions also imply one video packet per OUTPUT > buffer. If a video packet is too big to fit in a buffer, FFmpeg will > crop that packet to the maximum buffer size and will discard the > remaining packet data. GStreamer will abort the decoding. This is > unfortunately one of the shortcomings of having fixed-size buffers. > And if they were to split the packet in multiple buffers, then some > drivers in their current state wouldn't be able to handle the > timestamping issues and/or x:1 OUTPUT:CAPTURE buffer numbers.
In Chromium, we just allocate OUTPUT buffers big enough that it is really unlikely for a single frame not to fit inside [1]. Obviously it's a waste of memory for formats which normally have just a single frame per buffer, but it seems to work in practice. [1] https://cs.chromium.org/chromium/src/media/gpu/v4l2/v4l2_video_decode_accelerator.h?rcl=3468d5a59e00bcb2c2e946a30694e6057fd9ab21&l=118 Best regards, Tomasz
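The framing point above - that V4L2_PIX_FMT_H264 is a byte stream with no framing, so one OUTPUT buffer may yield several CAPTURE frames - can be sketched with a toy Annex-B splitter. This deliberately ignores 4-byte start codes and emulation-prevention bytes that a real parser must handle:

```python
# Toy Annex-B NAL splitter; illustrates why the OUTPUT:CAPTURE ratio need
# not be 1:1 for byte-stream pixel formats. Not a real H.264 parser.
START = b"\x00\x00\x01"  # 3-byte Annex-B start code (toy: 4-byte form ignored)

def split_nals(buf):
    """Split one OUTPUT buffer's payload into NAL units on start codes."""
    parts = buf.split(START)
    return [p for p in parts if p]  # drop the empty chunk before the first SC

# Hypothetical payload: three NAL units packed into a single OUTPUT buffer.
one_output_buffer = START + b"\x65AA" + START + b"\x41BB" + START + b"\x41CC"
nals = split_nals(one_output_buffer)
# A single queued OUTPUT buffer decomposes into 3 NAL units, so it can
# produce multiple CAPTURE frames - all sharing that buffer's timestamp.
```

This is why drivers that implement the byte-stream definition correctly (s5p-mfc, coda per the discussion above) cannot assume one frame per OUTPUT buffer.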
On Tue, Aug 7, 2018 at 4:13 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > > On 07/26/2018 12:20 PM, Tomasz Figa wrote: > > Hi Hans, > > > > On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > >>> + > >>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. > >>> + > >>> +Decoding > >>> +======== > >>> + > >>> +This state is reached after a successful initialization sequence. In this > >>> +state, client queues and dequeues buffers to both queues via > >>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard > >>> +semantics. > >>> + > >>> +Both queues operate independently, following standard behavior of V4L2 > >>> +buffer queues and memory-to-memory devices. In addition, the order of > >>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of > >>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected > >>> +coded format, e.g. frame reordering. The client must not assume any direct > >>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than > >>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field. > >> > >> Is there a relationship between capture and output buffers w.r.t. the timestamp > >> field? I am not aware that there is one. > > > > I believe the decoder was expected to copy the timestamp of matching > > OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem > > to be implementing it this way. I guess it might be a good idea to > > specify this more explicitly. > > What about an output buffer producing multiple capture buffers? Or the case > where the encoded bitstream of a frame starts at one output buffer and ends > at another? What happens if you have B frames and the order of the capture > buffers is different from the output buffers? > > In other words, for codecs there is no clear 1-to-1 relationship between an > output buffer and a capture buffer. 
And we never defined what the 'copy timestamp' > behavior should be in that case or if it even makes sense. You're perfectly right. There is no 1:1 relationship, but it doesn't prevent copying timestamps. It just makes it possible for multiple CAPTURE buffers to have the same timestamp or some OUTPUT timestamps not to be found in any CAPTURE buffer. Best regards, Tomasz
On 08/08/2018 05:11 AM, Tomasz Figa wrote: > On Tue, Aug 7, 2018 at 4:13 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: >> >> On 07/26/2018 12:20 PM, Tomasz Figa wrote: >>> Hi Hans, >>> >>> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: >>>>> + >>>>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. >>>>> + >>>>> +Decoding >>>>> +======== >>>>> + >>>>> +This state is reached after a successful initialization sequence. In this >>>>> +state, client queues and dequeues buffers to both queues via >>>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard >>>>> +semantics. >>>>> + >>>>> +Both queues operate independently, following standard behavior of V4L2 >>>>> +buffer queues and memory-to-memory devices. In addition, the order of >>>>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of >>>>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected >>>>> +coded format, e.g. frame reordering. The client must not assume any direct >>>>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than >>>>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field. >>>> >>>> Is there a relationship between capture and output buffers w.r.t. the timestamp >>>> field? I am not aware that there is one. >>> >>> I believe the decoder was expected to copy the timestamp of matching >>> OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem >>> to be implementing it this way. I guess it might be a good idea to >>> specify this more explicitly. >> >> What about an output buffer producing multiple capture buffers? Or the case >> where the encoded bitstream of a frame starts at one output buffer and ends >> at another? What happens if you have B frames and the order of the capture >> buffers is different from the output buffers? >> >> In other words, for codecs there is no clear 1-to-1 relationship between an >> output buffer and a capture buffer. 
And we never defined what the 'copy timestamp' >> behavior should be in that case or if it even makes sense. > > You're perfectly right. There is no 1:1 relationship, but it doesn't > prevent copying timestamps. It just makes it possible for multiple > CAPTURE buffers to have the same timestamp or some OUTPUT timestamps > not to be found in any CAPTURE buffer. We need to document the behavior. Basically there are three different corner cases that need documenting: 1) one OUTPUT buffer generates multiple CAPTURE buffers 2) multiple OUTPUT buffers generate one CAPTURE buffer 3) the decoding order differs from the presentation order (i.e. the CAPTURE buffers are out-of-order compared to the OUTPUT buffers). For 1) I assume that we just copy the same OUTPUT timestamp to multiple CAPTURE buffers. For 2) we need to specify if the CAPTURE timestamp is copied from the first or last OUTPUT buffer used in creating the capture buffer. Using the last OUTPUT buffer makes more sense to me. And 3) implies that timestamps can be out-of-order. This needs to be very carefully documented since it is very unexpected. This should probably be a separate patch, adding text to the v4l2_buffer documentation (esp. the V4L2_BUF_FLAG_TIMESTAMP_COPY documentation). Regards, Hans
Hi Hans, On 08/08/18 07:43, Hans Verkuil wrote: > On 08/08/2018 05:11 AM, Tomasz Figa wrote: >> On Tue, Aug 7, 2018 at 4:13 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: >>> >>> On 07/26/2018 12:20 PM, Tomasz Figa wrote: >>>> Hi Hans, >>>> >>>> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: >>>>>> + >>>>>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. >>>>>> + >>>>>> +Decoding >>>>>> +======== >>>>>> + >>>>>> +This state is reached after a successful initialization sequence. In this >>>>>> +state, client queues and dequeues buffers to both queues via >>>>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard >>>>>> +semantics. >>>>>> + >>>>>> +Both queues operate independently, following standard behavior of V4L2 >>>>>> +buffer queues and memory-to-memory devices. In addition, the order of >>>>>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of >>>>>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected >>>>>> +coded format, e.g. frame reordering. The client must not assume any direct >>>>>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than >>>>>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field. >>>>> >>>>> Is there a relationship between capture and output buffers w.r.t. the timestamp >>>>> field? I am not aware that there is one. >>>> >>>> I believe the decoder was expected to copy the timestamp of matching >>>> OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem >>>> to be implementing it this way. I guess it might be a good idea to >>>> specify this more explicitly. >>> >>> What about an output buffer producing multiple capture buffers? Or the case >>> where the encoded bitstream of a frame starts at one output buffer and ends >>> at another? What happens if you have B frames and the order of the capture >>> buffers is different from the output buffers? 
>>> >>> In other words, for codecs there is no clear 1-to-1 relationship between an >>> output buffer and a capture buffer. And we never defined what the 'copy timestamp' >>> behavior should be in that case or if it even makes sense. >> >> You're perfectly right. There is no 1:1 relationship, but it doesn't >> prevent copying timestamps. It just makes it possible for multiple >> CAPTURE buffers to have the same timestamp or some OUTPUT timestamps >> not to be found in any CAPTURE buffer. > > We need to document the behavior. Basically there are three different > corner cases that need documenting: > > 1) one OUTPUT buffer generates multiple CAPTURE buffers > 2) multiple OUTPUT buffers generate one CAPTURE buffer > 3) the decoding order differs from the presentation order (i.e. the > CAPTURE buffers are out-of-order compared to the OUTPUT buffers). > > For 1) I assume that we just copy the same OUTPUT timestamp to multiple > CAPTURE buffers. I'm not sure how this interface would handle something like a temporal scalability layer, but conceivably this assumption might be invalid in that case. Regards, Ian. > > For 2) we need to specify if the CAPTURE timestamp is copied from the first > or last OUTPUT buffer used in creating the capture buffer. Using the last > OUTPUT buffer makes more sense to me. > > And 3) implies that timestamps can be out-of-order. This needs to be > very carefully documented since it is very unexpected. > > This should probably be a separate patch, adding text to the v4l2_buffer > documentation (esp. the V4L2_BUF_FLAG_TIMESTAMP_COPY documentation). > > Regards, > > Hans >
2018-08-08 5:07 GMT+02:00 Tomasz Figa <tfiga@chromium.org>: > On Wed, Aug 8, 2018 at 4:11 AM Maxime Jourdan <maxi.jourdan@wanadoo.fr> wrote: >> >> 2018-08-07 9:13 GMT+02:00 Hans Verkuil <hverkuil@xs4all.nl>: >> > On 07/26/2018 12:20 PM, Tomasz Figa wrote: >> >> Hi Hans, >> >> >> >> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: >> >>>> + >> >>>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. >> >>>> + >> >>>> +Decoding >> >>>> +======== >> >>>> + >> >>>> +This state is reached after a successful initialization sequence. In this >> >>>> +state, client queues and dequeues buffers to both queues via >> >>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard >> >>>> +semantics. >> >>>> + >> >>>> +Both queues operate independently, following standard behavior of V4L2 >> >>>> +buffer queues and memory-to-memory devices. In addition, the order of >> >>>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of >> >>>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected >> >>>> +coded format, e.g. frame reordering. The client must not assume any direct >> >>>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than >> >>>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field. >> >>> >> >>> Is there a relationship between capture and output buffers w.r.t. the timestamp >> >>> field? I am not aware that there is one. >> >> >> >> I believe the decoder was expected to copy the timestamp of matching >> >> OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem >> >> to be implementing it this way. I guess it might be a good idea to >> >> specify this more explicitly. >> > >> > What about an output buffer producing multiple capture buffers? Or the case >> > where the encoded bitstream of a frame starts at one output buffer and ends >> > at another? 
What happens if you have B frames and the order of the capture >> > buffers is different from the output buffers? >> > >> > In other words, for codecs there is no clear 1-to-1 relationship between an >> > output buffer and a capture buffer. And we never defined what the 'copy timestamp' >> > behavior should be in that case or if it even makes sense. >> > >> > Regards, >> > >> > Hans >> >> As it is done right now in userspace (FFmpeg, GStreamer) and most (if >> not all?) drivers, it's a 1:1 between OUTPUT and CAPTURE. The only >> thing that changes is the ordering since OUTPUT buffers are in >> decoding order while CAPTURE buffers are in presentation order. > > If I understood it correctly, there is a feature in VP9 that lets one > frame repeat several times, which would make one OUTPUT buffer produce > multiple CAPTURE buffers. > > Moreover, V4L2_PIX_FMT_H264 is actually defined to be a byte stream, > without any need for framing, and yes, there are drivers that follow > this definition correctly (s5p-mfc and, AFAIR, coda). In that case, > one OUTPUT buffer can have arbitrary amount of bitstream and lead to > multiple CAPTURE frames being produced. I can see from the code and your answer to Hans that in such a case, all CAPTURE buffers will share the single OUTPUT timestamp. Does this mean that at the end of the day, userspace disregards the CAPTURE timestamps since you have the display order guarantee? If so, how do you reconstruct the proper PTS on such buffers? Do you have them saved from prior demuxing? >> >> This almost always implies some timestamping kung-fu to match the >> OUTPUT timestamps with the corresponding CAPTURE timestamps. It's >> often done indirectly by the firmware on some platforms (rpi comes to >> mind iirc). > > I don't think there is an upstream driver for it, is there? (If not, > are you aware of any work towards it?)
You're right, it's not upstream but it is in a relatively good shape at https://github.com/6by9/linux/commits/rpi-4.14.y-v4l2-codec >> >> The current constructions also imply one video packet per OUTPUT >> buffer. If a video packet is too big to fit in a buffer, FFmpeg will >> crop that packet to the maximum buffer size and will discard the >> remaining packet data. GStreamer will abort the decoding. This is >> unfortunately one of the shortcomings of having fixed-size buffers. >> And if they were to split the packet in multiple buffers, then some >> drivers in their current state wouldn't be able to handle the >> timestamping issues and/or x:1 OUTPUT:CAPTURE buffer numbers. > > In Chromium, we just allocate OUTPUT buffers big enough to be really > unlikely for a single frame not to fit inside [1]. Obviously it's a > waste of memory, for formats which normally have just single frames > inside buffers, but it seems to work in practice. > > [1] https://cs.chromium.org/chromium/src/media/gpu/v4l2/v4l2_video_decode_accelerator.h?rcl=3468d5a59e00bcb2c2e946a30694e6057fd9ab21&l=118 Right. As long as you don't need many OUTPUT buffers it's not that big a deal. [snip] >> > + For hardware known to be mishandling seeks to a non-resume point, >> > + e.g. by returning corrupted decoded frames, the driver must be able >> > + to handle such seeks without a crash or any fatal decode error. >> >> This is unfortunately my case, apart from parsing the bitstream >> manually - which is a no-no -, there is no way to know when I'll be >> writing in an IDR frame to the HW bitstream parser. I think it would >> be much preferable that the client starts sending in an IDR frame for >> sure. > > Most of the hardware, which have upstream drivers, deal with this > correctly and there is existing user space that relies on this, so we > cannot simply add such requirement. 
However, when sending your driver > upstream, feel free to include a patch that adds a read-only control > that tells the user space that it needs to do seeks to resume points. > Obviously this will work only with user space aware of this > requirement, but I don't think we can do anything better here. > Makes sense >> > + To achieve instantaneous seek, the client may restart streaming on >> > + ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers. >> >> Overall, I think Drain followed by V4L2_DEC_CMD_START is a more >> applicable scenario for seeking. >> Heck, simply starting to queue buffers at the seek - starting with an >> IDR - without doing any kind of streamon/off or cmd_start(stop) will >> do the trick. > > Why do you think so? > > For a seek, as expected by a typical device user, the result should be > discarding anything already queued and just start decoding new frames > as soon as possible. > > Actually, this section doesn't describe any specific sequence, just > possible ways to do a seek using existing primitives. Fair enough Regards, Maxime
On Tue, 2018-07-24 at 23:06 +0900, Tomasz Figa wrote: [...] > +Seek > +==== > + > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected. > + > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via > + :c:func:`VIDIOC_STREAMOFF`. > + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > + > + * The driver must drop all the pending ``OUTPUT`` buffers and they are > + treated as returned to the client (following standard semantics). > + > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON` > + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > + > + * The driver must be put in a state after seek and be ready to > + accept new source bitstream buffers. > + > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after > + the seek until a suitable resume point is found. > + > + .. note:: > + > + There is no requirement to begin queuing stream starting exactly from > + a resume point (e.g. SPS or a keyframe). The driver must handle any > + data queued and must keep processing the queued buffers until it > + finds a suitable resume point. While looking for a resume point, the I think the definition of a resume point is too vague in this place. Can the driver decide whether or not a keyframe without SPS is a suitable resume point? Or do drivers have to parse and store SPS/PPS if the hardware does not support resuming from a keyframe without sending SPS/PPS again? regards Philipp
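The seek sequence quoted above can be modeled as a small state machine over the OUTPUT queue: STREAMOFF returns all pending buffers to the client, STREAMON re-arms the queue, and the driver then consumes queued data until it finds a resume point. This is an illustrative sketch only; whether a driver discards pre-resume-point data or decodes it with corruption is exactly the open question in this thread, and the model assumes the discarding behavior with a bare keyframe as the resume point:

```python
# Toy state machine for the OUTPUT-queue seek sequence; not real V4L2 code.
class OutputQueue:
    def __init__(self):
        self.pending = []       # buffers queued but not yet consumed
        self.streaming = False
        self.synced = True      # decoder currently has a valid resume point

    def streamoff(self):
        """Step 1: drop all pending OUTPUT buffers; they are treated as
        returned to the client (standard semantics)."""
        dropped, self.pending = self.pending, []
        self.streaming = False
        self.synced = False     # stream will restart at an arbitrary position
        return dropped

    def streamon(self):
        """Step 2: ready to accept new source bitstream buffers."""
        self.streaming = True

    def qbuf(self, chunk):
        """Step 3: chunk is ("key", ts) or ("delta", ts). Data before a
        resume point is consumed while the driver hunts for sync."""
        assert self.streaming
        if not self.synced and chunk[0] != "key":
            return "discarded"  # still looking for a suitable resume point
        self.synced = True
        self.pending.append(chunk)
        return "decoding"

q = OutputQueue()
q.streamon()
q.qbuf(("key", 0)); q.qbuf(("delta", 1))
dropped = q.streamoff()   # seek: pending buffers handed back to the client
q.streamon()
results = [q.qbuf(c) for c in [("delta", 7), ("delta", 8), ("key", 9)]]
```

Philipp's question still applies to this sketch: for H.264, whether `"key"` alone is a sufficient resume point, or whether SPS/PPS must precede it, is hardware-dependent and unspecified by the draft text.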
On Mon, Aug 20, 2018 at 10:04 PM Philipp Zabel <p.zabel@pengutronix.de> wrote: > > On Tue, 2018-07-24 at 23:06 +0900, Tomasz Figa wrote: > [...] > > +Seek > > +==== > > + > > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of > > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected. > > + > > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via > > + :c:func:`VIDIOC_STREAMOFF`. > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + * The driver must drop all the pending ``OUTPUT`` buffers and they are > > + treated as returned to the client (following standard semantics). > > + > > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON` > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + * The driver must be put in a state after seek and be ready to > > + accept new source bitstream buffers. > > + > > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after > > + the seek until a suitable resume point is found. > > + > > + .. note:: > > + > > + There is no requirement to begin queuing stream starting exactly from > > + a resume point (e.g. SPS or a keyframe). The driver must handle any > > + data queued and must keep processing the queued buffers until it > > + finds a suitable resume point. While looking for a resume point, the > > I think the definition of a resume point is too vague in this place. > Can the driver decide whether or not a keyframe without SPS is a > suitable resume point? Or do drivers have to parse and store SPS/PPS if > the hardware does not support resuming from a keyframe without sending > SPS/PPS again? The thing is that existing drivers implement and user space clients rely on the behavior described above, so we cannot really change it anymore. 
Do we have hardware for which this wouldn't work, to the point that the driver couldn't even continue with a bunch of corrupted frames? If frame corruption is the only problem, we can add a control telling the user space to seek to resume points, and that can happen in an incremental patch. Best regards, Tomasz
Hi Tomasz, On Mon, 2018-08-20 at 22:12 +0900, Tomasz Figa wrote: > On Mon, Aug 20, 2018 at 10:04 PM Philipp Zabel <p.zabel@pengutronix.de> wrote: > > > > On Tue, 2018-07-24 at 23:06 +0900, Tomasz Figa wrote: > > [...] > > > +Seek > > > +==== > > > + > > > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of > > > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected. > > > + > > > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via > > > + :c:func:`VIDIOC_STREAMOFF`. > > > + > > > + * **Required fields:** > > > + > > > + ``type`` > > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > > + > > > + * The driver must drop all the pending ``OUTPUT`` buffers and they are > > > + treated as returned to the client (following standard semantics). > > > + > > > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON` > > > + > > > + * **Required fields:** > > > + > > > + ``type`` > > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > > + > > > + * The driver must be put in a state after seek and be ready to > > > + accept new source bitstream buffers. > > > + > > > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after > > > + the seek until a suitable resume point is found. > > > + > > > + .. note:: > > > + > > > + There is no requirement to begin queuing stream starting exactly from > > > + a resume point (e.g. SPS or a keyframe). The driver must handle any > > > + data queued and must keep processing the queued buffers until it > > > + finds a suitable resume point. While looking for a resume point, the > > > > I think the definition of a resume point is too vague in this place. > > Can the driver decide whether or not a keyframe without SPS is a > > suitable resume point? Or do drivers have to parse and store SPS/PPS if > > the hardware does not support resuming from a keyframe without sending > > SPS/PPS again? 
> > The thing is that existing drivers implement and user space clients > rely on the behavior described above, so we cannot really change it > anymore. My point is that I'm not exactly sure what that behaviour is, given the description. Must a driver be able to resume from a keyframe even if userspace never pushes SPS/PPS again? If so, I think it should be mentioned more explicitly than just via an example in parentheses, to make it clear to all driver developers that this is a requirement that userspace is going to rely on. Or, if that is not the case, is a driver free to define "SPS only" as its "suitable resume point" and to discard all input including keyframes until the next SPS/PPS is pushed? It would be better to clearly define what a "suitable resume point" has to be per codec, and not let the drivers decide for themselves, if at all possible. Otherwise we'd need a way to inform userspace about the per-driver definition. > Do we have hardware for which this wouldn't work to the point that the > driver couldn't even continue with a bunch of frames corrupted? If > only frame corruption is a problem, we can add a control to tell the > user space to seek to resume points and it can happen in an > incremental patch. The coda driver currently can't seek at all, it always stops and restarts the sequence. So depending on the above I might have to either find and store SPS/PPS in software, or figure out how to make the firmware flush the bitstream buffer and restart without actually stopping the sequence. I'm sure the hardware is capable of this, it's more a question of what behaviour is actually intended, and whether I have enough information about the firmware interface to implement it. regards Philipp
On Mon, Aug 20, 2018 at 11:13 PM Philipp Zabel <p.zabel@pengutronix.de> wrote: > > Hi Tomasz, > > On Mon, 2018-08-20 at 22:12 +0900, Tomasz Figa wrote: > > On Mon, Aug 20, 2018 at 10:04 PM Philipp Zabel <p.zabel@pengutronix.de> wrote: > > > > > > On Tue, 2018-07-24 at 23:06 +0900, Tomasz Figa wrote: > > > [...] > > > > +Seek > > > > +==== > > > > + > > > > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of > > > > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected. > > > > + > > > > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via > > > > + :c:func:`VIDIOC_STREAMOFF`. > > > > + > > > > + * **Required fields:** > > > > + > > > > + ``type`` > > > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > > > + > > > > + * The driver must drop all the pending ``OUTPUT`` buffers and they are > > > > + treated as returned to the client (following standard semantics). > > > > + > > > > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON` > > > > + > > > > + * **Required fields:** > > > > + > > > > + ``type`` > > > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > > > + > > > > + * The driver must be put in a state after seek and be ready to > > > > + accept new source bitstream buffers. > > > > + > > > > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after > > > > + the seek until a suitable resume point is found. > > > > + > > > > + .. note:: > > > > + > > > > + There is no requirement to begin queuing stream starting exactly from > > > > + a resume point (e.g. SPS or a keyframe). The driver must handle any > > > > + data queued and must keep processing the queued buffers until it > > > > + finds a suitable resume point. While looking for a resume point, the > > > > > > I think the definition of a resume point is too vague in this place. > > > Can the driver decide whether or not a keyframe without SPS is a > > > suitable resume point? 
Or do drivers have to parse and store SPS/PPS if > > > the hardware does not support resuming from a keyframe without sending > > > SPS/PPS again? > > > > The thing is that existing drivers implement and user space clients > > rely on the behavior described above, so we cannot really change it > > anymore. > > My point is that I'm not exactly sure what that behaviour is, given the > description. > > Must a driver be able to resume from a keyframe even if userspace never > pushes SPS/PPS again? > If so, I think it should be mentioned more explicitly than just via an > example in parentheses, to make it clear to all driver developers that > this is a requirement that userspace is going to rely on. > > Or, if that is not the case, is a driver free to define "SPS only" as > its "suitable resume point" and to discard all input including keyframes > until the next SPS/PPS is pushed? > > It would be better to clearly define what a "suitable resume point" has > to be per codec, and not let the drivers decide for themselves, if at > all possible. Otherwise we'd need a away to inform userspace about the > per-driver definition. The intention here is that there is no requirement at all for the user space to seek to any kind of resume point, so there is no point in defining one. The only requirement here is that the hardware/driver keeps processing the source stream until it finds a resume point suitable for it - if the hardware keeps SPS/PPS in its state, then just a keyframe; if it doesn't, then SPS/PPS. Note that this is documentation of the user space API, not a driver implementation guide. We may want to create the latter separately, though. H264 is a bit special here, because one may still seek to a keyframe, but past the relevant SPS/PPS headers. In this case, there is no way for the hardware to know that the SPS/PPS it has in its local state is not the one that applies to the frame.
It may be worth adding that such a case leads to undefined results, but must not cause a crash or a fatal decode error. What do you think? > > > Do we have hardware for which this wouldn't work to the point that the > > driver couldn't even continue with a bunch of frames corrupted? If > > only frame corruption is a problem, we can add a control to tell the > > user space to seek to resume points and it can happen in an > > incremental patch. > > The coda driver currently can't seek at all, it always stops and > restarts the sequence. So depending on the above I might have to either > find and store SPS/PPS in software, or figure out how to make the > firmware flush the bitstream buffer and restart without actually > stopping the sequence. > I'm sure the hardware is capable of this, it's more a question of what > behaviour is actually intended, and whether I have enough information > about the firmware interface to implement it. What happens if you just keep feeding it the next frames? If that would result only in corrupted frames, I suppose the control (say V4L2_CID_MPEG_VIDEO_NEEDS_SEEK_TO_RESUME_POINT) would solve the problem? Best regards, Tomasz
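The suggested control could be used by a client roughly as follows. Note that V4L2_CID_MPEG_VIDEO_NEEDS_SEEK_TO_RESUME_POINT is only a name proposed in this thread, not an existing control; the sketch assumes it would read as a boolean, and the index of IDR positions is hypothetical demuxer-side data:

```python
# Toy client-side seek policy driven by a hypothetical read-only control.
def choose_seek_target(target, idr_positions, needs_resume_point_seek):
    """Pick where to start feeding the decoder for a seek to `target`.
    If the (hypothetical) control says the driver cannot sync itself,
    back up to the nearest IDR at or before the target; otherwise queue
    from the requested position and let the driver find a resume point."""
    if not needs_resume_point_seek:
        return target
    # nearest IDR at or before the target; fall back to the first IDR
    return max((p for p in idr_positions if p <= target),
               default=idr_positions[0])

idrs = [0, 48, 96]  # hypothetical IDR frame positions known from demuxing
loose = choose_seek_target(60, idrs, needs_resume_point_seek=False)
strict = choose_seek_target(60, idrs, needs_resume_point_seek=True)
```

A driver that cannot survive non-resume-point seeks (as Maxime describes) would report the control as set; existing drivers that already tolerate blind seeks would report it cleared, keeping current user space working unchanged.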
On Mon, 2018-08-20 at 23:27 +0900, Tomasz Figa wrote: [...] > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after > > > > > + the seek until a suitable resume point is found. > > > > > + > > > > > + .. note:: > > > > > + > > > > > + There is no requirement to begin queuing stream starting exactly from > > > > > + a resume point (e.g. SPS or a keyframe). The driver must handle any > > > > > + data queued and must keep processing the queued buffers until it > > > > > + finds a suitable resume point. While looking for a resume point, the > > > > > > > > I think the definition of a resume point is too vague in this place. > > > > Can the driver decide whether or not a keyframe without SPS is a > > > > suitable resume point? Or do drivers have to parse and store SPS/PPS if > > > > the hardware does not support resuming from a keyframe without sending > > > > SPS/PPS again? > > > > > > The thing is that existing drivers implement and user space clients > > > rely on the behavior described above, so we cannot really change it > > > anymore. > > > > My point is that I'm not exactly sure what that behaviour is, given the > > description. > > > > Must a driver be able to resume from a keyframe even if userspace never > > pushes SPS/PPS again? > > If so, I think it should be mentioned more explicitly than just via an > > example in parentheses, to make it clear to all driver developers that > > this is a requirement that userspace is going to rely on. > > > > Or, if that is not the case, is a driver free to define "SPS only" as > > its "suitable resume point" and to discard all input including keyframes > > until the next SPS/PPS is pushed? > > > > It would be better to clearly define what a "suitable resume point" has > > to be per codec, and not let the drivers decide for themselves, if at > > all possible. Otherwise we'd need a away to inform userspace about the > > per-driver definition. 
> > The intention here is that there is exactly no requirement for the > user space to seek to any kind of resume point No question about this. > and so there is no point in defining such. I don't agree. Let me give an example: Assume userspace wants to play back a simple h.264 stream that has SPS/PPS exactly once, in the beginning. If drivers are allowed to resume from SPS/PPS only, and have no way to communicate this to userspace, userspace always has to assume that resuming from keyframes alone is not possible. So it has to store SPS/PPS and resubmit them with every seek, even if a specific driver wouldn't require it: Otherwise those drivers that don't store SPS/PPS themselves (or in hardware) would be allowed to just drop everything after the first seek. This effectively would make resending SPS/PPS mandatory, which doesn't fit well with the intention of letting userspace just seek anywhere and start feeding data (or: NAL units) into the driver blindly. > The only requirement here is that the > hardware/driver keeps processing the source stream until it finds a > resume point suitable for it - if the hardware keeps SPS/PPS in its > state then just a keyframe; if it doesn't then SPS/PPS. Yes, but the difference between those two might be very relevant to userspace behaviour. > Note that this is a documentation of the user space API, not a driver > implementation guide. We may want to create the latter separately, > though. This is a good point, I keep switching the perspective from which I look at this document. Even for userspace it would make sense to be as specific as possible, though. Otherwise, doesn't userspace always have to assume the worst? > H264 is a bit special here, because one may still seek to a key frame, > but past the relevant SPS/PPS headers. In this case, there is no way > for the hardware to know that the SPS/PPS it has in its local state is > not the one that applies to the frame. 
It may be worth adding that > such case leads to undefined results, but must not cause crash nor a > fatal decode error. > > What do you think? That sounds like a good idea. I haven't thought about seeking over a SPS/PPS change. Of course userspace must not expect correct results in this case without providing the new SPS/PPS. > > > Do we have hardware for which this wouldn't work to the point that the > > > driver couldn't even continue with a bunch of frames corrupted? If > > > only frame corruption is a problem, we can add a control to tell the > > > user space to seek to resume points and it can happen in an > > > incremental patch. > > > > The coda driver currently can't seek at all, it always stops and > > restarts the sequence. So depending on the above I might have to either > > find and store SPS/PPS in software, or figure out how to make the > > firmware flush the bitstream buffer and restart without actually > > stopping the sequence. > > I'm sure the hardware is capable of this, it's more a question of what > > behaviour is actually intended, and whether I have enough information > > about the firmware interface to implement it. > > What happens if you just keep feeding it with next frames? As long as they are well formed, it should just decode them, possibly with artifacts due to mismatched reference buffers. There is an I-Frame search mode that should be usable to skip to the next resume point, as well, so I'm sure coda will end up not needing the NEEDS_SEEK_TO_RESUME_POINT flag below. I'm just not certain at this point whether I'll be able to (or: whether I'll have to) keep the SPS/PPS state across seeks. I have seen so many decoder hangs with malformed input on i.MX53 that I couldn't recover from, that I'm wary to make any guarantees without flushing the bitstream buffer first. > If that would result only in corrupted frames, I suppose the control (say > V4L2_CID_MPEG_VIDEO_NEEDS_SEEK_TO_RESUME_POINT) would solve the > problem? 
For this to be useful, userspace needs to know what a resume point is in the first place, though. regards Philipp
Hi Tomasz, On 08/08/2018 05:55 AM, Tomasz Figa wrote: > On Tue, Aug 7, 2018 at 4:37 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: >>>>>>> +7. If all the following conditions are met, the client may resume the >>>>>>> + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with >>>>>>> + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain >>>>>>> + sequence: >>>>>>> + >>>>>>> + * ``sizeimage`` of new format is less than or equal to the size of >>>>>>> + currently allocated buffers, >>>>>>> + >>>>>>> + * the number of buffers currently allocated is greater than or equal to >>>>>>> + the minimum number of buffers acquired in step 6. >>>>>> >>>>>> You might want to mention that if there are insufficient buffers, then >>>>>> VIDIOC_CREATE_BUFS can be used to add more buffers. >>>>>> >>>>> >>>>> This might be a bit tricky, since at least s5p-mfc and coda can only >>>>> work on a fixed buffer set and one would need to fully reinitialize >>>>> the decoding to add one more buffer, which would effectively be the >>>>> full resolution change sequence, as below, just with REQBUFS(0), >>>>> REQBUFS(N) replaced with CREATE_BUFS. >>>> >>>> What happens today in those drivers if you try to call CREATE_BUFS? >>> >>> s5p-mfc doesn't set the .vidioc_create_bufs pointer in its >>> v4l2_ioctl_ops, so I suppose that would be -ENOTTY? >> >> Correct for s5p-mfc. > > As Philipp clarified, coda supports adding buffers on the fly. I > briefly looked at venus and mtk-vcodec and they seem to use m2m > implementation of CREATE_BUFS. Not sure if anyone tested that, though. > So the only hardware I know for sure cannot support this is s5p-mfc. In the Venus case, CREATE_BUFS is tested with GStreamer.
Hi Philipp, On Tue, Aug 21, 2018 at 12:34 AM Philipp Zabel <p.zabel@pengutronix.de> wrote: > > On Mon, 2018-08-20 at 23:27 +0900, Tomasz Figa wrote: > [...] > > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after > > > > > > + the seek until a suitable resume point is found. > > > > > > + > > > > > > + .. note:: > > > > > > + > > > > > > + There is no requirement to begin queuing stream starting exactly from > > > > > > + a resume point (e.g. SPS or a keyframe). The driver must handle any > > > > > > + data queued and must keep processing the queued buffers until it > > > > > > + finds a suitable resume point. While looking for a resume point, the > > > > > > > > > > I think the definition of a resume point is too vague in this place. > > > > > Can the driver decide whether or not a keyframe without SPS is a > > > > > suitable resume point? Or do drivers have to parse and store SPS/PPS if > > > > > the hardware does not support resuming from a keyframe without sending > > > > > SPS/PPS again? > > > > > > > > The thing is that existing drivers implement and user space clients > > > > rely on the behavior described above, so we cannot really change it > > > > anymore. > > > > > > My point is that I'm not exactly sure what that behaviour is, given the > > > description. > > > > > > Must a driver be able to resume from a keyframe even if userspace never > > > pushes SPS/PPS again? > > > If so, I think it should be mentioned more explicitly than just via an > > > example in parentheses, to make it clear to all driver developers that > > > this is a requirement that userspace is going to rely on. > > > > > > Or, if that is not the case, is a driver free to define "SPS only" as > > > its "suitable resume point" and to discard all input including keyframes > > > until the next SPS/PPS is pushed? 
> > > > > > It would be better to clearly define what a "suitable resume point" has > > > to be per codec, and not let the drivers decide for themselves, if at > > > all possible. Otherwise we'd need a away to inform userspace about the > > > per-driver definition. > > > > The intention here is that there is exactly no requirement for the > > user space to seek to any kind of resume point > > No question about this. > > > and so there is no point in defining such. > > I don't agree. Let me give an example: > > Assume userspace wants to play back a simple h.264 stream that has > SPS/PPS exactly once, in the beginning. > > If drivers are allowed to resume from SPS/PPS only, and have no way to > communicate this to userspace, userspace always has to assume that > resuming from keyframes alone is not possible. So it has to store > SPS/PPS and resubmit them with every seek, even if a specific driver > wouldn't require it: Otherwise those drivers that don't store SPS/PPS > themselves (or in hardware) would be allowed to just drop everything > after the first seek. > This effectively would make resending SPS/PPS mandatory, which doesn't > fit well with the intention of letting userspace just seek anywhere and > start feeding data (or: NAL units) into the driver blindly. > I'd say that such video is broken by design, because you cannot play back any arbitrary later part of it without decoding it from the beginning. However, if the hardware keeps SPS/PPS across seeks (and that should normally be the case), the case could be handled by the user space letting the decoder initialize with the first frames and only then seeking, which would probably be the typical case of a user opening a video file and then moving the seek bar to desired position (or clicking a bookmark). If the hardware doesn't keep SPS/PPS across seeks, stateless API could arguably be a better candidate for it, since it mandates the user space to keep SPS/PPS around. 
> > The only requirement here is that the > > hardware/driver keeps processing the source stream until it finds a > > resume point suitable for it - if the hardware keeps SPS/PPS in its > > state then just a keyframe; if it doesn't then SPS/PPS. > > Yes, but the difference between those two might be very relevant to > userspace behaviour. > > > Note that this is a documentation of the user space API, not a driver > > implementation guide. We may want to create the latter separately, > > though. > > This is a good point, I keep switching the perspective from which I look > at this document. > Even for userspace it would make sense to be as specific as possible, > though. Otherwise, doesn't userspace always have to assume the worst? > That's right, a generic user space is expected to handle all the cases possible with the interface it's using. This is precisely why I'd like to avoid introducing the case where user space needs to carry state around. The API is for stateful hardware, which is expected to carry all the needed state around itself. > > H264 is a bit special here, because one may still seek to a key frame, > > but past the relevant SPS/PPS headers. In this case, there is no way > > for the hardware to know that the SPS/PPS it has in its local state is > > not the one that applies to the frame. It may be worth adding that > > such case leads to undefined results, but must not cause crash nor a > > fatal decode error. > > > > What do you think? > > That sounds like a good idea. I haven't thought about seeking over a > SPS/PPS change. Of course userspace must not expect correct results in > this case without providing the new SPS/PPS. > From what I talked with Pawel, our hardware (s5p-mfc, mtk-vcodec) will just notice that the frames refer to a different SPS/PPS (based on seq_parameter_set_id, I assume) and keep dropping frames until the next corresponding header is encountered.
> > > > Do we have hardware for which this wouldn't work to the point that the > > > > driver couldn't even continue with a bunch of frames corrupted? If > > > > only frame corruption is a problem, we can add a control to tell the > > > > user space to seek to resume points and it can happen in an > > > > incremental patch. > > > > > > The coda driver currently can't seek at all, it always stops and > > > restarts the sequence. So depending on the above I might have to either > > > find and store SPS/PPS in software, or figure out how to make the > > > firmware flush the bitstream buffer and restart without actually > > > stopping the sequence. > > > I'm sure the hardware is capable of this, it's more a question of what > > > behaviour is actually intended, and whether I have enough information > > > about the firmware interface to implement it. > > > > What happens if you just keep feeding it with next frames? > > As long as they are well formed, it should just decode them, possibly > with artifacts due to mismatched reference buffers. There is an I-Frame > search mode that should be usable to skip to the next resume point, as > well, so I'm sure coda will end up not needing the > NEEDS_SEEK_TO_RESUME_POINT flag below. I'm just not certain at this > point whether I'll be able to (or: whether I'll have to) keep the > SPS/PPS state across seeks. I have seen so many decoder hangs with > malformed input on i.MX53 that I couldn't recover from, that I'm wary > to make any guarantees without flushing the bitstream buffer first. Based on the above, I believe the answer is that your hardware/driver needs to keep SPS/PPS around. Is there a good way to do it with Coda? We definitely don't want to do any parsing inside the driver. > > > If that would result only in corrupted frames, I suppose the control (say > > V4L2_CID_MPEG_VIDEO_NEEDS_SEEK_TO_RESUME_POINT) would solve the > > problem? 
> > For this to be useful, userspace needs to know what a resume point is in > the first place, though. That would be defined in the context of that control and a particular pixel format, since there is no general, yet precise enough definition that could apply to all codecs. Right now, I would like to defer adding such constraints until there is really hardware which needs it and it can't be handled using the stateless API. Best regards, Tomasz
On Tue, Aug 21, 2018 at 8:29 PM Stanimir Varbanov <stanimir.varbanov@linaro.org> wrote: > > Hi Tomasz, > > On 08/08/2018 05:55 AM, Tomasz Figa wrote: > > On Tue, Aug 7, 2018 at 4:37 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > > >>>>>>> +7. If all the following conditions are met, the client may resume the > >>>>>>> + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with > >>>>>>> + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain > >>>>>>> + sequence: > >>>>>>> + > >>>>>>> + * ``sizeimage`` of new format is less than or equal to the size of > >>>>>>> + currently allocated buffers, > >>>>>>> + > >>>>>>> + * the number of buffers currently allocated is greater than or equal to > >>>>>>> + the minimum number of buffers acquired in step 6. > >>>>>> > >>>>>> You might want to mention that if there are insufficient buffers, then > >>>>>> VIDIOC_CREATE_BUFS can be used to add more buffers. > >>>>>> > >>>>> > >>>>> This might be a bit tricky, since at least s5p-mfc and coda can only > >>>>> work on a fixed buffer set and one would need to fully reinitialize > >>>>> the decoding to add one more buffer, which would effectively be the > >>>>> full resolution change sequence, as below, just with REQBUFS(0), > >>>>> REQBUFS(N) replaced with CREATE_BUFS. > >>>> > >>>> What happens today in those drivers if you try to call CREATE_BUFS? > >>> > >>> s5p-mfc doesn't set the .vidioc_create_bufs pointer in its > >>> v4l2_ioctl_ops, so I suppose that would be -ENOTTY? > >> > >> Correct for s5p-mfc. > > > > As Philipp clarified, coda supports adding buffers on the fly. I > > briefly looked at venus and mtk-vcodec and they seem to use m2m > > implementation of CREATE_BUFS. Not sure if anyone tested that, though. > > So the only hardware I know for sure cannot support this is s5p-mfc. > > In Venus case CREATE_BUFS is tested with Gstreamer. Stanimir: Alright. Thanks for confirmation. 
Hans: Technically, we could still implement CREATE_BUFS for s5p-mfc, but it would need to be restricted to situations where it's possible to reinitialize the whole hardware buffer queue, i.e. - before initial STREAMON(CAPTURE) after header parsing, - after a resolution change and before following STREAMON(CAPTURE) or DECODER_CMD_START (to ack resolution change without buffer reallocation). Would that work for your original suggestion? Best regards, Tomasz
Hi Tomasz, just a few thoughts I came across while writing the stateless codec document: On Tue, Jul 24, 2018 at 11:06 PM Tomasz Figa <tfiga@chromium.org> wrote: [snip] > +**************************************** > +Memory-to-memory Video Decoder Interface > +**************************************** Since we have an m2m stateless decoder interface, can we call this the m2m video *stateful* decoder interface? :) > +Conventions and notation used in this document > +============================================== [snip] > +Glossary > +======== I think these sections apply to both stateless and stateful. How about moving them into dev-codec.rst and mentioning that they apply to the two following sections?
On Fri, Aug 31, 2018 at 5:27 PM Alexandre Courbot <acourbot@chromium.org> wrote: > > Hi Tomasz, just a few thoughts I came across while writing the > stateless codec document: > > On Tue, Jul 24, 2018 at 11:06 PM Tomasz Figa <tfiga@chromium.org> wrote: > [snip] > > +**************************************** > > +Memory-to-memory Video Decoder Interface > > +**************************************** > > Since we have an m2m stateless decoder interface, can we call this the > m2m video *stateful* decoder interface? :) I guess it could make sense indeed. Let's wait for some other opinions, if any. > > > +Conventions and notation used in this document > > +============================================== > [snip] > > +Glossary > > +======== > > I think these sections apply to both stateless and stateful. How about > moving them into dev-codec.rst and mentioning that they apply to the > two following sections? Or maybe we could put them into separate rst files and source them at the top of each interface documentation? Personally, I'm okay with either. On a related note, I'd love to see some kind of glossary lookup on mouse hover, so that I don't have to scroll back and forth. :) Best regards, Tomasz
Hi Hans, On Thu, Jul 26, 2018 at 7:20 PM Tomasz Figa <tfiga@chromium.org> wrote: > > Hi Hans, > > On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > > > > Hi Tomasz, > > > > Many, many thanks for working on this! It's a great document and when done > > it will be very useful indeed. > > > > Review comments follow... > > Thanks for review! > > > > > On 24/07/18 16:06, Tomasz Figa wrote: [snip] > > > +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` > > > + on the ``CAPTURE`` queue. > > > + > > > + * **Required fields:** > > > + > > > + ``count`` > > > + requested number of buffers to allocate; greater than zero > > > + > > > + ``type`` > > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > > + > > > + ``memory`` > > > + follows standard semantics > > > + > > > + * **Return fields:** > > > + > > > + ``count`` > > > + adjusted to allocated number of buffers > > > + > > > + * The driver must adjust count to minimum of required number of > > > + destination buffers for given format and stream configuration and the > > > + count passed. The client must check this value after the ioctl > > > + returns to get the number of buffers allocated. > > > + > > > + .. note:: > > > + > > > + To allocate more than minimum number of buffers (for pipeline > > > + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to > > > + get minimum number of buffers required, and pass the obtained value > > > + plus the number of additional buffers needed in count to > > > + :c:func:`VIDIOC_REQBUFS`. > > > > > > I think we should mention here the option of using VIDIOC_CREATE_BUFS in order > > to allocate buffers larger than the current CAPTURE format in order to accommodate > > future resolution changes. > > Ack. > I'm about to add a paragraph to describe this, but there is one detail to iron out. The VIDIOC_CREATE_BUFS ioctl accepts a v4l2_format struct. 
Userspace needs to fill in this struct and the spec says that "Usually this will be done using the VIDIOC_TRY_FMT or VIDIOC_G_FMT ioctls to ensure that the requested format is supported by the driver." However, in the case of a decoder, those calls would fix up the format to match the currently parsed stream, which would likely resolve to the current coded resolution (~hardware alignment). How do we get a format for the desired maximum resolution? [snip]. > > > + > > > + * The driver is also allowed to and may not return all decoded frames [snip] > > > + queued but not decode before the seek sequence was initiated. For > > > > Very confusing sentence. I think you mean this: > > > > The driver may not return all decoded frames that where ready for > > dequeueing from before the seek sequence was initiated. > > > > Is this really true? Once decoded frames are marked as buffer_done by the > > driver there is no reason for them to be removed. Or you mean something else > > here, e.g. the frames are decoded, but the buffers not yet given back to vb2. > > > > Exactly "the frames are decoded, but the buffers not yet given back to > vb2", for example, if reordering takes place. However, if one stops > streaming before dequeuing all buffers, they are implicitly returned > (reset to the state after REQBUFS) and can't be dequeued anymore, so > the frames are lost, even if the driver returned them. I guess the > sentence was really unfortunate indeed. > Actually, that's not the only case. The documentation is written from the userspace point of view. Queuing an OUTPUT buffer is not equivalent to having it decoded (and a CAPTURE buffer given back to vb2). If userspace queues a buffer and then stops streaming, the buffer might have been still waiting in the queue, for decoding of previous buffers to finish.
So basically by "queued frames" I meant "OUTPUT buffers queued by userspace and not sent to the hardware yet" and by "decoded frames" I meant "CAPTURE buffers containing matching frames given back to vb2". How about rewording like this: * The ``VIDIOC_STREAMOFF`` operation discards any remaining queued ``OUTPUT`` buffers, which means that not all of the ``OUTPUT`` buffers queued before the seek may have matching ``CAPTURE`` buffers produced. For example, [...] > > > + example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B), > > > + STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the > > > + following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’, > > > + H’}, {A’, G’, H’}, {G’, H’}. > > > + > > > + .. note:: > > > + > > > + To achieve instantaneous seek, the client may restart streaming on > > > + ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers. Best regards, Tomasz
On 09/19/2018 12:17 PM, Tomasz Figa wrote: > Hi Hans, > > On Thu, Jul 26, 2018 at 7:20 PM Tomasz Figa <tfiga@chromium.org> wrote: >> >> Hi Hans, >> >> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: >>> >>> Hi Tomasz, >>> >>> Many, many thanks for working on this! It's a great document and when done >>> it will be very useful indeed. >>> >>> Review comments follow... >> >> Thanks for review! >> >>> >>> On 24/07/18 16:06, Tomasz Figa wrote: > [snip] >>>> +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` >>>> + on the ``CAPTURE`` queue. >>>> + >>>> + * **Required fields:** >>>> + >>>> + ``count`` >>>> + requested number of buffers to allocate; greater than zero >>>> + >>>> + ``type`` >>>> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` >>>> + >>>> + ``memory`` >>>> + follows standard semantics >>>> + >>>> + * **Return fields:** >>>> + >>>> + ``count`` >>>> + adjusted to allocated number of buffers >>>> + >>>> + * The driver must adjust count to minimum of required number of >>>> + destination buffers for given format and stream configuration and the >>>> + count passed. The client must check this value after the ioctl >>>> + returns to get the number of buffers allocated. >>>> + >>>> + .. note:: >>>> + >>>> + To allocate more than minimum number of buffers (for pipeline >>>> + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to >>>> + get minimum number of buffers required, and pass the obtained value >>>> + plus the number of additional buffers needed in count to >>>> + :c:func:`VIDIOC_REQBUFS`. >>> >>> >>> I think we should mention here the option of using VIDIOC_CREATE_BUFS in order >>> to allocate buffers larger than the current CAPTURE format in order to accommodate >>> future resolution changes. >> >> Ack. >> > > I'm about to add a paragraph to describe this, but there is one detail > to iron out. > > The VIDIOC_CREATE_BUFS ioctl accepts a v4l2_format struct. 
Userspace > needs to fill in this struct and the specs says that > > "Usually this will be done using the VIDIOC_TRY_FMT or VIDIOC_G_FMT > ioctls to ensure that the requested format is supported by the > driver." > > However, in case of a decoder, those calls would fixup the format to > match the currently parsed stream, which would likely resolve to the > current coded resolution (~hardware alignment). How do we get a format > for the desired maximum resolution? You would call G_FMT to get the current format/resolution, then update width and height and call TRY_FMT. Although to be honest you can also just set pixelformat and width/height and zero everything else and call TRY_FMT directly, skipping the G_FMT ioctl. > > [snip]. >>>> + >>>> + * The driver is also allowed to and may not return all decoded frames > [snip] >>>> + queued but not decode before the seek sequence was initiated. For >>> >>> Very confusing sentence. I think you mean this: >>> >>> The driver may not return all decoded frames that where ready for >>> dequeueing from before the seek sequence was initiated. >>> >>> Is this really true? Once decoded frames are marked as buffer_done by the >>> driver there is no reason for them to be removed. Or you mean something else >>> here, e.g. the frames are decoded, but the buffers not yet given back to vb2. >>> >> >> Exactly "the frames are decoded, but the buffers not yet given back to >> vb2", for example, if reordering takes place. However, if one stops >> streaming before dequeuing all buffers, they are implicitly returned >> (reset to the state after REQBUFS) and can't be dequeued anymore, so >> the frames are lost, even if the driver returned them. I guess the >> sentence was really unfortunate indeed. >> > > Actually, that's not the only case. > > The documentation is written from userspace point of view. Queuing an > OUTPUT buffer is not equivalent to having it decoded (and a CAPTURE > buffer given back to vb2). 
If the userspace queues a buffer and then > stops streaming, the buffer might have been still waiting in the > queue, for decoding of previous buffers to finish. > > So basically by "queued frames" I meant "OUTPUT buffers queued by > userspace and not sent to the hardware yet" and by "decoded frames" I > meant "CAPTURE buffers containing matching frames given back to vb2". > > How about rewording like this: > > * The ``VIDIOC_STREAMOFF`` operation discards any remaining queued > ``OUTPUT`` buffers, which means that not all of the ``OUTPUT`` buffers > queued before the seek may have matching ``CAPTURE`` buffers produced. > For example, [...] That looks correct. Regards, Hans > >>>> + example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B), >>>> + STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the >>>> + following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’, >>>> + H’}, {A’, G’, H’}, {G’, H’}. >>>> + >>>> + .. note:: >>>> + >>>> + To achieve instantaneous seek, the client may restart streaming on >>>> + ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers. > > Best regards, > Tomasz >
On Mon, Oct 8, 2018 at 9:22 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > > On 09/19/2018 12:17 PM, Tomasz Figa wrote: > > Hi Hans, > > > > On Thu, Jul 26, 2018 at 7:20 PM Tomasz Figa <tfiga@chromium.org> wrote: > >> > >> Hi Hans, > >> > >> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote: > >>> > >>> Hi Tomasz, > >>> > >>> Many, many thanks for working on this! It's a great document and when done > >>> it will be very useful indeed. > >>> > >>> Review comments follow... > >> > >> Thanks for review! > >> > >>> > >>> On 24/07/18 16:06, Tomasz Figa wrote: > > [snip] > >>>> +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` > >>>> + on the ``CAPTURE`` queue. > >>>> + > >>>> + * **Required fields:** > >>>> + > >>>> + ``count`` > >>>> + requested number of buffers to allocate; greater than zero > >>>> + > >>>> + ``type`` > >>>> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > >>>> + > >>>> + ``memory`` > >>>> + follows standard semantics > >>>> + > >>>> + * **Return fields:** > >>>> + > >>>> + ``count`` > >>>> + adjusted to allocated number of buffers > >>>> + > >>>> + * The driver must adjust count to minimum of required number of > >>>> + destination buffers for given format and stream configuration and the > >>>> + count passed. The client must check this value after the ioctl > >>>> + returns to get the number of buffers allocated. > >>>> + > >>>> + .. note:: > >>>> + > >>>> + To allocate more than minimum number of buffers (for pipeline > >>>> + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to > >>>> + get minimum number of buffers required, and pass the obtained value > >>>> + plus the number of additional buffers needed in count to > >>>> + :c:func:`VIDIOC_REQBUFS`. > >>> > >>> > >>> I think we should mention here the option of using VIDIOC_CREATE_BUFS in order > >>> to allocate buffers larger than the current CAPTURE format in order to accommodate > >>> future resolution changes. > >> > >> Ack. 
> >> > > > > I'm about to add a paragraph to describe this, but there is one detail > > to iron out. > > > > The VIDIOC_CREATE_BUFS ioctl accepts a v4l2_format struct. Userspace > > needs to fill in this struct and the specs says that > > > > "Usually this will be done using the VIDIOC_TRY_FMT or VIDIOC_G_FMT > > ioctls to ensure that the requested format is supported by the > > driver." > > > > However, in case of a decoder, those calls would fixup the format to > > match the currently parsed stream, which would likely resolve to the > > current coded resolution (~hardware alignment). How do we get a format > > for the desired maximum resolution? > > You would call G_FMT to get the current format/resolution, then update > width and height and call TRY_FMT. > > Although to be honest you can also just set pixelformat and width/height > and zero everything else and call TRY_FMT directly, skipping the G_FMT > ioctl. > Wouldn't TRY_FMT adjust the width and height back to match current stream? Best regards, Tomasz
On 10/09/2018 06:23 AM, Tomasz Figa wrote:
> On Mon, Oct 8, 2018 at 9:22 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>
>> On 09/19/2018 12:17 PM, Tomasz Figa wrote:
>>> Hi Hans,
>>>
>>> On Thu, Jul 26, 2018 at 7:20 PM Tomasz Figa <tfiga@chromium.org> wrote:
>>>>
>>>> Hi Hans,
>>>>
>>>> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>>>>
>>>>> Hi Tomasz,
>>>>>
>>>>> Many, many thanks for working on this! It's a great document and when done
>>>>> it will be very useful indeed.
>>>>>
>>>>> Review comments follow...
>>>>
>>>> Thanks for review!
>>>>
>>>>>
>>>>> On 24/07/18 16:06, Tomasz Figa wrote:
>>> [snip]
>>>>>> +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS`
>>>>>> +    on the ``CAPTURE`` queue.
>>>>>> +
>>>>>> +    * **Required fields:**
>>>>>> +
>>>>>> +      ``count``
>>>>>> +          requested number of buffers to allocate; greater than zero
>>>>>> +
>>>>>> +      ``type``
>>>>>> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
>>>>>> +
>>>>>> +      ``memory``
>>>>>> +          follows standard semantics
>>>>>> +
>>>>>> +    * **Return fields:**
>>>>>> +
>>>>>> +      ``count``
>>>>>> +          adjusted to allocated number of buffers
>>>>>> +
>>>>>> +    * The driver must adjust count to minimum of required number of
>>>>>> +      destination buffers for given format and stream configuration and the
>>>>>> +      count passed. The client must check this value after the ioctl
>>>>>> +      returns to get the number of buffers allocated.
>>>>>> +
>>>>>> +    .. note::
>>>>>> +
>>>>>> +       To allocate more than minimum number of buffers (for pipeline
>>>>>> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to
>>>>>> +       get minimum number of buffers required, and pass the obtained value
>>>>>> +       plus the number of additional buffers needed in count to
>>>>>> +       :c:func:`VIDIOC_REQBUFS`.
>>>>>
>>>>>
>>>>> I think we should mention here the option of using VIDIOC_CREATE_BUFS in order
>>>>> to allocate buffers larger than the current CAPTURE format in order to accommodate
>>>>> future resolution changes.
>>>>
>>>> Ack.
>>>>
>>>
>>> I'm about to add a paragraph to describe this, but there is one detail
>>> to iron out.
>>>
>>> The VIDIOC_CREATE_BUFS ioctl accepts a v4l2_format struct. Userspace
>>> needs to fill in this struct and the specs says that
>>>
>>> "Usually this will be done using the VIDIOC_TRY_FMT or VIDIOC_G_FMT
>>> ioctls to ensure that the requested format is supported by the
>>> driver."
>>>
>>> However, in case of a decoder, those calls would fixup the format to
>>> match the currently parsed stream, which would likely resolve to the
>>> current coded resolution (~hardware alignment). How do we get a format
>>> for the desired maximum resolution?
>>
>> You would call G_FMT to get the current format/resolution, then update
>> width and height and call TRY_FMT.
>>
>> Although to be honest you can also just set pixelformat and width/height
>> and zero everything else and call TRY_FMT directly, skipping the G_FMT
>> ioctl.
>>
>
> Wouldn't TRY_FMT adjust the width and height back to match current stream?

Huh. Hmm. Grrr. Good point and I didn't read your original comment
carefully enough.

Suggestions on a postcard...

Regards,

	Hans
On Wed, Aug 8, 2018 at 11:55 AM Tomasz Figa <tfiga@chromium.org> wrote:
>
> On Tue, Aug 7, 2018 at 4:37 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> >
> > On 08/07/2018 09:05 AM, Tomasz Figa wrote:
> > > On Thu, Jul 26, 2018 at 7:57 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > >>>> I wonder if we should make these min buffer controls required. It might be easier
> > >>>> that way.
> > >>>
> > >>> Agreed. Although userspace is still free to ignore it, because REQBUFS
> > >>> would do the right thing anyway.
> > >>
> > >> It's never been entirely clear to me what the purpose of those min buffers controls
> > >> is. REQBUFS ensures that the number of buffers is at least the minimum needed to
> > >> make the HW work. So why would you need these controls? It only makes sense if they
> > >> return something different from REQBUFS.
> > >>
> > >
> > > The purpose of those controls is to let the client allocate a number
> > > of buffers bigger than minimum, without the need to allocate the
> > > minimum number of buffers first (to just learn the number), free them
> > > and then allocate a bigger number again.
> >
> > I don't feel this is particularly useful. One problem with the minimum number
> > of buffers as used in the kernel is that it is often the minimum number of
> > buffers required to make the hardware work, but it may not be optimal. E.g.
> > quite a few capture drivers set the minimum to 2, which is enough for the
> > hardware, but it will likely lead to dropped frames. You really need 3
> > (one is being DMAed, one is queued and linked into the DMA engine and one is
> > being processed by userspace).
> >
> > I would actually prefer this to be the recommended minimum number of buffers,
> > which is >= the minimum REQBUFS uses.
> >
> > I.e., if you use this number and you have no special requirements, then you'll
> > get good performance.
>
> I guess we could make it so. It would make existing user space request
> more buffers than it used to with the original meaning, but I guess it
> shouldn't be a big problem.

I gave it a bit more thought and I feel like the kernel is not the right
place to put any assumptions about what userspace expects "good
performance" to be. Actually, having these controls return the minimum
number of buffers as REQBUFS would allocate makes it very well
specified - with this number you can only process frame by frame and
the number of buffers added by userspace defines exactly the queue
depth. It leaves no space for driver-specific quirks, because the
driver doesn't decide what's "good performance" anymore.

Best regards,
Tomasz
On Monday, 15 October 2018 at 19:13 +0900, Tomasz Figa wrote:
> On Wed, Aug 8, 2018 at 11:55 AM Tomasz Figa <tfiga@chromium.org> wrote:
> >
> > On Tue, Aug 7, 2018 at 4:37 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > >
> > > On 08/07/2018 09:05 AM, Tomasz Figa wrote:
> > > > On Thu, Jul 26, 2018 at 7:57 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > > > > > > I wonder if we should make these min buffer controls required. It might be easier
> > > > > > > that way.
> > > > > >
> > > > > > Agreed. Although userspace is still free to ignore it, because REQBUFS
> > > > > > would do the right thing anyway.
> > > > >
> > > > > It's never been entirely clear to me what the purpose of those min buffers controls
> > > > > is. REQBUFS ensures that the number of buffers is at least the minimum needed to
> > > > > make the HW work. So why would you need these controls? It only makes sense if they
> > > > > return something different from REQBUFS.
> > > > >
> > > >
> > > > The purpose of those controls is to let the client allocate a number
> > > > of buffers bigger than minimum, without the need to allocate the
> > > > minimum number of buffers first (to just learn the number), free them
> > > > and then allocate a bigger number again.
> > >
> > > I don't feel this is particularly useful. One problem with the minimum number
> > > of buffers as used in the kernel is that it is often the minimum number of
> > > buffers required to make the hardware work, but it may not be optimal. E.g.
> > > quite a few capture drivers set the minimum to 2, which is enough for the
> > > hardware, but it will likely lead to dropped frames. You really need 3
> > > (one is being DMAed, one is queued and linked into the DMA engine and one is
> > > being processed by userspace).
> > >
> > > I would actually prefer this to be the recommended minimum number of buffers,
> > > which is >= the minimum REQBUFS uses.
> > >
> > > I.e., if you use this number and you have no special requirements, then you'll
> > > get good performance.
> >
> > I guess we could make it so. It would make existing user space request
> > more buffers than it used to with the original meaning, but I guess it
> > shouldn't be a big problem.
>
> I gave it a bit more thought and I feel like kernel is not the right
> place to put any assumptions on what the userspace expects "good
> performance" to be. Actually, having these controls return the minimum
> number of buffers as REQBUFS would allocate makes it very well
> specified - with this number you can only process frame by frame and
> the number of buffers added by userspace defines exactly the queue
> depth. It leaves no space for driver-specific quirks, because the
> driver doesn't decide what's "good performance" anymore.

I agree on that, and I would add that the driver making any assumption
would lead to memory waste in contexts where fewer buffers would still
work (think of fence-based operation as an example).

>
> Best regards,
> Tomasz
Hi Tomasz, Thank you for the patch. On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote: > Due to complexity of the video decoding process, the V4L2 drivers of > stateful decoder hardware require specific sequences of V4L2 API calls > to be followed. These include capability enumeration, initialization, > decoding, seek, pause, dynamic resolution change, drain and end of > stream. > > Specifics of the above have been discussed during Media Workshops at > LinuxCon Europe 2012 in Barcelona and then later Embedded Linux > Conference Europe 2014 in Düsseldorf. The de facto Codec API that > originated at those events was later implemented by the drivers we already > have merged in mainline, such as s5p-mfc or coda. > > The only thing missing was the real specification included as a part of > Linux Media documentation. Fix it now and document the decoder part of > the Codec API. > > Signed-off-by: Tomasz Figa <tfiga@chromium.org> > --- > Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++ > Documentation/media/uapi/v4l/devices.rst | 1 + > Documentation/media/uapi/v4l/v4l2.rst | 10 +- > 3 files changed, 882 insertions(+), 1 deletion(-) > create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst > > diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst > b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644 > index 000000000000..f55d34d2f860 > --- /dev/null > +++ b/Documentation/media/uapi/v4l/dev-decoder.rst > @@ -0,0 +1,872 @@ > +.. -*- coding: utf-8; mode: rst -*- > + > +.. _decoder: > + > +**************************************** > +Memory-to-memory Video Decoder Interface > +**************************************** > + > +Input data to a video decoder are buffers containing unprocessed video > +stream (e.g. Annex-B H.264/HEVC stream, raw VP8/9 stream). The driver is > +expected not to require any additional information from the client to > +process these buffers. 
Output data are raw video frames returned in display > +order. > + > +Performing software parsing, processing etc. of the stream in the driver > +in order to support this interface is strongly discouraged. In case such > +operations are needed, use of Stateless Video Decoder Interface (in > +development) is strongly advised. > + > +Conventions and notation used in this document > +============================================== > + > +1. The general V4L2 API rules apply if not specified in this document > + otherwise. > + > +2. The meaning of words “must”, “may”, “should”, etc. is as per RFC > + 2119. > + > +3. All steps not marked “optional” are required. > + > +4. :c:func:`VIDIOC_G_EXT_CTRLS`, :c:func:`VIDIOC_S_EXT_CTRLS` may be used > + interchangeably with :c:func:`VIDIOC_G_CTRL`, :c:func:`VIDIOC_S_CTRL`, > + unless specified otherwise. > + > +5. Single-plane API (see spec) and applicable structures may be used > + interchangeably with Multi-plane API, unless specified otherwise, > + depending on driver capabilities and following the general V4L2 > + guidelines. How about also allowing VIDIOC_CREATE_BUFS where VIDIOC_REQBUFS is mentioned ? > +6. i = [a..b]: sequence of integers from a to b, inclusive, i.e. i = > + [0..2]: i = 0, 1, 2. > + > +7. For ``OUTPUT`` buffer A, A’ represents a buffer on the ``CAPTURE`` queue > + containing data (decoded frame/stream) that resulted from processing + > buffer A. > + > +Glossary > +======== > + > +CAPTURE > + the destination buffer queue; the queue of buffers containing decoded > + frames; ``V4L2_BUF_TYPE_VIDEO_CAPTURE```` or > + ``V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE``; data are captured from the > + hardware into ``CAPTURE`` buffers > + > +client > + application client communicating with the driver implementing this API > + > +coded format > + encoded/compressed video bitstream format (e.g. 
H.264, VP8, etc.); see > + also: raw format > + > +coded height > + height for given coded resolution > + > +coded resolution > + stream resolution in pixels aligned to codec and hardware requirements; > + typically visible resolution rounded up to full macroblocks; > + see also: visible resolution > + > +coded width > + width for given coded resolution > + > +decode order > + the order in which frames are decoded; may differ from display order if > + coded format includes a feature of frame reordering; ``OUTPUT`` buffers > + must be queued by the client in decode order > + > +destination > + data resulting from the decode process; ``CAPTURE`` > + > +display order > + the order in which frames must be displayed; ``CAPTURE`` buffers must be > + returned by the driver in display order > + > +DPB > + Decoded Picture Buffer; a H.264 term for a buffer that stores a picture > + that is encoded or decoded and available for reference in further > + decode/encode steps. By "encoded or decoded", do you mean "raw frames to be encoded (in the encoder use case) or decoded raw frames (in the decoder use case)" ? I think this should be clarified. > +EOS > + end of stream > + > +IDR > + a type of a keyframe in H.264-encoded stream, which clears the list of > + earlier reference frames (DPBs) > + > +keyframe > + an encoded frame that does not reference frames decoded earlier, i.e. > + can be decoded fully on its own. > + > +OUTPUT > + the source buffer queue; the queue of buffers containing encoded > + bitstream; ``V4L2_BUF_TYPE_VIDEO_OUTPUT`` or > + ``V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE``; the hardware is fed with data > + from ``OUTPUT`` buffers > + > +PPS > + Picture Parameter Set; a type of metadata entity in H.264 bitstream > + > +raw format > + uncompressed format containing raw pixel data (e.g. 
YUV, RGB formats) > + > +resume point > + a point in the bitstream from which decoding may start/continue, without > + any previous state/data present, e.g.: a keyframe (VP8/VP9) or + > SPS/PPS/IDR sequence (H.264); a resume point is required to start decode + > of a new stream, or to resume decoding after a seek > + > +source > + data fed to the decoder; ``OUTPUT`` > + > +SPS > + Sequence Parameter Set; a type of metadata entity in H.264 bitstream > + > +visible height > + height for given visible resolution; display height > + > +visible resolution > + stream resolution of the visible picture, in pixels, to be used for > + display purposes; must be smaller or equal to coded resolution; > + display resolution > + > +visible width > + width for given visible resolution; display width > + > +Querying capabilities > +===================== > + > +1. To enumerate the set of coded formats supported by the driver, the > + client may call :c:func:`VIDIOC_ENUM_FMT` on ``OUTPUT``. > + > + * The driver must always return the full set of supported formats, > + irrespective of the format set on the ``CAPTURE``. > + > +2. To enumerate the set of supported raw formats, the client may call > + :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE``. > + > + * The driver must return only the formats supported for the format > + currently active on ``OUTPUT``. > + > + * In order to enumerate raw formats supported by a given coded format, > + the client must first set that coded format on ``OUTPUT`` and then > + enumerate the ``CAPTURE`` queue. Maybe s/enumerate the/enumerate formats on the/ ? > +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported > + resolutions for a given format, passing desired pixel format in > + :c:type:`v4l2_frmsizeenum` ``pixel_format``. > + > + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``OUTPUT`` > + must include all possible coded resolutions supported by the decoder > + for given coded pixel format. 
> + > + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``CAPTURE`` > + must include all possible frame buffer resolutions supported by the > + decoder for given raw pixel format and coded format currently set on > + ``OUTPUT``. > + > + .. note:: > + > + The client may derive the supported resolution range for a > + combination of coded and raw format by setting width and height of > + ``OUTPUT`` format to 0 and calculating the intersection of > + resolutions returned from calls to :c:func:`VIDIOC_ENUM_FRAMESIZES` > + for the given coded and raw formats. I'm confused by the note, I'm not sure to understand what you mean. > +4. Supported profiles and levels for given format, if applicable, may be > + queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`. > + > +Initialization > +============== > + > +1. *[optional]* Enumerate supported ``OUTPUT`` formats and resolutions. See > + capability enumeration. > + > +2. Set the coded format on ``OUTPUT`` via :c:func:`VIDIOC_S_FMT` > + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > + > + ``pixelformat`` > + a coded pixel format > + > + ``width``, ``height`` > + required only if cannot be parsed from the stream for the given > + coded format; optional otherwise - set to zero to ignore > + > + other fields > + follow standard semantics > + > + * For coded formats including stream resolution information, if width > + and height are set to non-zero values, the driver will propagate the > + resolution to ``CAPTURE`` and signal a source change event > + instantly. Maybe s/instantly/immediately before returning from :c:func:`VIDIOC_S_FMT`/ ? > However, after the decoder is done parsing the > + information embedded in the stream, it will update ``CAPTURE`` s/update/update the/ > + format with new values and signal a source change event again, if s/, if/ if/ > + the values do not match. > + > + .. 
note:: > + > + Changing ``OUTPUT`` format may change currently set ``CAPTURE`` Do you have a particular dislike for definite articles ? :-) I would have written "Changing the ``OUTPUT`` format may change the currently set ``CAPTURE`` ...". I won't repeat the comment through the whole review, but many places seem to be missing a definite article. > + format. The driver will derive a new ``CAPTURE`` format from > + ``OUTPUT`` format being set, including resolution, colorimetry > + parameters, etc. If the client needs a specific ``CAPTURE`` format, > + it must adjust it afterwards. > + > +3. *[optional]* Get minimum number of buffers required for ``OUTPUT`` > + queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to > + use more buffers than minimum required by hardware/format. > + > + * **Required fields:** > + > + ``id`` > + set to ``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT`` > + > + * **Return fields:** > + > + ``value`` > + required number of ``OUTPUT`` buffers for the currently set > + format s/required/required minimum/ > + > +4. Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on > + ``OUTPUT``. > + > + * **Required fields:** > + > + ``count`` > + requested number of buffers to allocate; greater than zero > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > + > + ``memory`` > + follows standard semantics > + > + ``sizeimage`` > + follows standard semantics; the client is free to choose any > + suitable size, however, it may be subject to change by the > + driver > + > + * **Return fields:** > + > + ``count`` > + actual number of buffers allocated > + > + * The driver must adjust count to minimum of required number of > + ``OUTPUT`` buffers for given format and count passed. Isn't it the maximum, not the minimum ? > The client must > + check this value after the ioctl returns to get the number of > + buffers allocated. > + > + .. 
note:: > + > + To allocate more than minimum number of buffers (for pipeline > + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to > + get minimum number of buffers required by the driver/format, > + and pass the obtained value plus the number of additional > + buffers needed in count to :c:func:`VIDIOC_REQBUFS`. > + > +5. Start streaming on ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`. > + > +6. This step only applies to coded formats that contain resolution > + information in the stream. Continue queuing/dequeuing bitstream > + buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and > + :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning > + each buffer to the client until required metadata to configure the > + ``CAPTURE`` queue are found. This is indicated by the driver sending > + a ``V4L2_EVENT_SOURCE_CHANGE`` event with > + ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no > + requirement to pass enough data for this to occur in the first buffer > + and the driver must be able to process any number. > + > + * If data in a buffer that triggers the event is required to decode > + the first frame, the driver must not return it to the client, > + but must retain it for further decoding. > + > + * If the client set width and height of ``OUTPUT`` format to 0, calling > + :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM, > + until the driver configures ``CAPTURE`` format according to stream > + metadata. That's a pretty harsh handling for this condition. What's the rationale for returning -EPERM instead of for instance succeeding with width and height set to 0 ? > + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and > + the event is signaled, the decoding process will not continue until > + it is acknowledged by either (re-)starting streaming on ``CAPTURE``, > + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` > + command. > + > + .. 
note:: > + > + No decoded frames are produced during this phase. > + > +7. This step only applies to coded formats that contain resolution > + information in the stream. > + Receive and handle ``V4L2_EVENT_SOURCE_CHANGE`` from the driver > + via :c:func:`VIDIOC_DQEVENT`. The driver must send this event once > + enough data is obtained from the stream to allocate ``CAPTURE`` > + buffers and to begin producing decoded frames. Doesn't the last sentence belong to step 6 (where it's already explained to some extent) ? > + > + * **Required fields:** > + > + ``type`` > + set to ``V4L2_EVENT_SOURCE_CHANGE`` Isn't the type field set by the driver ? > + * **Return fields:** > + > + ``u.src_change.changes`` > + set to ``V4L2_EVENT_SRC_CH_RESOLUTION`` > + > + * Any client query issued after the driver queues the event must return > + values applying to the just parsed stream, including queue formats, > + selection rectangles and controls. To align with the wording used so far, I would say that "the driver must" return values applying to the just parsed stream. I think I would also move this to step 6, as it's related to queuing the event, not dequeuing it. > +8. Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for the > + destination buffers parsed/decoded from the bitstream. > + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > + > + * **Return fields:** > + > + ``width``, ``height`` > + frame buffer resolution for the decoded frames > + > + ``pixelformat`` > + pixel format for decoded frames > + > + ``num_planes`` (for _MPLANE ``type`` only) > + number of planes for pixelformat > + > + ``sizeimage``, ``bytesperline`` > + as per standard semantics; matching frame buffer format > + > + .. note:: > + > + The value of ``pixelformat`` may be any pixel format supported and > + must be supported for current stream, based on the information > + parsed from the stream and hardware capabilities. 
It is suggested > + that driver chooses the preferred/optimal format for given In compliance with RFC 2119, how about using "Drivers should choose" instead of "It is suggested that driver chooses" ? > + configuration. For example, a YUV format may be preferred over an > + RGB format, if additional conversion step would be required. > + > +9. *[optional]* Enumerate ``CAPTURE`` formats via > + :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE`` queue. Once the stream > + information is parsed and known, the client may use this ioctl to > + discover which raw formats are supported for given stream and select on s/select on/select one/ > + of them via :c:func:`VIDIOC_S_FMT`. > + > + .. note:: > + > + The driver will return only formats supported for the current stream > + parsed in this initialization sequence, even if more formats may be > + supported by the driver in general. > + > + For example, a driver/hardware may support YUV and RGB formats for > + resolutions 1920x1088 and lower, but only YUV for higher > + resolutions (due to hardware limitations). After parsing > + a resolution of 1920x1088 or lower, :c:func:`VIDIOC_ENUM_FMT` may > + return a set of YUV and RGB pixel formats, but after parsing > + resolution higher than 1920x1088, the driver will not return RGB, > + unsupported for this resolution. > + > + However, subsequent resolution change event triggered after > + discovering a resolution change within the same stream may switch > + the stream into a lower resolution and :c:func:`VIDIOC_ENUM_FMT` > + would return RGB formats again in that case. > + > +10. *[optional]* Choose a different ``CAPTURE`` format than suggested via > + :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for the > + client to choose a different format than selected/suggested by the And here, "A client may choose" ? > + driver in :c:func:`VIDIOC_G_FMT`. 
> + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > + > + ``pixelformat`` > + a raw pixel format > + > + .. note:: > + > + Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently available > + formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful to > + find out a set of allowed formats for given configuration, but not > + required, if the client can accept the defaults. s/required/required,/ > + > +11. *[optional]* Acquire visible resolution via > + :c:func:`VIDIOC_G_SELECTION`. > + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > + > + ``target`` > + set to ``V4L2_SEL_TGT_COMPOSE`` > + > + * **Return fields:** > + > + ``r.left``, ``r.top``, ``r.width``, ``r.height`` > + visible rectangle; this must fit within frame buffer resolution > + returned by :c:func:`VIDIOC_G_FMT`. > + > + * The driver must expose following selection targets on ``CAPTURE``: > + > + ``V4L2_SEL_TGT_CROP_BOUNDS`` > + corresponds to coded resolution of the stream > + > + ``V4L2_SEL_TGT_CROP_DEFAULT`` > + a rectangle covering the part of the frame buffer that contains > + meaningful picture data (visible area); width and height will be > + equal to visible resolution of the stream > + > + ``V4L2_SEL_TGT_CROP`` > + rectangle within coded resolution to be output to ``CAPTURE``; > + defaults to ``V4L2_SEL_TGT_CROP_DEFAULT``; read-only on hardware > + without additional compose/scaling capabilities > + > + ``V4L2_SEL_TGT_COMPOSE_BOUNDS`` > + maximum rectangle within ``CAPTURE`` buffer, which the cropped > + frame can be output into; equal to ``V4L2_SEL_TGT_CROP``, if the > + hardware does not support compose/scaling > + > + ``V4L2_SEL_TGT_COMPOSE_DEFAULT`` > + equal to ``V4L2_SEL_TGT_CROP`` > + > + ``V4L2_SEL_TGT_COMPOSE`` > + rectangle inside ``OUTPUT`` buffer into which the cropped frame s/OUTPUT/CAPTURE/ ? 
> + is output; defaults to ``V4L2_SEL_TGT_COMPOSE_DEFAULT``; and "is captured" or "is written" ? > + read-only on hardware without additional compose/scaling > + capabilities > + > + ``V4L2_SEL_TGT_COMPOSE_PADDED`` > + rectangle inside ``OUTPUT`` buffer which is overwritten by the Here too ? > + hardware; equal to ``V4L2_SEL_TGT_COMPOSE``, if the hardware s/, if/ if/ > + does not write padding pixels > + > +12. *[optional]* Get minimum number of buffers required for ``CAPTURE`` > + queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to > + use more buffers than minimum required by hardware/format. > + > + * **Required fields:** > + > + ``id`` > + set to ``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE`` > + > + * **Return fields:** > + > + ``value`` > + minimum number of buffers required to decode the stream parsed in > + this initialization sequence. > + > + .. note:: > + > + Note that the minimum number of buffers must be at least the number > + required to successfully decode the current stream. This may for > + example be the required DPB size for an H.264 stream given the > + parsed stream configuration (resolution, level). > + > +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` > + on the ``CAPTURE`` queue. > + > + * **Required fields:** > + > + ``count`` > + requested number of buffers to allocate; greater than zero > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > + > + ``memory`` > + follows standard semantics > + > + * **Return fields:** > + > + ``count`` > + adjusted to allocated number of buffers > + > + * The driver must adjust count to minimum of required number of s/minimum/maximum/ ? Should we also mentioned that if count > minimum, the driver may additionally limit the number of buffers based on internal limits (such as maximum memory consumption) ? > + destination buffers for given format and stream configuration and the > + count passed. 
The client must check this value after the ioctl > + returns to get the number of buffers allocated. > + > + .. note:: > + > + To allocate more than minimum number of buffers (for pipeline > + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to > + get minimum number of buffers required, and pass the obtained value > + plus the number of additional buffers needed in count to > + :c:func:`VIDIOC_REQBUFS`. > + > +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. > + > +Decoding > +======== > + > +This state is reached after a successful initialization sequence. In this > +state, client queues and dequeues buffers to both queues via > +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard > +semantics. > + > +Both queues operate independently, following standard behavior of V4L2 > +buffer queues and memory-to-memory devices. In addition, the order of > +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of > +queuing coded frames to ``OUTPUT`` queue, due to properties of selected > +coded format, e.g. frame reordering. The client must not assume any direct > +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than > +reported by :c:type:`v4l2_buffer` ``timestamp`` field. > + > +The contents of source ``OUTPUT`` buffers depend on active coded pixel > +format and might be affected by codec-specific extended controls, as stated s/might/may/ > +in documentation of each format individually. > + > +The client must not assume any direct relationship between ``CAPTURE`` > +and ``OUTPUT`` buffers and any specific timing of buffers becoming > +available to dequeue. Specifically: > + > +* a buffer queued to ``OUTPUT`` may result in no buffers being produced > + on ``CAPTURE`` (e.g. 
if it does not contain encoded data, or if only > + metadata syntax structures are present in it), > + > +* a buffer queued to ``OUTPUT`` may result in more than 1 buffer produced > + on ``CAPTURE`` (if the encoded data contained more than one frame, or if > + returning a decoded frame allowed the driver to return a frame that > + preceded it in decode, but succeeded it in display order), > + > +* a buffer queued to ``OUTPUT`` may result in a buffer being produced on > + ``CAPTURE`` later into decode process, and/or after processing further > + ``OUTPUT`` buffers, or be returned out of order, e.g. if display > + reordering is used, > + > +* buffers may become available on the ``CAPTURE`` queue without additional s/buffers/Buffers/ > + buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of > + ``OUTPUT`` buffers being queued in the past and decoding result of which > + being available only at later time, due to specifics of the decoding > + process. I understand what you mean, but the wording is weird to my eyes. How about * Buffers may become available on the ``CAPTURE`` queue without additional buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of ``OUTPUT`` buffers queued in the past whose decoding results are only available at later time, due to specifics of the decoding process. > +Seek > +==== > + > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected. I assume that a seek may result in a source resolution change event, in which case the capture queue will be affected. How about stating here that controlling seek doesn't require any specific operation on the capture queue, but that the capture queue may be affected as per normal decoder operation ? We may also want to mention the event as an example. > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via > + :c:func:`VIDIOC_STREAMOFF`. 
> + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > + > + * The driver must drop all the pending ``OUTPUT`` buffers and they are > + treated as returned to the client (following standard semantics). > + > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON` > + > + * **Required fields:** > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > + > + * The driver must be put in a state after seek and be ready to What do you mean by "a state after seek" ? > + accept new source bitstream buffers. > + > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after > + the seek until a suitable resume point is found. > + > + .. note:: > + > + There is no requirement to begin queuing stream starting exactly from s/stream/buffers/ ? > + a resume point (e.g. SPS or a keyframe). The driver must handle any > + data queued and must keep processing the queued buffers until it > + finds a suitable resume point. While looking for a resume point, the > + driver processes ``OUTPUT`` buffers and returns them to the client > + without producing any decoded frames. > + > + For hardware known to be mishandling seeks to a non-resume point, > + e.g. by returning corrupted decoded frames, the driver must be able > + to handle such seeks without a crash or any fatal decode error. This should be true for any hardware, there should never be any crash or fatal decode error. I'd write it as Some hardware is known to mishandle seeks to a non-resume point. Such an operation may result in an unspecified number of corrupted decoded frames being made available on ``CAPTURE``. Drivers must ensure that no fatal decoding errors or crashes occur, and implement any necessary handling and work-arounds for hardware issues related to seek operations. > +4. After a resume point is found, the driver will start returning > + ``CAPTURE`` buffers with decoded frames. 
> + > + * There is no precise specification for ``CAPTURE`` queue of when it > + will start producing buffers containing decoded data from buffers > + queued after the seek, as it operates independently > + from ``OUTPUT`` queue. > + > + * The driver is allowed to and may return a number of remaining s/is allowed to and may/may/ > + ``CAPTURE`` buffers containing decoded frames from before the seek > + after the seek sequence (STREAMOFF-STREAMON) is performed. Shouldn't all these buffers be returned when STREAMOFF is called on the OUTPUT side ? > + * The driver is also allowed to and may not return all decoded frames s/is also allowed to and may not return/may also not return/ > + queued but not decode before the seek sequence was initiated. For s/not decode/not decoded/ > + example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B), > + STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the > + following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’, > + H’}, {A’, G’, H’}, {G’, H’}. Related to the previous point, shouldn't this be moved to step 1 ? > + .. note:: > + > + To achieve instantaneous seek, the client may restart streaming on > + ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers. > + > +Pause > +===== > + > +In order to pause, the client should just cease queuing buffers onto the > +``OUTPUT`` queue. This is different from the general V4L2 API definition of > +pause, which involves calling :c:func:`VIDIOC_STREAMOFF` on the queue. > +Without source bitstream data, there is no data to process and the > +hardware remains idle. > + > +Conversely, using :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue indicates > +a seek, which > + > +1. drops all ``OUTPUT`` buffers in flight and > +2. after a subsequent :c:func:`VIDIOC_STREAMON`, will look for and only > + continue from a resume point. > + > +This is usually undesirable for pause. The STREAMOFF-STREAMON sequence is > +intended for seeking. 
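For readers following along, the pause-vs-seek distinction above can be sketched as a tiny client-side state model. Everything below is illustrative only (the enum and helper names are mine, not part of the V4L2 API), but it captures the key difference: ceasing to queue preserves decode state, while STREAMOFF drops pending ``OUTPUT`` buffers and forces a search for a resume point.

```c
#include <stddef.h>

/* Simplified client-side model of the OUTPUT queue semantics described
 * above. All names here are illustrative, not part of the V4L2 API. */

enum dec_state {
	DEC_DECODING,	/* buffers queued, hardware busy */
	DEC_PAUSED,	/* client stopped queuing; decode state retained */
	DEC_SEEKING,	/* after STREAMOFF: pending buffers dropped,
			 * decoder must find a resume point */
};

struct dec_model {
	enum dec_state state;
	size_t pending_output_bufs;	/* OUTPUT buffers owned by the driver */
};

/* Pause: the client simply stops calling QBUF on OUTPUT. Nothing is
 * dropped; decode state is preserved. */
static void model_pause(struct dec_model *d)
{
	d->state = DEC_PAUSED;
}

/* STREAMOFF on OUTPUT: all in-flight OUTPUT buffers are returned to the
 * client and decode state is discarded -- this begins a seek. */
static void model_streamoff_output(struct dec_model *d)
{
	d->pending_output_bufs = 0;
	d->state = DEC_SEEKING;
}

/* Queuing a buffer resumes decoding from a pause; while seeking, data
 * before a resume point is consumed without producing frames (not
 * modeled here). */
static void model_qbuf_output(struct dec_model *d)
{
	d->pending_output_bufs++;
	if (d->state == DEC_PAUSED)
		d->state = DEC_DECODING;
}
```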
> + > +Similarly, ``CAPTURE`` queue should remain streaming as well, as the > +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer > +sets. And also to drop decoded buffers for instant seek ? > +Dynamic resolution change > +========================= > + > +A video decoder implementing this interface must support dynamic resolution > +change, for streams, which include resolution metadata in the bitstream. s/for streams, which/for streams that/ > +When the decoder encounters a resolution change in the stream, the dynamic > +resolution change sequence is started. > + > +1. After encountering a resolution change in the stream, the driver must > + first process and decode all remaining buffers from before the > + resolution change point. > + > +2. After all buffers containing decoded frames from before the resolution > + change point are ready to be dequeued on the ``CAPTURE`` queue, the > + driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change > + type ``V4L2_EVENT_SRC_CH_RESOLUTION``. > + > + * The last buffer from before the change must be marked with > + :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the > + drain sequence. The last buffer might be empty (with > + :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by the > + client, since it does not contain any decoded frame. > + > + * Any client query issued after the driver queues the event must return > + values applying to the stream after the resolution change, including > + queue formats, selection rectangles and controls. > + > + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and > + the event is signaled, the decoding process will not continue until > + it is acknowledged by either (re-)starting streaming on ``CAPTURE``, > + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` > + command. This usage of V4L2_DEC_CMD_START isn't aligned with the documentation of the command. 
I'm not opposed to this, but I think the use cases of decoder commands for codecs should be explained in the VIDIOC_DECODER_CMD documentation. What bothers me in particular is usage of V4L2_DEC_CMD_START to restart the decoder, while no V4L2_DEC_CMD_STOP has been issued. Should we add a section that details the decoder state machine with the implicit and explicit ways in which it is started and stopped ? I would also reference step 7 here. > + .. note:: > + > + Any attempts to dequeue more buffers beyond the buffer marked > + with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from > + :c:func:`VIDIOC_DQBUF`. > + > +3. The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the new > + format information. This is identical to calling :c:func:`VIDIOC_G_FMT` > + after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in the initialization sequence > + and should be handled similarly. As the source resolution change event is mentioned in multiple places, how about extracting the related ioctls sequence to a specific section, and referencing it where needed (at least from the initialization sequence and here) ? > + .. note:: > + > + It is allowed for the driver not to support the same pixel format as "Drivers may not support ..." > + previously used (before the resolution change) for the new > + resolution. The driver must select a default supported pixel format, > + return it, if queried using :c:func:`VIDIOC_G_FMT`, and the client > + must take note of it. > + > +4. The client acquires visible resolution as in initialization sequence. > + > +5. *[optional]* The client is allowed to enumerate available formats and s/is allowed to/may/ > + select a different one than currently chosen (returned via > + :c:func:`VIDIOC_G_FMT)`. This is identical to a corresponding step in > + the initialization sequence. > + > +6. *[optional]* The client acquires minimum number of buffers as in > + initialization sequence. > + > +7. 
If all the following conditions are met, the client may resume the > + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with > + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain > + sequence: > + > + * ``sizeimage`` of new format is less than or equal to the size of > + currently allocated buffers, > + > + * the number of buffers currently allocated is greater than or equal to > + the minimum number of buffers acquired in step 6. > + > + In such case, the remaining steps do not apply. > + > + However, if the client intends to change the buffer set, to lower > + memory usage or for any other reasons, it may be achieved by following > + the steps below. > + > +8. After dequeuing all remaining buffers from the ``CAPTURE`` queue, This is optional, isn't it ? > the > + client must call :c:func:`VIDIOC_STREAMOFF` on the ``CAPTURE`` queue. > + The ``OUTPUT`` queue must remain streaming (calling STREAMOFF on it :c:func:`VIDIOC_STREAMOFF` > + would trigger a seek). > + > +9. The client frees the buffers on the ``CAPTURE`` queue using > + :c:func:`VIDIOC_REQBUFS`. > + > + * **Required fields:** > + > + ``count`` > + set to 0 > + > + ``type`` > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > + > + ``memory`` > + follows standard semantics > + > +10. The client allocates a new set of buffers for the ``CAPTURE`` queue via > + :c:func:`VIDIOC_REQBUFS`. This is identical to a corresponding step in > + the initialization sequence. > + > +11. The client resumes decoding by issuing :c:func:`VIDIOC_STREAMON` on the > + ``CAPTURE`` queue. > + > +During the resolution change sequence, the ``OUTPUT`` queue must remain > +streaming. Calling :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue would > +initiate a seek. > + > +The ``OUTPUT`` queue operates separately from the ``CAPTURE`` queue for the > +duration of the entire resolution change sequence. 
It is allowed (and > +recommended for best performance and simplicity) for the client to keep "The client should (for best performance and simplicity) keep ..." > +queuing/dequeuing buffers from/to ``OUTPUT`` queue even while processing s/from\/to/to\/from/ > +this sequence. > + > +.. note:: > + > + It is also possible for this sequence to be triggered without a change "This sequence may be triggered ..." > + in coded resolution, if a different number of ``CAPTURE`` buffers is > + required in order to continue decoding the stream or the visible > + resolution changes. > + > +Drain > +===== > + > +To ensure that all queued ``OUTPUT`` buffers have been processed and > +related ``CAPTURE`` buffers output to the client, the following drain > +sequence may be followed. After the drain sequence is complete, the client > +has received all decoded frames for all ``OUTPUT`` buffers queued before > +the sequence was started. > + > +1. Begin drain by issuing :c:func:`VIDIOC_DECODER_CMD`. > + > + * **Required fields:** > + > + ``cmd`` > + set to ``V4L2_DEC_CMD_STOP`` > + > + ``flags`` > + set to 0 > + > + ``pts`` > + set to 0 > + > +2. The driver must process and decode as normal all ``OUTPUT`` buffers > + queued by the client before the :c:func:`VIDIOC_DECODER_CMD` was issued. > + Any operations triggered as a result of processing these buffers > + (including the initialization and resolution change sequences) must be > + processed as normal by both the driver and the client before proceeding > + with the drain sequence. > + > +3. Once all ``OUTPUT`` buffers queued before ``V4L2_DEC_CMD_STOP`` are > + processed: > + > + * If the ``CAPTURE`` queue is streaming, once all decoded frames (if > + any) are ready to be dequeued on the ``CAPTURE`` queue, the driver > + must send a ``V4L2_EVENT_EOS``. s/\./event./ Is the event sent on the OUTPUT or CAPTURE queue ? I assume the latter, should it be explicitly documented ? 
> The driver must also set > + ``V4L2_BUF_FLAG_LAST`` in :c:type:`v4l2_buffer` ``flags`` field on the > + buffer on the ``CAPTURE`` queue containing the last frame (if any) > + produced as a result of processing the ``OUTPUT`` buffers queued > + before ``V4L2_DEC_CMD_STOP``. If no more frames are left to be > + returned at the point of handling ``V4L2_DEC_CMD_STOP``, the driver > + must return an empty buffer (with :c:type:`v4l2_buffer` > + ``bytesused`` = 0) as the last buffer with ``V4L2_BUF_FLAG_LAST`` set > + instead. Any attempts to dequeue more buffers beyond the buffer marked > + with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from > + :c:func:`VIDIOC_DQBUF`. > + > + * If the ``CAPTURE`` queue is NOT streaming, no action is necessary for > + ``CAPTURE`` queue and the driver must send a ``V4L2_EVENT_EOS`` > + immediately after all ``OUTPUT`` buffers in question have been > + processed. What is the use case for this ? Can't we just return an error if decoder isn't streaming ? > +4. At this point, decoding is paused and the driver will accept, but not > + process any newly queued ``OUTPUT`` buffers until the client issues > + ``V4L2_DEC_CMD_START`` or restarts streaming on any queue. > + > +* Once the drain sequence is initiated, the client needs to drive it to > + completion, as described by the above steps, unless it aborts the process > + by issuing :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue. The client > + is not allowed to issue ``V4L2_DEC_CMD_START`` or ``V4L2_DEC_CMD_STOP`` > + again while the drain sequence is in progress and they will fail with > + -EBUSY error code if attempted. While this seems OK to me, I think drivers will need help to implement all the corner cases correctly without race conditions. > +* Restarting streaming on ``OUTPUT`` queue will implicitly end the paused > + state and reinitialize the decoder (similarly to the seek sequence). > + Restarting ``CAPTURE`` queue will not affect an in-progress drain > + sequence. 
> + > +* The drivers must also implement :c:func:`VIDIOC_TRY_DECODER_CMD`, as a > + way to let the client query the availability of decoder commands. > + > +End of stream > +============= > + > +If the decoder encounters an end of stream marking in the stream, the > +driver must send a ``V4L2_EVENT_EOS`` event On which queue ? > to the client after all frames > +are decoded and ready to be dequeued on the ``CAPTURE`` queue, with the > +:c:type:`v4l2_buffer` ``flags`` set to ``V4L2_BUF_FLAG_LAST``. This > +behavior is identical to the drain sequence triggered by the client via > +``V4L2_DEC_CMD_STOP``. > + > +Commit points > +============= > + > +Setting formats and allocating buffers triggers changes in the behavior s/triggers/trigger/ > +of the driver. > + > +1. Setting format on ``OUTPUT`` queue may change the set of formats > + supported/advertised on the ``CAPTURE`` queue. In particular, it also > + means that ``CAPTURE`` format may be reset and the client must not > + rely on the previously set format being preserved. > + > +2. Enumerating formats on ``CAPTURE`` queue must only return formats > + supported for the ``OUTPUT`` format currently set. > + > +3. Setting/changing format on ``CAPTURE`` queue does not change formats Why not just "Setting format" ? > + available on ``OUTPUT`` queue. An attempt to set ``CAPTURE`` format that > + is not supported for the currently selected ``OUTPUT`` format must > + result in the driver adjusting the requested format to an acceptable > + one. > + > +4. Enumerating formats on ``OUTPUT`` queue always returns the full set of > + supported coded formats, irrespective of the current ``CAPTURE`` > + format. > + > +5. After allocating buffers on the ``OUTPUT`` queue, it is not possible to > + change format on it. I'd phrase this as "While buffers are allocated on the ``OUTPUT`` queue, clients must not change the format on the queue. Drivers must return <error code> for any such format change attempt." 
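The commit-point rules might be easier to discuss with a toy model in front of us. This is not driver code; the format names are arbitrary, and the error value for rule 5 is a placeholder, since the exact code is still open above.

```c
#include <string.h>

/* Illustrative model of the "commit points" rules quoted above. The
 * formats and the negative return value are placeholders, not actual
 * driver behavior. */

struct fmt_state {
	char output_fmt[16];	/* coded format, e.g. "H264" */
	char capture_fmt[16];	/* raw format, e.g. "NV12" */
	int output_bufs_allocated;
};

/* Rules 1 and 5: setting the OUTPUT format resets the CAPTURE format to
 * a driver-chosen default, and is refused once OUTPUT buffers are
 * allocated. */
static int model_s_fmt_output(struct fmt_state *s, const char *coded)
{
	if (s->output_bufs_allocated)
		return -1;	/* placeholder error; exact code under discussion */
	strcpy(s->output_fmt, coded);
	strcpy(s->capture_fmt, "NV12");	/* driver-chosen default */
	return 0;
}

/* Rule 3: an unsupported CAPTURE format is adjusted, never rejected.
 * For the sketch, pretend only NV12 and YU12 suit every coded format. */
static int model_s_fmt_capture(struct fmt_state *s, const char *raw)
{
	if (strcmp(raw, "NV12") && strcmp(raw, "YU12"))
		raw = "NV12";	/* adjust to a supported format */
	strcpy(s->capture_fmt, raw);
	return 0;
}
```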
> + > +To summarize, setting formats and allocation must always start with the > +``OUTPUT`` queue and the ``OUTPUT`` queue is the master that governs the > +set of supported formats for the ``CAPTURE`` queue. [snip]
Hi Laurent, On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote: > > Hi Tomasz, > > Thank you for the patch. Thanks for your comments! Please see my replies inline. > > On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote: > > Due to complexity of the video decoding process, the V4L2 drivers of > > stateful decoder hardware require specific sequences of V4L2 API calls > > to be followed. These include capability enumeration, initialization, > > decoding, seek, pause, dynamic resolution change, drain and end of > > stream. > > > > Specifics of the above have been discussed during Media Workshops at > > LinuxCon Europe 2012 in Barcelona and then later Embedded Linux > > Conference Europe 2014 in Düsseldorf. The de facto Codec API that > > originated at those events was later implemented by the drivers we already > > have merged in mainline, such as s5p-mfc or coda. > > > > The only thing missing was the real specification included as a part of > > Linux Media documentation. Fix it now and document the decoder part of > > the Codec API. > > > > Signed-off-by: Tomasz Figa <tfiga@chromium.org> > > --- > > Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++ > > Documentation/media/uapi/v4l/devices.rst | 1 + > > Documentation/media/uapi/v4l/v4l2.rst | 10 +- > > 3 files changed, 882 insertions(+), 1 deletion(-) > > create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst > > > > diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst > > b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644 > > index 000000000000..f55d34d2f860 > > --- /dev/null > > +++ b/Documentation/media/uapi/v4l/dev-decoder.rst > > @@ -0,0 +1,872 @@ > > +.. -*- coding: utf-8; mode: rst -*- > > + > > +.. 
_decoder: > > + > > +**************************************** > > +Memory-to-memory Video Decoder Interface > > +**************************************** > > + > > +Input data to a video decoder are buffers containing unprocessed video > > +stream (e.g. Annex-B H.264/HEVC stream, raw VP8/9 stream). The driver is > > +expected not to require any additional information from the client to > > +process these buffers. Output data are raw video frames returned in display > > +order. > > + > > +Performing software parsing, processing etc. of the stream in the driver > > +in order to support this interface is strongly discouraged. In case such > > +operations are needed, use of Stateless Video Decoder Interface (in > > +development) is strongly advised. > > + > > +Conventions and notation used in this document > > +============================================== > > + > > +1. The general V4L2 API rules apply if not specified in this document > > + otherwise. > > + > > +2. The meaning of words “must”, “may”, “should”, etc. is as per RFC > > + 2119. > > + > > +3. All steps not marked “optional” are required. > > + > > +4. :c:func:`VIDIOC_G_EXT_CTRLS`, :c:func:`VIDIOC_S_EXT_CTRLS` may be used > > + interchangeably with :c:func:`VIDIOC_G_CTRL`, :c:func:`VIDIOC_S_CTRL`, > > + unless specified otherwise. > > + > > +5. Single-plane API (see spec) and applicable structures may be used > > + interchangeably with Multi-plane API, unless specified otherwise, > > + depending on driver capabilities and following the general V4L2 > > + guidelines. > > How about also allowing VIDIOC_CREATE_BUFS where VIDIOC_REQBUFS is mentioned ? > In my draft of v2, I explicitly described VIDIOC_CREATE_BUFS in any step mentioning VIDIOC_REQBUFS. Do you think that's fine too? > > +6. i = [a..b]: sequence of integers from a to b, inclusive, i.e. i = > > + [0..2]: i = 0, 1, 2. > > + > > +7. 
For ``OUTPUT`` buffer A, A’ represents a buffer on the ``CAPTURE`` queue > > + containing data (decoded frame/stream) that resulted from processing + > > buffer A. > > + > > +Glossary > > +======== > > + > > +CAPTURE > > + the destination buffer queue; the queue of buffers containing decoded > > + frames; ``V4L2_BUF_TYPE_VIDEO_CAPTURE```` or > > + ``V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE``; data are captured from the > > + hardware into ``CAPTURE`` buffers > > + > > +client > > + application client communicating with the driver implementing this API > > + > > +coded format > > + encoded/compressed video bitstream format (e.g. H.264, VP8, etc.); see > > + also: raw format > > + > > +coded height > > + height for given coded resolution > > + > > +coded resolution > > + stream resolution in pixels aligned to codec and hardware requirements; > > + typically visible resolution rounded up to full macroblocks; > > + see also: visible resolution > > + > > +coded width > > + width for given coded resolution > > + > > +decode order > > + the order in which frames are decoded; may differ from display order if > > + coded format includes a feature of frame reordering; ``OUTPUT`` buffers > > + must be queued by the client in decode order > > + > > +destination > > + data resulting from the decode process; ``CAPTURE`` > > + > > +display order > > + the order in which frames must be displayed; ``CAPTURE`` buffers must be > > + returned by the driver in display order > > + > > +DPB > > + Decoded Picture Buffer; a H.264 term for a buffer that stores a picture > > + that is encoded or decoded and available for reference in further > > + decode/encode steps. > > By "encoded or decoded", do you mean "raw frames to be encoded (in the encoder > use case) or decoded raw frames (in the decoder use case)" ? I think this > should be clarified. 
> Actually it's a decoder-specific term, so changed both decoder and encoder documents to: DPB Decoded Picture Buffer; an H.264 term for a buffer that stores a decoded raw frame available for reference in further decoding steps. Does it sound better now? > > +EOS > > + end of stream > > + > > +IDR > > + a type of a keyframe in H.264-encoded stream, which clears the list of > > + earlier reference frames (DPBs) > > + > > +keyframe > > + an encoded frame that does not reference frames decoded earlier, i.e. > > + can be decoded fully on its own. > > + > > +OUTPUT > > + the source buffer queue; the queue of buffers containing encoded > > + bitstream; ``V4L2_BUF_TYPE_VIDEO_OUTPUT`` or > > + ``V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE``; the hardware is fed with data > > + from ``OUTPUT`` buffers > > + > > +PPS > > + Picture Parameter Set; a type of metadata entity in H.264 bitstream > > + > > +raw format > > + uncompressed format containing raw pixel data (e.g. YUV, RGB formats) > > + > > +resume point > > + a point in the bitstream from which decoding may start/continue, without > > + any previous state/data present, e.g.: a keyframe (VP8/VP9) or + > > SPS/PPS/IDR sequence (H.264); a resume point is required to start decode + > > of a new stream, or to resume decoding after a seek > > + > > +source > > + data fed to the decoder; ``OUTPUT`` > > + > > +SPS > > + Sequence Parameter Set; a type of metadata entity in H.264 bitstream > > + > > +visible height > > + height for given visible resolution; display height > > + > > +visible resolution > > + stream resolution of the visible picture, in pixels, to be used for > > + display purposes; must be smaller or equal to coded resolution; > > + display resolution > > + > > +visible width > > + width for given visible resolution; display width > > + > > +Querying capabilities > > +===================== > > + > > +1. 
To enumerate the set of coded formats supported by the driver, the > > + client may call :c:func:`VIDIOC_ENUM_FMT` on ``OUTPUT``. > > + > > + * The driver must always return the full set of supported formats, > > + irrespective of the format set on the ``CAPTURE``. > > + > > +2. To enumerate the set of supported raw formats, the client may call > > + :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE``. > > + > > + * The driver must return only the formats supported for the format > > + currently active on ``OUTPUT``. > > + > > + * In order to enumerate raw formats supported by a given coded format, > > + the client must first set that coded format on ``OUTPUT`` and then > > + enumerate the ``CAPTURE`` queue. > > Maybe s/enumerate the/enumerate formats on the/ ? > > > +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported > > + resolutions for a given format, passing desired pixel format in > > + :c:type:`v4l2_frmsizeenum` ``pixel_format``. > > + > > + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``OUTPUT`` > > + must include all possible coded resolutions supported by the decoder > > + for given coded pixel format. > > + > > + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``CAPTURE`` > > + must include all possible frame buffer resolutions supported by the > > + decoder for given raw pixel format and coded format currently set on > > + ``OUTPUT``. > > + > > + .. note:: > > + > > + The client may derive the supported resolution range for a > > + combination of coded and raw format by setting width and height of > > + ``OUTPUT`` format to 0 and calculating the intersection of > > + resolutions returned from calls to :c:func:`VIDIOC_ENUM_FRAMESIZES` > > + for the given coded and raw formats. > > I'm confused by the note, I'm not sure to understand what you mean. > I'm actually going to remove this. 
This special case of 0 width and height is not only ugly, but also wouldn't work with decoders that actually can do scaling, because the scaling ratio range is often constant, so the supported scaled frame sizes depend on the exact coded format. > > +4. Supported profiles and levels for given format, if applicable, may be > > + queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`. > > + > > +Initialization > > +============== > > + > > +1. *[optional]* Enumerate supported ``OUTPUT`` formats and resolutions. See > > + capability enumeration. > > + > > +2. Set the coded format on ``OUTPUT`` via :c:func:`VIDIOC_S_FMT` > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + ``pixelformat`` > > + a coded pixel format > > + > > + ``width``, ``height`` > > + required only if cannot be parsed from the stream for the given > > + coded format; optional otherwise - set to zero to ignore > > + > > + other fields > > + follow standard semantics > > + > > + * For coded formats including stream resolution information, if width > > + and height are set to non-zero values, the driver will propagate the > > + resolution to ``CAPTURE`` and signal a source change event > > + instantly. > > Maybe s/instantly/immediately before returning from :c:func:`VIDIOC_S_FMT`/ ? > > > However, after the decoder is done parsing the > > + information embedded in the stream, it will update ``CAPTURE`` > > s/update/update the/ > > > + format with new values and signal a source change event again, if > > s/, if/ if/ > > > + the values do not match. > > + > > + .. note:: > > + > > + Changing ``OUTPUT`` format may change currently set ``CAPTURE`` > > Do you have a particular dislike for definite articles ? :-) I would have > written "Changing the ``OUTPUT`` format may change the currently set > ``CAPTURE`` ...". 
I won't repeat the comment through the whole review, but > many places seem to be missing a definite article. Saving the^Wworld bandwidth one "the " at a time. ;) Hans also pointed some of those and I should have most of the missing ones added in my draft of v2. Thanks. > > > + format. The driver will derive a new ``CAPTURE`` format from > > + ``OUTPUT`` format being set, including resolution, colorimetry > > + parameters, etc. If the client needs a specific ``CAPTURE`` format, > > + it must adjust it afterwards. > > + > > +3. *[optional]* Get minimum number of buffers required for ``OUTPUT`` > > + queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to > > + use more buffers than minimum required by hardware/format. > > + > > + * **Required fields:** > > + > > + ``id`` > > + set to ``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT`` > > + > > + * **Return fields:** > > + > > + ``value`` > > + required number of ``OUTPUT`` buffers for the currently set > > + format > > s/required/required minimum/ I made it "the minimum number of [...] buffers required". > > > + > > +4. Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on > > + ``OUTPUT``. > > + > > + * **Required fields:** > > + > > + ``count`` > > + requested number of buffers to allocate; greater than zero > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + ``memory`` > > + follows standard semantics > > + > > + ``sizeimage`` > > + follows standard semantics; the client is free to choose any > > + suitable size, however, it may be subject to change by the > > + driver > > + > > + * **Return fields:** > > + > > + ``count`` > > + actual number of buffers allocated > > + > > + * The driver must adjust count to minimum of required number of > > + ``OUTPUT`` buffers for given format and count passed. > > Isn't it the maximum, not the minimum ? > It's actually neither. All we can generally say here is that the number will be adjusted and the client must note it. 
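Agreed that it is neither a pure minimum nor maximum. A small sketch of what the client can portably do may help; the helper names are made up, and the clamping policy shown is just one possible driver behavior:

```c
/* Sketch of the buffer-count negotiation discussed above: the client
 * asks for the reported minimum plus its desired pipeline depth, and
 * must use whatever count the driver actually writes back from
 * REQBUFS. Names are illustrative, not part of the V4L2 API. */

static unsigned int bufs_to_request(unsigned int min_bufs_ctrl,
				    unsigned int pipeline_depth)
{
	return min_bufs_ctrl + pipeline_depth;
}

/* One possible driver-side adjustment: clamp the requested count to
 * driver limits. All the client can rely on is the value written back
 * into reqbufs.count. */
static unsigned int model_reqbufs(unsigned int requested,
				  unsigned int drv_min,
				  unsigned int drv_max)
{
	if (requested < drv_min)
		return drv_min;
	if (requested > drv_max)
		return drv_max;
	return requested;
}
```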
> > The client must > > + check this value after the ioctl returns to get the number of > > + buffers allocated. > > + > > + .. note:: > > + > > + To allocate more than minimum number of buffers (for pipeline > > + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to > > + get minimum number of buffers required by the driver/format, > > + and pass the obtained value plus the number of additional > > + buffers needed in count to :c:func:`VIDIOC_REQBUFS`. > > + > > +5. Start streaming on ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`. > > + > > +6. This step only applies to coded formats that contain resolution > > + information in the stream. Continue queuing/dequeuing bitstream > > + buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and > > + :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning > > + each buffer to the client until required metadata to configure the > > + ``CAPTURE`` queue are found. This is indicated by the driver sending > > + a ``V4L2_EVENT_SOURCE_CHANGE`` event with > > + ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no > > + requirement to pass enough data for this to occur in the first buffer > > + and the driver must be able to process any number. > > + > > + * If data in a buffer that triggers the event is required to decode > > + the first frame, the driver must not return it to the client, > > + but must retain it for further decoding. > > + > > + * If the client set width and height of ``OUTPUT`` format to 0, calling > > + :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM, > > + until the driver configures ``CAPTURE`` format according to stream > > + metadata. > > That's a pretty harsh handling for this condition. What's the rationale for > returning -EPERM instead of for instance succeeding with width and height set > to 0 ? I don't like it, but the error condition must stay for compatibility reasons as that's what current drivers implement and applications expect. 
(Technically current drivers would return -EINVAL, but we concluded that existing applications don't care about the exact value, so we can change it to make more sense.) > > > + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and > > + the event is signaled, the decoding process will not continue until > > + it is acknowledged by either (re-)starting streaming on ``CAPTURE``, > > + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` > > + command. > > + > > + .. note:: > > + > > + No decoded frames are produced during this phase. > > + > > +7. This step only applies to coded formats that contain resolution > > + information in the stream. > > + Receive and handle ``V4L2_EVENT_SOURCE_CHANGE`` from the driver > > + via :c:func:`VIDIOC_DQEVENT`. The driver must send this event once > > + enough data is obtained from the stream to allocate ``CAPTURE`` > > + buffers and to begin producing decoded frames. > > Doesn't the last sentence belong to step 6 (where it's already explained to > some extent) ? > > > + > > + * **Required fields:** > > + > > + ``type`` > > + set to ``V4L2_EVENT_SOURCE_CHANGE`` > > Isn't the type field set by the driver ? > > > + * **Return fields:** > > + > > + ``u.src_change.changes`` > > + set to ``V4L2_EVENT_SRC_CH_RESOLUTION`` > > + > > + * Any client query issued after the driver queues the event must return > > + values applying to the just parsed stream, including queue formats, > > + selection rectangles and controls. > > To align with the wording used so far, I would say that "the driver must" > return values applying to the just parsed stream. > > I think I would also move this to step 6, as it's related to queuing the > event, not dequeuing it. As I've rephrased the whole document to be more userspace-oriented, this step is actually going away. Step 6 will have a note about driver behavior. > > > +8. 
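As a side note, the client-side handling of the dequeued event in steps 6-7 reduces to something like the following. The numeric constants are simplified stand-ins for the real values in <linux/videodev2.h>, and the action names are mine:

```c
/* Minimal model of acting on a dequeued event during initialization,
 * per steps 6-7 above. Constants are simplified stand-ins, not the
 * real <linux/videodev2.h> values. */

#define EV_SOURCE_CHANGE	5	/* stand-in for V4L2_EVENT_SOURCE_CHANGE */
#define EV_EOS			2	/* stand-in for V4L2_EVENT_EOS */
#define SRC_CH_RESOLUTION	1	/* stand-in for V4L2_EVENT_SRC_CH_RESOLUTION */

enum client_action {
	ACT_NONE,
	ACT_RECONFIGURE_CAPTURE,	/* run G_FMT and the CAPTURE setup steps */
	ACT_DRAIN_DONE,
};

/* Decide what to do with an event dequeued via VIDIOC_DQEVENT. */
static enum client_action handle_event(unsigned int type,
				       unsigned int changes)
{
	if (type == EV_SOURCE_CHANGE && (changes & SRC_CH_RESOLUTION))
		return ACT_RECONFIGURE_CAPTURE;
	if (type == EV_EOS)
		return ACT_DRAIN_DONE;
	return ACT_NONE;
}
```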
Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for the > > + destination buffers parsed/decoded from the bitstream. > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > + > > + * **Return fields:** > > + > > + ``width``, ``height`` > > + frame buffer resolution for the decoded frames > > + > > + ``pixelformat`` > > + pixel format for decoded frames > > + > > + ``num_planes`` (for _MPLANE ``type`` only) > > + number of planes for pixelformat > > + > > + ``sizeimage``, ``bytesperline`` > > + as per standard semantics; matching frame buffer format > > + > > + .. note:: > > + > > + The value of ``pixelformat`` may be any pixel format supported and > > + must be supported for current stream, based on the information > > + parsed from the stream and hardware capabilities. It is suggested > > + that driver chooses the preferred/optimal format for given > > In compliance with RFC 2119, how about using "Drivers should choose" instead > of "It is suggested that driver chooses" ? The whole paragraph became: The value of ``pixelformat`` may be any pixel format supported by the decoder for the current stream. It is expected that the decoder chooses a preferred/optimal format for the default configuration. For example, a YUV format may be preferred over an RGB format, if additional conversion step would be required. > > > + configuration. For example, a YUV format may be preferred over an > > + RGB format, if additional conversion step would be required. > > + > > +9. *[optional]* Enumerate ``CAPTURE`` formats via > > + :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE`` queue. Once the stream > > + information is parsed and known, the client may use this ioctl to > > + discover which raw formats are supported for given stream and select on > > s/select on/select one/ Done. > > > + of them via :c:func:`VIDIOC_S_FMT`. > > + > > + .. 
note:: > > + > > + The driver will return only formats supported for the current stream > > + parsed in this initialization sequence, even if more formats may be > > + supported by the driver in general. > > + > > + For example, a driver/hardware may support YUV and RGB formats for > > + resolutions 1920x1088 and lower, but only YUV for higher > > + resolutions (due to hardware limitations). After parsing > > + a resolution of 1920x1088 or lower, :c:func:`VIDIOC_ENUM_FMT` may > > + return a set of YUV and RGB pixel formats, but after parsing > > + resolution higher than 1920x1088, the driver will not return RGB, > > + unsupported for this resolution. > > + > > + However, subsequent resolution change event triggered after > > + discovering a resolution change within the same stream may switch > > + the stream into a lower resolution and :c:func:`VIDIOC_ENUM_FMT` > > + would return RGB formats again in that case. > > + > > +10. *[optional]* Choose a different ``CAPTURE`` format than suggested via > > + :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for the > > + client to choose a different format than selected/suggested by the > > And here, "A client may choose" ? > > > + driver in :c:func:`VIDIOC_G_FMT`. > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > + > > + ``pixelformat`` > > + a raw pixel format > > + > > + .. note:: > > + > > + Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently available > > + formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful to > > + find out a set of allowed formats for given configuration, but not > > + required, if the client can accept the defaults. > > s/required/required,/ That would become "[...]but not required,, if the client[...]". Is that your suggestion? ;) > > > + > > +11. *[optional]* Acquire visible resolution via > > + :c:func:`VIDIOC_G_SELECTION`. 
> > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > + > > + ``target`` > > + set to ``V4L2_SEL_TGT_COMPOSE`` > > + > > + * **Return fields:** > > + > > + ``r.left``, ``r.top``, ``r.width``, ``r.height`` > > + visible rectangle; this must fit within frame buffer resolution > > + returned by :c:func:`VIDIOC_G_FMT`. > > + > > + * The driver must expose following selection targets on ``CAPTURE``: > > + > > + ``V4L2_SEL_TGT_CROP_BOUNDS`` > > + corresponds to coded resolution of the stream > > + > > + ``V4L2_SEL_TGT_CROP_DEFAULT`` > > + a rectangle covering the part of the frame buffer that contains > > + meaningful picture data (visible area); width and height will be > > + equal to visible resolution of the stream > > + > > + ``V4L2_SEL_TGT_CROP`` > > + rectangle within coded resolution to be output to ``CAPTURE``; > > + defaults to ``V4L2_SEL_TGT_CROP_DEFAULT``; read-only on hardware > > + without additional compose/scaling capabilities > > + > > + ``V4L2_SEL_TGT_COMPOSE_BOUNDS`` > > + maximum rectangle within ``CAPTURE`` buffer, which the cropped > > + frame can be output into; equal to ``V4L2_SEL_TGT_CROP``, if the > > + hardware does not support compose/scaling > > + > > + ``V4L2_SEL_TGT_COMPOSE_DEFAULT`` > > + equal to ``V4L2_SEL_TGT_CROP`` > > + > > + ``V4L2_SEL_TGT_COMPOSE`` > > + rectangle inside ``OUTPUT`` buffer into which the cropped frame > > s/OUTPUT/CAPTURE/ ? > > > + is output; defaults to ``V4L2_SEL_TGT_COMPOSE_DEFAULT``; > > and "is captured" or "is written" ? > > > + read-only on hardware without additional compose/scaling > > + capabilities > > + > > + ``V4L2_SEL_TGT_COMPOSE_PADDED`` > > + rectangle inside ``OUTPUT`` buffer which is overwritten by the > > Here too ? > > > + hardware; equal to ``V4L2_SEL_TGT_COMPOSE``, if the hardware > > s/, if/ if/ Ack +3 > > > + does not write padding pixels > > + > > +12. 
*[optional]* Get minimum number of buffers required for ``CAPTURE`` > > + queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to > > + use more buffers than minimum required by hardware/format. > > + > > + * **Required fields:** > > + > > + ``id`` > > + set to ``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE`` > > + > > + * **Return fields:** > > + > > + ``value`` > > + minimum number of buffers required to decode the stream parsed in > > + this initialization sequence. > > + > > + .. note:: > > + > > + Note that the minimum number of buffers must be at least the number > > + required to successfully decode the current stream. This may for > > + example be the required DPB size for an H.264 stream given the > > + parsed stream configuration (resolution, level). > > + > > +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` > > + on the ``CAPTURE`` queue. > > + > > + * **Required fields:** > > + > > + ``count`` > > + requested number of buffers to allocate; greater than zero > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > + > > + ``memory`` > > + follows standard semantics > > + > > + * **Return fields:** > > + > > + ``count`` > > + adjusted to allocated number of buffers > > + > > + * The driver must adjust count to minimum of required number of > > s/minimum/maximum/ ? > > Should we also mention that if count > minimum, the driver may additionally > limit the number of buffers based on internal limits (such as maximum memory > consumption) ? I made it less specific: * The count will be adjusted by the decoder to match the stream and hardware requirements. The client must check the final value after the ioctl returns to get the number of buffers allocated. > > > + destination buffers for given format and stream configuration and the > > + count passed. The client must check this value after the ioctl > > + returns to get the number of buffers allocated. > > + > > + ..
note:: > > + > > + To allocate more than minimum number of buffers (for pipeline > > + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to > > + get minimum number of buffers required, and pass the obtained value > > + plus the number of additional buffers needed in count to > > + :c:func:`VIDIOC_REQBUFS`. > > + > > +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. > > + > > +Decoding > > +======== > > + > > +This state is reached after a successful initialization sequence. In this > > +state, client queues and dequeues buffers to both queues via > > +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard > > +semantics. > > + > > +Both queues operate independently, following standard behavior of V4L2 > > +buffer queues and memory-to-memory devices. In addition, the order of > > +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of > > +queuing coded frames to ``OUTPUT`` queue, due to properties of selected > > +coded format, e.g. frame reordering. The client must not assume any direct > > +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than > > +reported by :c:type:`v4l2_buffer` ``timestamp`` field. > > + > > +The contents of source ``OUTPUT`` buffers depend on active coded pixel > > +format and might be affected by codec-specific extended controls, as stated > > s/might/may/ > > > +in documentation of each format individually. > > + > > +The client must not assume any direct relationship between ``CAPTURE`` > > +and ``OUTPUT`` buffers and any specific timing of buffers becoming > > +available to dequeue. Specifically: > > + > > +* a buffer queued to ``OUTPUT`` may result in no buffers being produced > > + on ``CAPTURE`` (e.g. 
if it does not contain encoded data, or if only > > + metadata syntax structures are present in it), > > + > > +* a buffer queued to ``OUTPUT`` may result in more than 1 buffer produced > > + on ``CAPTURE`` (if the encoded data contained more than one frame, or if > > + returning a decoded frame allowed the driver to return a frame that > > + preceded it in decode, but succeeded it in display order), > > + > > +* a buffer queued to ``OUTPUT`` may result in a buffer being produced on > > + ``CAPTURE`` later into decode process, and/or after processing further > > + ``OUTPUT`` buffers, or be returned out of order, e.g. if display > > + reordering is used, > > + > > +* buffers may become available on the ``CAPTURE`` queue without additional > > s/buffers/Buffers/ > I don't think the items should be capitalized here. > > + buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of > > + ``OUTPUT`` buffers being queued in the past and decoding result of which > > + being available only at later time, due to specifics of the decoding > > + process. > > I understand what you mean, but the wording is weird to my eyes. How about > > * Buffers may become available on the ``CAPTURE`` queue without additional > buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of > ``OUTPUT`` buffers queued in the past whose decoding results are only > available at later time, due to specifics of the decoding process. Done, thanks. > > > +Seek > > +==== > > + > > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of > > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected. > > I assume that a seek may result in a source resolution change event, in which > case the capture queue will be affected. How about stating here that > controlling seek doesn't require any specific operation on the capture queue, > but that the capture queue may be affected as per normal decoder operation ? > We may also want to mention the event as an example. 
Done. I've also added a general section about decoder-initiated sequences in the Decoding section. > > > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via > > + :c:func:`VIDIOC_STREAMOFF`. > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + * The driver must drop all the pending ``OUTPUT`` buffers and they are > > + treated as returned to the client (following standard semantics). > > + > > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON` > > + > > + * **Required fields:** > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > + > > + * The driver must be put in a state after seek and be ready to > > What do you mean by "a state after seek" ? > * The decoder will start accepting new source bitstream buffers after the call returns. > > + accept new source bitstream buffers. > > + > > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after > > + the seek until a suitable resume point is found. > > + > > + .. note:: > > + > > + There is no requirement to begin queuing stream starting exactly from > > s/stream/buffers/ ? Perhaps "stream data"? The buffers don't have a resume point, the stream does. > > > + a resume point (e.g. SPS or a keyframe). The driver must handle any > > + data queued and must keep processing the queued buffers until it > > + finds a suitable resume point. While looking for a resume point, the > > + driver processes ``OUTPUT`` buffers and returns them to the client > > + without producing any decoded frames. > > + > > + For hardware known to be mishandling seeks to a non-resume point, > > + e.g. by returning corrupted decoded frames, the driver must be able > > + to handle such seeks without a crash or any fatal decode error. > > This should be true for any hardware, there should never be any crash or fatal > decode error.
I'd write it as > > Some hardware is known to mishandle seeks to a non-resume point. Such an > operation may result in an unspecified number of corrupted decoded frames > being made available on ``CAPTURE``. Drivers must ensure that no fatal > decoding errors or crashes occur, and implement any necessary handling and > work-arounds for hardware issues related to seek operations. > Done. > > +4. After a resume point is found, the driver will start returning > > + ``CAPTURE`` buffers with decoded frames. > > + > > + * There is no precise specification for ``CAPTURE`` queue of when it > > + will start producing buffers containing decoded data from buffers > > + queued after the seek, as it operates independently > > + from ``OUTPUT`` queue. > > + > > + * The driver is allowed to and may return a number of remaining > > s/is allowed to and may/may/ > > > + ``CAPTURE`` buffers containing decoded frames from before the seek > > + after the seek sequence (STREAMOFF-STREAMON) is performed. > > Shouldn't all these buffers be returned when STREAMOFF is called on the OUTPUT > side ? The queues are independent, so STREAMOFF on OUTPUT would only return the OUTPUT buffers. That's why there is the note suggesting that the application may also stop streaming on CAPTURE to avoid stale frames being returned. > > > + * The driver is also allowed to and may not return all decoded frames > > s/is also allowed to and may not return/may also not return/ > > > + queued but not decode before the seek sequence was initiated. For > > s/not decode/not decoded/ > > > + example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B), > > + STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the > > + following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’, > > + H’}, {A’, G’, H’}, {G’, H’}. > > Related to the previous point, shouldn't this be moved to step 1 ? I've made it a general warning after the whole sequence. > > > + .. 
note:: > > + > > + To achieve instantaneous seek, the client may restart streaming on > > + ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers. > > + > > +Pause > > +===== > > + > > +In order to pause, the client should just cease queuing buffers onto the > > +``OUTPUT`` queue. This is different from the general V4L2 API definition of > > +pause, which involves calling :c:func:`VIDIOC_STREAMOFF` on the queue. > > +Without source bitstream data, there is no data to process and the > > hardware +remains idle. > > + > > +Conversely, using :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue indicates > > +a seek, which > > + > > +1. drops all ``OUTPUT`` buffers in flight and > > +2. after a subsequent :c:func:`VIDIOC_STREAMON`, will look for and only > > + continue from a resume point. > > + > > +This is usually undesirable for pause. The STREAMOFF-STREAMON sequence is > > +intended for seeking. > > + > > +Similarly, ``CAPTURE`` queue should remain streaming as well, as the > > +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer > > +sets. > > And also to drop decoded buffers for instant seek ? > I've dropped the Pause section completely. It doesn't provide any useful information IMHO and only duplicates the general semantics of mem2mem devices. > > +Dynamic resolution change > > +========================= > > + > > +A video decoder implementing this interface must support dynamic resolution > > +change, for streams, which include resolution metadata in the bitstream. > > s/for streams, which/for streams that/ > > > +When the decoder encounters a resolution change in the stream, the dynamic > > +resolution change sequence is started. > > + > > +1. After encountering a resolution change in the stream, the driver must > > + first process and decode all remaining buffers from before the > > + resolution change point. > > + > > +2.
After all buffers containing decoded frames from before the resolution > > + change point are ready to be dequeued on the ``CAPTURE`` queue, the > > + driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change > > + type ``V4L2_EVENT_SRC_CH_RESOLUTION``. > > + > > + * The last buffer from before the change must be marked with > > + :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the > > + drain sequence. The last buffer might be empty (with > > + :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by the > > + client, since it does not contain any decoded frame. > > + > > + * Any client query issued after the driver queues the event must return > > + values applying to the stream after the resolution change, including > > + queue formats, selection rectangles and controls. > > + > > + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and > > + the event is signaled, the decoding process will not continue until > > + it is acknowledged by either (re-)starting streaming on ``CAPTURE``, > > + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` > > + command. > > This usage of V4L2_DEC_CMD_START isn't aligned with the documentation of the > command. I'm not opposed to this, but I think the use cases of decoder > commands for codecs should be explained in the VIDIOC_DECODER_CMD > documentation. What bothers me in particular is usage of V4L2_DEC_CMD_START to > restart the decoder, while no V4L2_DEC_CMD_STOP has been issued. Should we add > a section that details the decoder state machine with the implicit and > explicit ways in which it is started and stopped ? Yes, we should probably extend the VIDIOC_DECODER_CMD documentation. As for diagrams, they would indeed be nice to have, but maybe we could add them in a follow-up patch? > > I would also reference step 7 here. > > > + ..
note:: > > + > > + Any attempts to dequeue more buffers beyond the buffer marked > > + with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from > > + :c:func:`VIDIOC_DQBUF`. > > + > > +3. The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the new > > + format information. This is identical to calling :c:func:`VIDIOC_G_FMT` > > + after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in the initialization sequence > > + and should be handled similarly. > > As the source resolution change event is mentioned in multiple places, how > about extracting the related ioctls sequence to a specific section, and > referencing it where needed (at least from the initialization sequence and > here) ? I made the text here refer to the Initialization sequence. > > > + .. note:: > > + > > + It is allowed for the driver not to support the same pixel format as > > "Drivers may not support ..." > > > + previously used (before the resolution change) for the new > > + resolution. The driver must select a default supported pixel format, > > + return it, if queried using :c:func:`VIDIOC_G_FMT`, and the client > > + must take note of it. > > + > > +4. The client acquires visible resolution as in initialization sequence. > > + > > +5. *[optional]* The client is allowed to enumerate available formats and > > s/is allowed to/may/ > > > + select a different one than currently chosen (returned via > > + :c:func:`VIDIOC_G_FMT)`. This is identical to a corresponding step in > > + the initialization sequence. > > + > > +6. *[optional]* The client acquires minimum number of buffers as in > > + initialization sequence. > > + > > +7. 
If all the following conditions are met, the client may resume the > > + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with > > + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain > > + sequence: > > + > > + * ``sizeimage`` of new format is less than or equal to the size of > > + currently allocated buffers, > > + > > + * the number of buffers currently allocated is greater than or equal to > > + the minimum number of buffers acquired in step 6. > > + > > + In such case, the remaining steps do not apply. > > + > > + However, if the client intends to change the buffer set, to lower > > + memory usage or for any other reasons, it may be achieved by following > > + the steps below. > > + > > +8. After dequeuing all remaining buffers from the ``CAPTURE`` queue, > > This is optional, isn't it ? > I wouldn't call it optional, since it depends on what the client does and what the decoder supports. That's why the point above just states that the remaining steps do not apply. Also added a note: To fulfill those requirements, the client may attempt to use :c:func:`VIDIOC_CREATE_BUFS` to add more buffers. However, due to hardware limitations, the decoder may not support adding buffers at this point and the client must be able to handle a failure using the steps below. > > the > > + client must call :c:func:`VIDIOC_STREAMOFF` on the ``CAPTURE`` queue. > > + The ``OUTPUT`` queue must remain streaming (calling STREAMOFF on it > > :c:func:`VIDIOC_STREAMOFF` > > > + would trigger a seek). > > + > > +9. The client frees the buffers on the ``CAPTURE`` queue using > > + :c:func:`VIDIOC_REQBUFS`. > > + > > + * **Required fields:** > > + > > + ``count`` > > + set to 0 > > + > > + ``type`` > > + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > + > > + ``memory`` > > + follows standard semantics > > + > > +10. The client allocates a new set of buffers for the ``CAPTURE`` queue via > > + :c:func:`VIDIOC_REQBUFS`. 
This is identical to a corresponding step in > > + the initialization sequence. > > + > > +11. The client resumes decoding by issuing :c:func:`VIDIOC_STREAMON` on the > > + ``CAPTURE`` queue. > > + > > +During the resolution change sequence, the ``OUTPUT`` queue must remain > > +streaming. Calling :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue would > > +initiate a seek. > > + > > +The ``OUTPUT`` queue operates separately from the ``CAPTURE`` queue for the > > +duration of the entire resolution change sequence. It is allowed (and > > +recommended for best performance and simplicity) for the client to keep > > "The client should (for best performance and simplicity) keep ..." > > > +queuing/dequeuing buffers from/to ``OUTPUT`` queue even while processing > > s/from\to/to\/from/ > > > +this sequence. > > + > > +.. note:: > > + > > + It is also possible for this sequence to be triggered without a change > > "This sequence may be triggered ..." > > > + in coded resolution, if a different number of ``CAPTURE`` buffers is > > + required in order to continue decoding the stream or the visible > > + resolution changes. > > + > > +Drain > > +===== > > + > > +To ensure that all queued ``OUTPUT`` buffers have been processed and > > +related ``CAPTURE`` buffers output to the client, the following drain > > +sequence may be followed. After the drain sequence is complete, the client > > +has received all decoded frames for all ``OUTPUT`` buffers queued before > > +the sequence was started. > > + > > +1. Begin drain by issuing :c:func:`VIDIOC_DECODER_CMD`. > > + > > + * **Required fields:** > > + > > + ``cmd`` > > + set to ``V4L2_DEC_CMD_STOP`` > > + > > + ``flags`` > > + set to 0 > > + > > + ``pts`` > > + set to 0 > > + > > +2. The driver must process and decode as normal all ``OUTPUT`` buffers > > + queued by the client before the :c:func:`VIDIOC_DECODER_CMD` was issued. 
> > + Any operations triggered as a result of processing these buffers > > + (including the initialization and resolution change sequences) must be > > + processed as normal by both the driver and the client before proceeding > > + with the drain sequence. > > + > > +3. Once all ``OUTPUT`` buffers queued before ``V4L2_DEC_CMD_STOP`` are > > + processed: > > + > > + * If the ``CAPTURE`` queue is streaming, once all decoded frames (if > > + any) are ready to be dequeued on the ``CAPTURE`` queue, the driver > > + must send a ``V4L2_EVENT_EOS``. > > s/\./event./ > > Is the event sent on the OUTPUT or CAPTURE queue ? I assume the latter, should > it be explicitly documented ? > AFAICS, there is no queue type indication in the v4l2_event struct. In any case, I've removed this event, because existing drivers don't implement it for the drain sequence and it also makes it more consistent, since events would be only signaled for decoder-initiated sequences. It would also allow distinguishing between an EOS mark in the stream (event signaled) or end of a drain sequence (no event). > > The driver must also set > > + ``V4L2_BUF_FLAG_LAST`` in :c:type:`v4l2_buffer` ``flags`` field on the > > + buffer on the ``CAPTURE`` queue containing the last frame (if any) > > + produced as a result of processing the ``OUTPUT`` buffers queued > > + before ``V4L2_DEC_CMD_STOP``. If no more frames are left to be > > + returned at the point of handling ``V4L2_DEC_CMD_STOP``, the driver > > + must return an empty buffer (with :c:type:`v4l2_buffer` > > + ``bytesused`` = 0) as the last buffer with ``V4L2_BUF_FLAG_LAST`` set > > + instead. Any attempts to dequeue more buffers beyond the buffer marked > > + with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from > > + :c:func:`VIDIOC_DQBUF`. 
> > + > > + * If the ``CAPTURE`` queue is NOT streaming, no action is necessary for > > + ``CAPTURE`` queue and the driver must send a ``V4L2_EVENT_EOS`` > > + immediately after all ``OUTPUT`` buffers in question have been > > + processed. > > What is the use case for this ? Can't we just return an error if decoder isn't > streaming ? Actually this is wrong. We want the queued OUTPUT buffers to be processed and decoded, so if the CAPTURE queue is not yet set up (initialization sequence not completed yet), handling the initialization sequence first will be needed as a part of the drain sequence. I've updated the document with that. > > +4. At this point, decoding is paused and the driver will accept, but not > > + process any newly queued ``OUTPUT`` buffers until the client issues > > + ``V4L2_DEC_CMD_START`` or restarts streaming on any queue. > > + > > +* Once the drain sequence is initiated, the client needs to drive it to > > + completion, as described by the above steps, unless it aborts the process > > + by issuing :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue. The client > > + is not allowed to issue ``V4L2_DEC_CMD_START`` or ``V4L2_DEC_CMD_STOP`` > > + again while the drain sequence is in progress and they will fail with > > + -EBUSY error code if attempted. > > While this seems OK to me, I think drivers will need help to implement all the > corner cases correctly without race conditions. We went through the possible list of corner cases and concluded that there is no use in handling them, especially considering how much they would complicate both the userspace and the drivers. Not to mention some hardware, like s5p-mfc, which actually has a dedicated flush operation that needs to complete before the decoder can switch back to normal mode.
> > + Restarting ``CAPTURE`` queue will not affect an in-progress drain > > + sequence. > > + > > +* The drivers must also implement :c:func:`VIDIOC_TRY_DECODER_CMD`, as a > > + way to let the client query the availability of decoder commands. > > + > > +End of stream > > +============= > > + > > +If the decoder encounters an end of stream marking in the stream, the > > +driver must send a ``V4L2_EVENT_EOS`` event > > On which queue ? > Hmm? > > to the client after all frames > > +are decoded and ready to be dequeued on the ``CAPTURE`` queue, with the > > +:c:type:`v4l2_buffer` ``flags`` set to ``V4L2_BUF_FLAG_LAST``. This > > +behavior is identical to the drain sequence triggered by the client via > > +``V4L2_DEC_CMD_STOP``. > > + > > +Commit points > > +============= > > + > > +Setting formats and allocating buffers triggers changes in the behavior > > s/triggers/trigger/ > > > +of the driver. > > + > > +1. Setting format on ``OUTPUT`` queue may change the set of formats > > + supported/advertised on the ``CAPTURE`` queue. In particular, it also > > + means that ``CAPTURE`` format may be reset and the client must not > > + rely on the previously set format being preserved. > > + > > +2. Enumerating formats on ``CAPTURE`` queue must only return formats > > + supported for the ``OUTPUT`` format currently set. > > + > > +3. Setting/changing format on ``CAPTURE`` queue does not change formats > > Why not just "Setting format" ? > > > + available on ``OUTPUT`` queue. An attempt to set ``CAPTURE`` format that > > + is not supported for the currently selected ``OUTPUT`` format must > > + result in the driver adjusting the requested format to an acceptable > > + one. > > + > > +4. Enumerating formats on ``OUTPUT`` queue always returns the full set of > > + supported coded formats, irrespective of the current ``CAPTURE`` > > + format. > > + > > +5. After allocating buffers on the ``OUTPUT`` queue, it is not possible to > > + change format on it. 
> > I'd phrase this as > > "While buffers are allocated on the ``OUTPUT`` queue, clients must not change > the format on the queue. Drivers must return <error code> for any such format > change attempt." Done, thanks. Best regards, Tomasz
Hi Tomasz, I've stripped out all the parts on which I have no specific comment or just agree with your proposal. Please see below for a few additional remarks. On Thursday, 18 October 2018 13:03:33 EEST Tomasz Figa wrote: > On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart wrote: > > On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote: > >> Due to complexity of the video decoding process, the V4L2 drivers of > >> stateful decoder hardware require specific sequences of V4L2 API calls > >> to be followed. These include capability enumeration, initialization, > >> decoding, seek, pause, dynamic resolution change, drain and end of > >> stream. > >> > >> Specifics of the above have been discussed during Media Workshops at > >> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux > >> Conference Europe 2014 in Düsseldorf. The de facto Codec API that > >> originated at those events was later implemented by the drivers we > >> already have merged in mainline, such as s5p-mfc or coda. > >> > >> The only thing missing was the real specification included as a part of > >> Linux Media documentation. Fix it now and document the decoder part of > >> the Codec API. > >> > >> Signed-off-by: Tomasz Figa <tfiga@chromium.org> > >> --- > >> > >> Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++ > >> Documentation/media/uapi/v4l/devices.rst | 1 + > >> Documentation/media/uapi/v4l/v4l2.rst | 10 +- > >> 3 files changed, 882 insertions(+), 1 deletion(-) > >> create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst > >> > >> diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst > >> b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644 > >> index 000000000000..f55d34d2f860 > >> --- /dev/null > >> +++ b/Documentation/media/uapi/v4l/dev-decoder.rst > >> @@ -0,0 +1,872 @@ [snip] > >> +4. Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on > >> + ``OUTPUT``. 
> >> + > >> + * **Required fields:** > >> + > >> + ``count`` > >> + requested number of buffers to allocate; greater than zero > >> + > >> + ``type`` > >> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > >> + > >> + ``memory`` > >> + follows standard semantics > >> + > >> + ``sizeimage`` > >> + follows standard semantics; the client is free to choose any > >> + suitable size, however, it may be subject to change by the > >> + driver > >> + > >> + * **Return fields:** > >> + > >> + ``count`` > >> + actual number of buffers allocated > >> + > >> + * The driver must adjust count to minimum of required number of > >> + ``OUTPUT`` buffers for given format and count passed. > > > > Isn't it the maximum, not the minimum ? > > It's actually neither. All we can generally say here is that the > number will be adjusted and the client must note it. I expect it to be clamp(requested count, driver minimum, driver maximum). I'm not sure it's worth capturing this in the document though, but we could say "The driver must clamp count to the minimum and maximum number of required ``OUTPUT`` buffers for the given format."
The driver must keep processing and > >> returning > >> + each buffer to the client until required metadata to configure the > >> + ``CAPTURE`` queue are found. This is indicated by the driver > >> sending > >> + a ``V4L2_EVENT_SOURCE_CHANGE`` event with > >> + ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no > >> + requirement to pass enough data for this to occur in the first > >> buffer > >> + and the driver must be able to process any number. > >> + > >> + * If data in a buffer that triggers the event is required to decode > >> + the first frame, the driver must not return it to the client, > >> + but must retain it for further decoding. > >> + > >> + * If the client set width and height of ``OUTPUT`` format to 0, > >> calling > >> + :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return > >> -EPERM, > >> + until the driver configures ``CAPTURE`` format according to stream > >> + metadata. > > > > That's a pretty harsh handling for this condition. What's the rationale > > for returning -EPERM instead of for instance succeeding with width and > > height set to 0 ? > > I don't like it, but the error condition must stay for compatibility > reasons as that's what current drivers implement and applications > expect. (Technically current drivers would return -EINVAL, but we > concluded that existing applications don't care about the exact value, > so we can change it to make more sense.) Fair enough :-/ A bit of a shame though. Should we try to use an error code that would have less chance of being confused with an actual permission problem ? -EILSEQ could be an option for "illegal sequence" of operations, but better options could exist. 
> >> + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events > >> and > >> + the event is signaled, the decoding process will not continue > >> until > >> + it is acknowledged by either (re-)starting streaming on > >> ``CAPTURE``, > >> + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` > >> + command. > >> + > >> + .. note:: > >> + > >> + No decoded frames are produced during this phase. > >> + [snip] > >> +8. Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for > >> the + destination buffers parsed/decoded from the bitstream. > >> + > >> + * **Required fields:** > >> + > >> + ``type`` > >> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > >> + > >> + * **Return fields:** > >> + > >> + ``width``, ``height`` > >> + frame buffer resolution for the decoded frames > >> + > >> + ``pixelformat`` > >> + pixel format for decoded frames > >> + > >> + ``num_planes`` (for _MPLANE ``type`` only) > >> + number of planes for pixelformat > >> + > >> + ``sizeimage``, ``bytesperline`` > >> + as per standard semantics; matching frame buffer format > >> + > >> + .. note:: > >> + > >> + The value of ``pixelformat`` may be any pixel format supported > >> and > >> + must be supported for current stream, based on the information > >> + parsed from the stream and hardware capabilities. It is > >> suggested > >> + that driver chooses the preferred/optimal format for given > > > > In compliance with RFC 2119, how about using "Drivers should choose" > > instead of "It is suggested that driver chooses" ? > > The whole paragraph became: > > The value of ``pixelformat`` may be any pixel format supported by the > decoder for the current stream. It is expected that the decoder chooses a > preferred/optimal format for the default configuration. For example, a YUV > format may be preferred over an RGB format, if additional conversion step > would be required. How about using "should" instead of "it is expected that" ? [snip] > >> +10. 
*[optional]* Choose a different ``CAPTURE`` format than suggested > >> via > >> + :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for > >> the > >> + client to choose a different format than selected/suggested by the > > > > And here, "A client may choose" ? > > > >> + driver in :c:func:`VIDIOC_G_FMT`. > >> + > >> + * **Required fields:** > >> + > >> + ``type`` > >> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > >> + > >> + ``pixelformat`` > >> + a raw pixel format > >> + > >> + .. note:: > >> + > >> + Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently > >> available > >> + formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful > >> to > >> + find out a set of allowed formats for given configuration, but > >> not > >> + required, if the client can accept the defaults. > > > > s/required/required,/ > > That would become "[...]but not required,, if the client[...]". Is > that your suggestion? ;) Oops, the other way around of course :-) [snip] > >> +3. Start queuing buffers to ``OUTPUT`` queue containing stream data > >> after > >> + the seek until a suitable resume point is found. > >> + > >> + .. note:: > >> + > >> + There is no requirement to begin queuing stream starting exactly > >> from > > > > s/stream/buffers/ ? > > Perhaps "stream data"? The buffers don't have a resume point, the stream > does. Maybe "coded data" ? > >> + a resume point (e.g. SPS or a keyframe). The driver must handle > >> any > >> + data queued and must keep processing the queued buffers until it > >> + finds a suitable resume point. While looking for a resume point, > >> the > >> + driver processes ``OUTPUT`` buffers and returns them to the > >> client > >> + without producing any decoded frames. > >> + > >> + For hardware known to be mishandling seeks to a non-resume point, > >> + e.g. by returning corrupted decoded frames, the driver must be > >> able > >> + to handle such seeks without a crash or any fatal decode error. 
> > > > This should be true for any hardware, there should never be any crash or > > fatal decode error. I'd write it as > > > > Some hardware is known to mishandle seeks to a non-resume point. Such an > > operation may result in an unspecified number of corrupted decoded frames > > being made available on ``CAPTURE``. Drivers must ensure that no fatal > > decoding errors or crashes occur, and implement any necessary handling and > > work-arounds for hardware issues related to seek operations. > > Done. [snip] > >> +2. After all buffers containing decoded frames from before the > >> resolution > >> + change point are ready to be dequeued on the ``CAPTURE`` queue, the > >> + driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change > >> + type ``V4L2_EVENT_SRC_CH_RESOLUTION``. > >> + > >> + * The last buffer from before the change must be marked with > >> + :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in > >> the + drain sequence. The last buffer might be empty (with > >> + :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by > >> the > >> + client, since it does not contain any decoded frame. > >> + > >> + * Any client query issued after the driver queues the event must > >> return > >> + values applying to the stream after the resolution change, > >> including > >> + queue formats, selection rectangles and controls. > >> + > >> + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events > >> and > >> + the event is signaled, the decoding process will not continue > >> until > >> + it is acknowledged by either (re-)starting streaming on > >> ``CAPTURE``, > >> + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` > >> + command. > > > > This usage of V4L2_DEC_CMD_START isn't aligned with the documentation of > > the command. I'm not opposed to this, but I think the use cases of > > decoder commands for codecs should be explained in the VIDIOC_DECODER_CMD > > documentation. 
What bothers me in particular is usage of > > V4L2_DEC_CMD_START to restart the decoder, while no V4L2_DEC_CMD_STOP has > > been issued. Should we add a section that details the decoder state > > machine with the implicit and explicit ways in which it is started and > > stopped ? > > Yes, we should probably extend the VIDIOC_DECODER_CMD documentation. > > As for diagrams, they would indeed be nice to have, but maybe we could > add them in a follow up patch? That's another way to say it won't happen, right ? ;-) I'm OK with that, but I think we should still clarify that the source change generates an implicit V4L2_DEC_CMD_STOP. > > I would also reference step 7 here. > > > >> + .. note:: > >> + > >> + Any attempts to dequeue more buffers beyond the buffer marked > >> + with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from > >> + :c:func:`VIDIOC_DQBUF`. > >> + > >> +3. The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the > >> new > >> + format information. This is identical to calling > >> :c:func:`VIDIOC_G_FMT` + after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in > >> the initialization sequence + and should be handled similarly. > > > > As the source resolution change event is mentioned in multiple places, how > > about extracting the related ioctls sequence to a specific section, and > > referencing it where needed (at least from the initialization sequence and > > here) ? > > I made the text here refer to the Initialization sequence. Wouldn't it be clearer if those steps were extracted to a standalone sequence referenced from both locations ? > >> + .. note:: > >> + > >> + It is allowed for the driver not to support the same pixel > >> format as > > > > "Drivers may not support ..." > > > >> + previously used (before the resolution change) for the new > >> + resolution. The driver must select a default supported pixel > >> format, > >> + return it, if queried using :c:func:`VIDIOC_G_FMT`, and the > >> client > >> + must take note of it. 
> >> + > >> +4. The client acquires visible resolution as in initialization > >> sequence. > >> + > >> +5. *[optional]* The client is allowed to enumerate available formats > >> and > > > > s/is allowed to/may/ > > > >> + select a different one than currently chosen (returned via > >> + :c:func:`VIDIOC_G_FMT)`. This is identical to a corresponding step > >> in > >> + the initialization sequence. > >> + > >> +6. *[optional]* The client acquires minimum number of buffers as in > >> + initialization sequence. > >> + > >> +7. If all the following conditions are met, the client may resume the > >> + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with > >> + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the > >> drain > >> + sequence: > >> + > >> + * ``sizeimage`` of new format is less than or equal to the size of > >> + currently allocated buffers, > >> + > >> + * the number of buffers currently allocated is greater than or > >> equal to > >> + the minimum number of buffers acquired in step 6. > >> + > >> + In such case, the remaining steps do not apply. > >> + > >> + However, if the client intends to change the buffer set, to lower > >> + memory usage or for any other reasons, it may be achieved by > >> following > >> + the steps below. > >> + > >> +8. After dequeuing all remaining buffers from the ``CAPTURE`` queue, > > > > This is optional, isn't it ? > > I wouldn't call it optional, since it depends on what the client does > and what the decoder supports. That's why the point above just states > that the remaining steps do not apply. I meant isn't the "After dequeuing all remaining buffers from the CAPTURE queue" part optional ? As far as I understand, the client may decide not to dequeue them. > Also added a note: > > To fulfill those requirements, the client may attempt to use > :c:func:`VIDIOC_CREATE_BUFS` to add more buffers. 
However, due to > hardware limitations, the decoder may not support adding buffers at > this point and the client must be able to handle a failure using the > steps below. I wonder if there could be a way to work around those limitations on the driver side. At the beginning of step 7, the decoder is effectively stopped. If the hardware doesn't support adding new buffers on the fly, can't the driver support the VIDIOC_CREATE_BUFS + V4L2_DEC_CMD_START sequence the same way it would support the VIDIOC_STREAMOFF + VIDIOC_REQBUFS(0) + VIDIOC_REQBUFS(n) + VIDIOC_STREAMON ? > >> the > >> + client must call :c:func:`VIDIOC_STREAMOFF` on the ``CAPTURE`` > >> queue. > >> + The ``OUTPUT`` queue must remain streaming (calling STREAMOFF on it > > > > :c:func:`VIDIOC_STREAMOFF` > > > >> + would trigger a seek). [snip]
On Thu, Oct 18, 2018 at 8:22 PM Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote: > > Hi Tomasz, > > I've stripped out all the parts on which I have no specific comment or just > agree with your proposal. Please see below for a few additional remarks. > > On Thursday, 18 October 2018 13:03:33 EEST Tomasz Figa wrote: > > On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart wrote: > > > On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote: > > >> Due to complexity of the video decoding process, the V4L2 drivers of > > >> stateful decoder hardware require specific sequences of V4L2 API calls > > >> to be followed. These include capability enumeration, initialization, > > >> decoding, seek, pause, dynamic resolution change, drain and end of > > >> stream. > > >> > > >> Specifics of the above have been discussed during Media Workshops at > > >> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux > > >> Conference Europe 2014 in Düsseldorf. The de facto Codec API that > > >> originated at those events was later implemented by the drivers we > > >> already have merged in mainline, such as s5p-mfc or coda. > > >> > > >> The only thing missing was the real specification included as a part of > > >> Linux Media documentation. Fix it now and document the decoder part of > > >> the Codec API. 
> > >> > > >> Signed-off-by: Tomasz Figa <tfiga@chromium.org> > > >> --- > > >> > > >> Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++ > > >> Documentation/media/uapi/v4l/devices.rst | 1 + > > >> Documentation/media/uapi/v4l/v4l2.rst | 10 +- > > >> 3 files changed, 882 insertions(+), 1 deletion(-) > > >> create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst > > >> > > >> diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst > > >> b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644 > > >> index 000000000000..f55d34d2f860 > > >> --- /dev/null > > >> +++ b/Documentation/media/uapi/v4l/dev-decoder.rst > > >> @@ -0,0 +1,872 @@ > > [snip] > > > >> +4. Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on > > >> + ``OUTPUT``. > > >> + > > >> + * **Required fields:** > > >> + > > >> + ``count`` > > >> + requested number of buffers to allocate; greater than zero > > >> + > > >> + ``type`` > > >> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > >> + > > >> + ``memory`` > > >> + follows standard semantics > > >> + > > >> + ``sizeimage`` > > >> + follows standard semantics; the client is free to choose any > > >> + suitable size, however, it may be subject to change by the > > >> + driver > > >> + > > >> + * **Return fields:** > > >> + > > >> + ``count`` > > >> + actual number of buffers allocated > > >> + > > >> + * The driver must adjust count to minimum of required number of > > >> + ``OUTPUT`` buffers for given format and count passed. > > > > > > Isn't it the maximum, not the minimum ? > > > > It's actually neither. All we can generally say here is that the > > number will be adjusted and the client must note it. > > I expect it to be clamp(requested count, driver minimum, driver maximum). I'm > not sure it's worth capturing this in the document though, but we could say > > "The driver must clam count to the minimum and maximum number of required > ``OUTPUT`` buffers for the given format ." 
> I'd leave the details to the documentation of VIDIOC_REQBUFS, if needed. This document focuses on the decoder UAPI and with this note I want to ensure that the applications don't assume that exactly the requested number of buffers is always allocated. How about making it even simpler: The actual number of allocated buffers may differ from the ``count`` given. The client must check the updated value of ``count`` after the call returns. > > >> The client must > > >> + check this value after the ioctl returns to get the number of > > >> + buffers allocated. > > >> + > > >> + .. note:: > > >> + > > >> + To allocate more than minimum number of buffers (for pipeline > > >> + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to > > >> + get minimum number of buffers required by the driver/format, > > >> + and pass the obtained value plus the number of additional > > >> + buffers needed in count to :c:func:`VIDIOC_REQBUFS`. > > >> + > > >> +5. Start streaming on ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`. > > >> + > > >> +6. This step only applies to coded formats that contain resolution > > >> + information in the stream. Continue queuing/dequeuing bitstream > > >> + buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and > > >> + :c:func:`VIDIOC_DQBUF`. The driver must keep processing and > > >> returning > > >> + each buffer to the client until required metadata to configure the > > >> + ``CAPTURE`` queue are found. This is indicated by the driver > > >> sending > > >> + a ``V4L2_EVENT_SOURCE_CHANGE`` event with > > >> + ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no > > >> + requirement to pass enough data for this to occur in the first > > >> buffer > > >> + and the driver must be able to process any number. > > >> + > > >> + * If data in a buffer that triggers the event is required to decode > > >> + the first frame, the driver must not return it to the client, > > >> + but must retain it for further decoding. 
> > >> + > > >> + * If the client set width and height of ``OUTPUT`` format to 0, > > >> calling > > >> + :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return > > >> -EPERM, > > >> + until the driver configures ``CAPTURE`` format according to stream > > >> + metadata. > > > > > > That's a pretty harsh handling for this condition. What's the rationale > > > for returning -EPERM instead of for instance succeeding with width and > > > height set to 0 ? > > > > I don't like it, but the error condition must stay for compatibility > > reasons as that's what current drivers implement and applications > > expect. (Technically current drivers would return -EINVAL, but we > > concluded that existing applications don't care about the exact value, > > so we can change it to make more sense.) > > Fair enough :-/ A bit of a shame though. Should we try to use an error code > that would have less chance of being confused with an actual permission > problem ? -EILSEQ could be an option for "illegal sequence" of operations, but > better options could exist. > In Request API we concluded that -EACCES is the right code to return for G_EXT_CTRLS on a request that has not finished yet. The case here is similar - the capture queue is not yet set up. What do you think? > > >> + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events > > >> and > > >> + the event is signaled, the decoding process will not continue > > >> until > > >> + it is acknowledged by either (re-)starting streaming on > > >> ``CAPTURE``, > > >> + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` > > >> + command. > > >> + > > >> + .. note:: > > >> + > > >> + No decoded frames are produced during this phase. > > >> + > > [snip] > > > >> +8. Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for > > >> the + destination buffers parsed/decoded from the bitstream. 
> > >> + > > >> + * **Required fields:** > > >> + > > >> + ``type`` > > >> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > >> + > > >> + * **Return fields:** > > >> + > > >> + ``width``, ``height`` > > >> + frame buffer resolution for the decoded frames > > >> + > > >> + ``pixelformat`` > > >> + pixel format for decoded frames > > >> + > > >> + ``num_planes`` (for _MPLANE ``type`` only) > > >> + number of planes for pixelformat > > >> + > > >> + ``sizeimage``, ``bytesperline`` > > >> + as per standard semantics; matching frame buffer format > > >> + > > >> + .. note:: > > >> + > > >> + The value of ``pixelformat`` may be any pixel format supported > > >> and > > >> + must be supported for current stream, based on the information > > >> + parsed from the stream and hardware capabilities. It is > > >> suggested > > >> + that driver chooses the preferred/optimal format for given > > > > > > In compliance with RFC 2119, how about using "Drivers should choose" > > > instead of "It is suggested that driver chooses" ? > > > > The whole paragraph became: > > > > The value of ``pixelformat`` may be any pixel format supported by the > > decoder for the current stream. It is expected that the decoder chooses a > > preferred/optimal format for the default configuration. For example, a YUV > > format may be preferred over an RGB format, if additional conversion step > > would be required. > > How about using "should" instead of "it is expected that" ? > Done. > [snip] > > > >> +10. *[optional]* Choose a different ``CAPTURE`` format than suggested > > >> via > > >> + :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for > > >> the > > >> + client to choose a different format than selected/suggested by the > > > > > > And here, "A client may choose" ? > > > > > >> + driver in :c:func:`VIDIOC_G_FMT`. 
> > >> + > > >> + * **Required fields:** > > >> + > > >> + ``type`` > > >> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` > > >> + > > >> + ``pixelformat`` > > >> + a raw pixel format > > >> + > > >> + .. note:: > > >> + > > >> + Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently > > >> available > > >> + formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful > > >> to > > >> + find out a set of allowed formats for given configuration, but > > >> not > > >> + required, if the client can accept the defaults. > > > > > > s/required/required,/ > > > > That would become "[...]but not required,, if the client[...]". Is > > that your suggestion? ;) > > Oops, the other way around of course :-) Done. > > [snip] > > > >> +3. Start queuing buffers to ``OUTPUT`` queue containing stream data > > >> after > > >> + the seek until a suitable resume point is found. > > >> + > > >> + .. note:: > > >> + > > >> + There is no requirement to begin queuing stream starting exactly > > >> from > > > > > > s/stream/buffers/ ? > > > > Perhaps "stream data"? The buffers don't have a resume point, the stream > > does. > > Maybe "coded data" ? > Done. > > >> + a resume point (e.g. SPS or a keyframe). The driver must handle > > >> any > > >> + data queued and must keep processing the queued buffers until it > > >> + finds a suitable resume point. While looking for a resume point, > > >> the > > >> + driver processes ``OUTPUT`` buffers and returns them to the > > >> client > > >> + without producing any decoded frames. > > >> + > > >> + For hardware known to be mishandling seeks to a non-resume point, > > >> + e.g. by returning corrupted decoded frames, the driver must be > > >> able > > >> + to handle such seeks without a crash or any fatal decode error. > > > > > > This should be true for any hardware, there should never be any crash or > > > fatal decode error. I'd write it as > > > > > > Some hardware is known to mishandle seeks to a non-resume point. 
Such an > > > operation may result in an unspecified number of corrupted decoded frames > > > being made available on ``CAPTURE``. Drivers must ensure that no fatal > > > decoding errors or crashes occur, and implement any necessary handling and > > > work-arounds for hardware issues related to seek operations. > > > > Done. > > [snip] > > > >> +2. After all buffers containing decoded frames from before the > > >> resolution > > >> + change point are ready to be dequeued on the ``CAPTURE`` queue, the > > >> + driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change > > >> + type ``V4L2_EVENT_SRC_CH_RESOLUTION``. > > >> + > > >> + * The last buffer from before the change must be marked with > > >> + :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in > > >> the + drain sequence. The last buffer might be empty (with > > >> + :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by > > >> the > > >> + client, since it does not contain any decoded frame. > > >> + > > >> + * Any client query issued after the driver queues the event must > > >> return > > >> + values applying to the stream after the resolution change, > > >> including > > >> + queue formats, selection rectangles and controls. > > >> + > > >> + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events > > >> and > > >> + the event is signaled, the decoding process will not continue > > >> until > > >> + it is acknowledged by either (re-)starting streaming on > > >> ``CAPTURE``, > > >> + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` > > >> + command. > > > > > > This usage of V4L2_DEC_CMD_START isn't aligned with the documentation of > > > the command. I'm not opposed to this, but I think the use cases of > > > decoder commands for codecs should be explained in the VIDIOC_DECODER_CMD > > > documentation. What bothers me in particular is usage of > > > V4L2_DEC_CMD_START to restart the decoder, while no V4L2_DEC_CMD_STOP has > > > been issued. 
Should we add a section that details the decoder state > > > machine with the implicit and explicit ways in which it is started and > > > stopped ? > > > > Yes, we should probably extend the VIDIOC_DECODER_CMD documentation. > > > > As for diagrams, they would indeed be nice to have, but maybe we could > > add them in a follow up patch? > > That's another way to say it won't happen, right ? ;-) I'd prefer to focus on the basic description first, since for the last 6 years we haven't had any documentation at all. I hope we can later have more contributors follow up with patches to make it easier to read, e.g. add nice diagrams. Anyway, I'll try to add a simple state machine diagram in dot, but would appreciate if we could postpone any not critical improvements. > I'm OK with that, but I > think we should still clarify that the source change generates an implicit > V4L2_DEC_CMD_STOP. > Good idea, thanks. > > > I would also reference step 7 here. > > > > > >> + .. note:: > > >> + > > >> + Any attempts to dequeue more buffers beyond the buffer marked > > >> + with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from > > >> + :c:func:`VIDIOC_DQBUF`. > > >> + > > >> +3. The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the > > >> new > > >> + format information. This is identical to calling > > >> :c:func:`VIDIOC_G_FMT` + after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in > > >> the initialization sequence + and should be handled similarly. > > > > > > As the source resolution change event is mentioned in multiple places, how > > > about extracting the related ioctls sequence to a specific section, and > > > referencing it where needed (at least from the initialization sequence and > > > here) ? > > > > I made the text here refer to the Initialization sequence. > > Wouldn't it be clearer if those steps were extracted to a standalone sequence > referenced from both locations ? 
> It might be possible to extract the operations on the CAPTURE queue into a "Capture setup" sequence. Let me check that. > > >> + .. note:: > > >> + > > >> + It is allowed for the driver not to support the same pixel > > >> format as > > > > > > "Drivers may not support ..." > > > > > >> + previously used (before the resolution change) for the new > > >> + resolution. The driver must select a default supported pixel > > >> format, > > >> + return it, if queried using :c:func:`VIDIOC_G_FMT`, and the > > >> client > > >> + must take note of it. > > >> + > > >> +4. The client acquires visible resolution as in initialization > > >> sequence. > > >> + > > >> +5. *[optional]* The client is allowed to enumerate available formats > > >> and > > > > > > s/is allowed to/may/ > > > > > >> + select a different one than currently chosen (returned via > > >> + :c:func:`VIDIOC_G_FMT)`. This is identical to a corresponding step > > >> in > > >> + the initialization sequence. > > >> + > > >> +6. *[optional]* The client acquires minimum number of buffers as in > > >> + initialization sequence. > > >> + > > >> +7. If all the following conditions are met, the client may resume the > > >> + decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with > > >> + ``V4L2_DEC_CMD_START`` command, as in case of resuming after the > > >> drain > > >> + sequence: > > >> + > > >> + * ``sizeimage`` of new format is less than or equal to the size of > > >> + currently allocated buffers, > > >> + > > >> + * the number of buffers currently allocated is greater than or > > >> equal to > > >> + the minimum number of buffers acquired in step 6. > > >> + > > >> + In such case, the remaining steps do not apply. > > >> + > > >> + However, if the client intends to change the buffer set, to lower > > >> + memory usage or for any other reasons, it may be achieved by > > >> following > > >> + the steps below. > > >> + > > >> +8. 
After dequeuing all remaining buffers from the ``CAPTURE`` queue, > > > > > > This is optional, isn't it ? > > > > I wouldn't call it optional, since it depends on what the client does > > and what the decoder supports. That's why the point above just states > > that the remaining steps do not apply. > > I meant isn't the "After dequeuing all remaining buffers from the CAPTURE > queue" part optional ? As far as I understand, the client may decide not to > dequeue them. > A STREAMOFF would discard the already decoded but not yet dequeued frames. While it's technically fine, it doesn't make sense, because it would lead to a frame drop. Therefore, I'd rather keep it required, for simplicity. > > Also added a note: > > > > To fulfill those requirements, the client may attempt to use > > :c:func:`VIDIOC_CREATE_BUFS` to add more buffers. However, due to > > hardware limitations, the decoder may not support adding buffers at > > this point and the client must be able to handle a failure using the > > steps below. > > I wonder if there could be a way to work around those limitations on the > driver side. At the beginning of step 7, the decoder is effectively stopped. > If the hardware doesn't support adding new buffers on the fly, can't the > driver support the VIDIOC_CREATE_BUFS + V4L2_DEC_CMD_START sequence the same > way it would support the VIDIOC_STREAMOFF + VIDIOC_REQBUFS(0) + > VIDIOC_REQBUFS(n) + VIDIOC_STREAMON ? > I guess that would work. I would only allow it for the case where existing buffers are already big enough and just more buffers are needed. Otherwise it would lead to some weird cases, such as some old buffers already in the CAPTURE queue, blocking the decode of further frames. (While it could be handled by the driver returning them with an error state, it would only complicate the interface.) Best regards, Tomasz
On Thu, Oct 18, 2018 at 7:03 PM Tomasz Figa <tfiga@chromium.org> wrote: > > Hi Laurent, > > On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart > <laurent.pinchart@ideasonboard.com> wrote: > > > > Hi Tomasz, > > > > Thank you for the patch. > > Thanks for your comments! Please see my replies inline. > > > > > On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote: [snip] > > > +4. At this point, decoding is paused and the driver will accept, but not > > > + process any newly queued ``OUTPUT`` buffers until the client issues > > > + ``V4L2_DEC_CMD_START`` or restarts streaming on any queue. > > > + > > > +* Once the drain sequence is initiated, the client needs to drive it to > > > + completion, as described by the above steps, unless it aborts the process > > > + by issuing :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue. The client > > > + is not allowed to issue ``V4L2_DEC_CMD_START`` or ``V4L2_DEC_CMD_STOP`` > > > + again while the drain sequence is in progress and they will fail with > > > + -EBUSY error code if attempted. > > > > While this seems OK to me, I think drivers will need help to implement all the > > corner cases correctly without race conditions. > > We went through the possible list of corner cases and concluded that > there is no use in handling them, especially considering how much they > would complicate both the userspace and the drivers. Not even > mentioning some hardware, like s5p-mfc, which actually has a dedicated > flush operation, that needs to complete before the decoder can switch > back to normal mode. Actually I misread your comment. Agreed that the decoder commands are a bit tricky to implement properly. That's one of the reasons I decided to make the return -EBUSY while an existing drain is in progress. Do you have any particular simplification in mind that could avoid some corner cases? Best regards, Tomasz
On Thu, Oct 18, 2018 at 7:03 PM Tomasz Figa <tfiga@chromium.org> wrote: > > Hi Laurent, > > On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart > <laurent.pinchart@ideasonboard.com> wrote: > > > > Hi Tomasz, > > > > Thank you for the patch. > > Thanks for your comments! Please see my replies inline. > > > > > On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote: [snip] > > > The driver must also set > > > + ``V4L2_BUF_FLAG_LAST`` in :c:type:`v4l2_buffer` ``flags`` field on the > > > + buffer on the ``CAPTURE`` queue containing the last frame (if any) > > > + produced as a result of processing the ``OUTPUT`` buffers queued > > > + before ``V4L2_DEC_CMD_STOP``. If no more frames are left to be > > > + returned at the point of handling ``V4L2_DEC_CMD_STOP``, the driver > > > + must return an empty buffer (with :c:type:`v4l2_buffer` > > > + ``bytesused`` = 0) as the last buffer with ``V4L2_BUF_FLAG_LAST`` set > > > + instead. Any attempts to dequeue more buffers beyond the buffer marked > > > + with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from > > > + :c:func:`VIDIOC_DQBUF`. > > > + > > > + * If the ``CAPTURE`` queue is NOT streaming, no action is necessary for > > > + ``CAPTURE`` queue and the driver must send a ``V4L2_EVENT_EOS`` > > > + immediately after all ``OUTPUT`` buffers in question have been > > > + processed. > > > > What is the use case for this ? Can't we just return an error if decoder isn't > > streaming ? > > > > Actually this is wrong. We want the queued OUTPUT buffers to be > processed and decoded, so if the CAPTURE queue is not yet set up > (initialization sequence not completed yet), handling the > initialization sequence first will be needed as a part of the drain > sequence. I've updated the document with that. I might want to take this back. The client could just drive the initialization to completion on its own and start the drain sequence after that. Let me think if it makes anything easier. 
For reference, I don't see any compatibility constraint here, since the existing user space already works like that. Best regards, Tomasz
Hi Tomasz, On Saturday, 20 October 2018 11:52:57 EEST Tomasz Figa wrote: > On Thu, Oct 18, 2018 at 8:22 PM Laurent Pinchart wrote: > > On Thursday, 18 October 2018 13:03:33 EEST Tomasz Figa wrote: > >> On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart wrote: > >>> On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote: > >>>> Due to complexity of the video decoding process, the V4L2 drivers of > >>>> stateful decoder hardware require specific sequences of V4L2 API > >>>> calls to be followed. These include capability enumeration, > >>>> initialization, decoding, seek, pause, dynamic resolution change, drain > >>>> and end of stream. > >>>> > >>>> Specifics of the above have been discussed during Media Workshops at > >>>> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux > >>>> Conference Europe 2014 in Düsseldorf. The de facto Codec API that > >>>> originated at those events was later implemented by the drivers we > >>>> already have merged in mainline, such as s5p-mfc or coda. > >>>> > >>>> The only thing missing was the real specification included as a part > >>>> of Linux Media documentation. Fix it now and document the decoder part > >>>> of the Codec API. > >>>> > >>>> Signed-off-by: Tomasz Figa <tfiga@chromium.org> > >>>> --- > >>>> > >>>> Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++ > >>>> Documentation/media/uapi/v4l/devices.rst | 1 + > >>>> Documentation/media/uapi/v4l/v4l2.rst | 10 +- > >>>> 3 files changed, 882 insertions(+), 1 deletion(-) > >>>> create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst > >>>> > >>>> diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst > >>>> b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644 > >>>> index 000000000000..f55d34d2f860 > >>>> --- /dev/null > >>>> +++ b/Documentation/media/uapi/v4l/dev-decoder.rst > >>>> @@ -0,0 +1,872 @@ > > > > [snip] > > > >>>> +4. 
Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` > >>>> on > >>>> + ``OUTPUT``. > >>>> + > >>>> + * **Required fields:** > >>>> + > >>>> + ``count`` > >>>> + requested number of buffers to allocate; greater than zero > >>>> + > >>>> + ``type`` > >>>> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > >>>> + > >>>> + ``memory`` > >>>> + follows standard semantics > >>>> + > >>>> + ``sizeimage`` > >>>> + follows standard semantics; the client is free to choose > >>>> any > >>>> + suitable size, however, it may be subject to change by the > >>>> + driver > >>>> + > >>>> + * **Return fields:** > >>>> + > >>>> + ``count`` > >>>> + actual number of buffers allocated > >>>> + > >>>> + * The driver must adjust count to minimum of required number of > >>>> + ``OUTPUT`` buffers for given format and count passed. > >>> > >>> Isn't it the maximum, not the minimum ? > >> > >> It's actually neither. All we can generally say here is that the > >> number will be adjusted and the client must note it. > > > > I expect it to be clamp(requested count, driver minimum, driver maximum). > > I'm not sure it's worth capturing this in the document though, but we > > could say > > > > "The driver must clamp count to the minimum and maximum number of required > > ``OUTPUT`` buffers for the given format." > > I'd leave the details to the documentation of VIDIOC_REQBUFS, if > needed. This document focuses on the decoder UAPI and with this note I > want to ensure that the applications don't assume that exactly the > requested number of buffers is always allocated. > > How about making it even simpler: > > The actual number of allocated buffers may differ from the ``count`` > given. The client must check the updated value of ``count`` after the > call returns. That works for me. You may want to say "... given, as specified in the VIDIOC_REQBUFS documentation.".
> >>>> The client must > >>>> + check this value after the ioctl returns to get the number of > >>>> + buffers allocated. > >>>> + > >>>> + .. note:: > >>>> + > >>>> + To allocate more than minimum number of buffers (for pipeline > >>>> + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to > >>>> + get minimum number of buffers required by the driver/format, > >>>> + and pass the obtained value plus the number of additional > >>>> + buffers needed in count to :c:func:`VIDIOC_REQBUFS`. > >>>> + > >>>> +5. Start streaming on ``OUTPUT`` queue via > >>>> :c:func:`VIDIOC_STREAMON`. > >>>> + > >>>> +6. This step only applies to coded formats that contain resolution > >>>> + information in the stream. Continue queuing/dequeuing bitstream > >>>> + buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` > >>>> and > >>>> + :c:func:`VIDIOC_DQBUF`. The driver must keep processing and > >>>> returning > >>>> + each buffer to the client until required metadata to configure > >>>> the > >>>> + ``CAPTURE`` queue are found. This is indicated by the driver > >>>> sending > >>>> + a ``V4L2_EVENT_SOURCE_CHANGE`` event with > >>>> + ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no > >>>> + requirement to pass enough data for this to occur in the first > >>>> buffer > >>>> + and the driver must be able to process any number. > >>>> + > >>>> + * If data in a buffer that triggers the event is required to > >>>> decode > >>>> + the first frame, the driver must not return it to the client, > >>>> + but must retain it for further decoding. > >>>> + > >>>> + * If the client set width and height of ``OUTPUT`` format to 0, > >>>> calling > >>>> + :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return > >>>> -EPERM, > >>>> + until the driver configures ``CAPTURE`` format according to > >>>> stream > >>>> + metadata. > >>> > >>> That's a pretty harsh handling for this condition. 
What's the > >>> rationale for returning -EPERM instead of for instance succeeding with > >>> width and height set to 0 ? > >> > >> I don't like it, but the error condition must stay for compatibility > >> reasons as that's what current drivers implement and applications > >> expect. (Technically current drivers would return -EINVAL, but we > >> concluded that existing applications don't care about the exact value, > >> so we can change it to make more sense.) > > > > Fair enough :-/ A bit of a shame though. Should we try to use an error > > code that would have less chance of being confused with an actual > > permission problem ? -EILSEQ could be an option for "illegal sequence" of > > operations, but better options could exist. > > In Request API we concluded that -EACCES is the right code to return > for G_EXT_CTRLS on a request that has not finished yet. The case here > is similar - the capture queue is not yet set up. What do you think? Good question. -EPERM is documented as "Operation not permitted", while - EACCES is documented as "Permission denied". The former appears to be understood as "This isn't a good idea, I can't let you do that", and the latter as "You don't have sufficient privileges, if you retry with the correct privileges this will succeed". Neither are a perfect match, but -EACCES might be better if you replace getting privileges by performing the required setup. > >>>> + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` > >>>> events and > >>>> + the event is signaled, the decoding process will not continue > >>>> until > >>>> + it is acknowledged by either (re-)starting streaming on > >>>> ``CAPTURE``, > >>>> + or via :c:func:`VIDIOC_DECODER_CMD` with > >>>> ``V4L2_DEC_CMD_START`` > >>>> + command. > >>>> + > >>>> + .. note:: > >>>> + > >>>> + No decoded frames are produced during this phase. 
> >>>> + [snip] > >> Also added a note: > >> To fulfill those requirements, the client may attempt to use > >> :c:func:`VIDIOC_CREATE_BUFS` to add more buffers. However, due to > >> hardware limitations, the decoder may not support adding buffers > >> at this point and the client must be able to handle a failure > >> using the steps below. > > > > I wonder if there could be a way to work around those limitations on the > > driver side. At the beginning of step 7, the decoder is effectively > > stopped. If the hardware doesn't support adding new buffers on the fly, > > can't the driver support the VIDIOC_CREATE_BUFS + V4L2_DEC_CMD_START > > sequence the same way it would support the VIDIOC_STREAMOFF + > > VIDIOC_REQBUFS(0) + > > VIDIOC_REQBUFS(n) + VIDIOC_STREAMON ? > > I guess that would work. I would only allow it for the case where > existing buffers are already big enough and just more buffers are > needed. Otherwise it would lead to some weird cases, such as some old > buffers already in the CAPTURE queue, blocking the decode of further > frames. (While it could be handled by the driver returning them with > an error state, it would only complicate the interface.) Good point. I wonder if this could be handled in the framework. If it can't, or with non trivial support code on the driver side, then I would agree with you. Otherwise, handling the workaround in the framework would ensure consistent behaviour across drivers with minimal cost, and simplify the userspace API, so I think it would be a good thing.
Hi Tomasz, On Saturday, 20 October 2018 13:24:20 EEST Tomasz Figa wrote: > On Thu, Oct 18, 2018 at 7:03 PM Tomasz Figa wrote: > > On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart wrote: > >> Hi Tomasz, > >> > >> Thank you for the patch. > > > > Thanks for your comments! Please see my replies inline. > > > >> On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote: > > [snip] > > >>> +4. At this point, decoding is paused and the driver will accept, but > >>> not > >>> + process any newly queued ``OUTPUT`` buffers until the client > >>> issues > >>> + ``V4L2_DEC_CMD_START`` or restarts streaming on any queue. > >>> + > >>> +* Once the drain sequence is initiated, the client needs to drive it > >>> to > >>> + completion, as described by the above steps, unless it aborts the > >>> process > >>> + by issuing :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue. The client > >>> + is not allowed to issue ``V4L2_DEC_CMD_START`` or > >>> ``V4L2_DEC_CMD_STOP`` > >>> + again while the drain sequence is in progress and they will fail with > >>> + -EBUSY error code if attempted. > >> > >> While this seems OK to me, I think drivers will need help to implement > >> all the corner cases correctly without race conditions. > > > > We went through the possible list of corner cases and concluded that > > there is no use in handling them, especially considering how much they > > would complicate both the userspace and the drivers. Not even > > mentioning some hardware, like s5p-mfc, which actually has a dedicated > > flush operation, that needs to complete before the decoder can switch > > back to normal mode. > > Actually I misread your comment. > > Agreed that the decoder commands are a bit tricky to implement > properly. That's one of the reasons I decided to make the return > -EBUSY while an existing drain is in progress. > > Do you have any particular simplification in mind that could avoid > some corner cases? Not really on the spec side. 
I think we'll have to implement helper functions for drivers to use if we want to ensure a consistent and bug-free behaviour.
On Sun, Oct 21, 2018 at 6:23 PM Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote: > > Hi Tomasz, > > On Saturday, 20 October 2018 11:52:57 EEST Tomasz Figa wrote: > > On Thu, Oct 18, 2018 at 8:22 PM Laurent Pinchart wrote: > > > On Thursday, 18 October 2018 13:03:33 EEST Tomasz Figa wrote: > > >> On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart wrote: > > >>> On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote: > > >>>> Due to complexity of the video decoding process, the V4L2 drivers of > > >>>> stateful decoder hardware require specific sequences of V4L2 API > > >>>> calls to be followed. These include capability enumeration, > > >>>> initialization, decoding, seek, pause, dynamic resolution change, drain > > >>>> and end of stream. > > >>>> > > >>>> Specifics of the above have been discussed during Media Workshops at > > >>>> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux > > >>>> Conference Europe 2014 in Düsseldorf. The de facto Codec API that > > >>>> originated at those events was later implemented by the drivers we > > >>>> already have merged in mainline, such as s5p-mfc or coda. > > >>>> > > >>>> The only thing missing was the real specification included as a part > > >>>> of Linux Media documentation. Fix it now and document the decoder part > > >>>> of the Codec API. 
> > >>>> > > >>>> Signed-off-by: Tomasz Figa <tfiga@chromium.org> > > >>>> --- > > >>>> > > >>>> Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++ > > >>>> Documentation/media/uapi/v4l/devices.rst | 1 + > > >>>> Documentation/media/uapi/v4l/v4l2.rst | 10 +- > > >>>> 3 files changed, 882 insertions(+), 1 deletion(-) > > >>>> create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst > > >>>> > > >>>> diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst > > >>>> b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644 > > >>>> index 000000000000..f55d34d2f860 > > >>>> --- /dev/null > > >>>> +++ b/Documentation/media/uapi/v4l/dev-decoder.rst > > >>>> @@ -0,0 +1,872 @@ > > > > > > [snip] > > > > > >>>> +4. Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` > > >>>> on > > >>>> + ``OUTPUT``. > > >>>> + > > >>>> + * **Required fields:** > > >>>> + > > >>>> + ``count`` > > >>>> + requested number of buffers to allocate; greater than zero > > >>>> + > > >>>> + ``type`` > > >>>> + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` > > >>>> + > > >>>> + ``memory`` > > >>>> + follows standard semantics > > >>>> + > > >>>> + ``sizeimage`` > > >>>> + follows standard semantics; the client is free to choose > > >>>> any > > >>>> + suitable size, however, it may be subject to change by the > > >>>> + driver > > >>>> + > > >>>> + * **Return fields:** > > >>>> + > > >>>> + ``count`` > > >>>> + actual number of buffers allocated > > >>>> + > > >>>> + * The driver must adjust count to minimum of required number of > > >>>> + ``OUTPUT`` buffers for given format and count passed. > > >>> > > >>> Isn't it the maximum, not the minimum ? > > >> > > >> It's actually neither. All we can generally say here is that the > > >> number will be adjusted and the client must note it. > > > > > > I expect it to be clamp(requested count, driver minimum, driver maximum). 
> > > I'm not sure it's worth capturing this in the document though, but we > > > could say > > > > > > "The driver must clamp count to the minimum and maximum number of required > > > ``OUTPUT`` buffers for the given format." > > > > I'd leave the details to the documentation of VIDIOC_REQBUFS, if > > needed. This document focuses on the decoder UAPI and with this note I > > want to ensure that the applications don't assume that exactly the > > requested number of buffers is always allocated. > > > > How about making it even simpler: > > > > The actual number of allocated buffers may differ from the ``count`` > > given. The client must check the updated value of ``count`` after the > > call returns. > > That works for me. You may want to say "... given, as specified in the > VIDIOC_REQBUFS documentation.". > The "Conventions[...]" section mentions that 1. The general V4L2 API rules apply if not specified in this document otherwise. so I think I'll skip this additional explanation. > > >>>> The client must > > >>>> + check this value after the ioctl returns to get the number of > > >>>> + buffers allocated. > > >>>> + > > >>>> + .. note:: > > >>>> + > > >>>> + To allocate more than minimum number of buffers (for pipeline > > >>>> + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to > > >>>> + get minimum number of buffers required by the driver/format, > > >>>> + and pass the obtained value plus the number of additional > > >>>> + buffers needed in count to :c:func:`VIDIOC_REQBUFS`. > > >>>> + > > >>>> +5. Start streaming on ``OUTPUT`` queue via > > >>>> :c:func:`VIDIOC_STREAMON`. > > >>>> + > > >>>> +6. This step only applies to coded formats that contain resolution > > >>>> + information in the stream. Continue queuing/dequeuing bitstream > > >>>> + buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` > > >>>> and > > >>>> + :c:func:`VIDIOC_DQBUF`.
The driver must keep processing and > > >>>> returning > > >>>> + each buffer to the client until required metadata to configure > > >>>> the > > >>>> + ``CAPTURE`` queue are found. This is indicated by the driver > > >>>> sending > > >>>> + a ``V4L2_EVENT_SOURCE_CHANGE`` event with > > >>>> + ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no > > >>>> + requirement to pass enough data for this to occur in the first > > >>>> buffer > > >>>> + and the driver must be able to process any number. > > >>>> + > > >>>> + * If data in a buffer that triggers the event is required to > > >>>> decode > > >>>> + the first frame, the driver must not return it to the client, > > >>>> + but must retain it for further decoding. > > >>>> + > > >>>> + * If the client set width and height of ``OUTPUT`` format to 0, > > >>>> calling > > >>>> + :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return > > >>>> -EPERM, > > >>>> + until the driver configures ``CAPTURE`` format according to > > >>>> stream > > >>>> + metadata. > > >>> > > >>> That's a pretty harsh handling for this condition. What's the > > >>> rationale for returning -EPERM instead of for instance succeeding with > > >>> width and height set to 0 ? > > >> > > >> I don't like it, but the error condition must stay for compatibility > > >> reasons as that's what current drivers implement and applications > > >> expect. (Technically current drivers would return -EINVAL, but we > > >> concluded that existing applications don't care about the exact value, > > >> so we can change it to make more sense.) > > > > > > Fair enough :-/ A bit of a shame though. Should we try to use an error > > > code that would have less chance of being confused with an actual > > > permission problem ? -EILSEQ could be an option for "illegal sequence" of > > > operations, but better options could exist. 
> > > > In Request API we concluded that -EACCES is the right code to return > > for G_EXT_CTRLS on a request that has not finished yet. The case here > > is similar - the capture queue is not yet set up. What do you think? > > Good question. -EPERM is documented as "Operation not permitted", while - > EACCES is documented as "Permission denied". The former appears to be > understood as "This isn't a good idea, I can't let you do that", and the > latter as "You don't have sufficient privileges, if you retry with the correct > privileges this will succeed". Neither are a perfect match, but -EACCES might > be better if you replace getting privileges by performing the required setup. > AFAIR that was also the rationale behind it for the Request API. > > >>>> + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` > > >>>> events and > > >>>> + the event is signaled, the decoding process will not continue > > >>>> until > > >>>> + it is acknowledged by either (re-)starting streaming on > > >>>> ``CAPTURE``, > > >>>> + or via :c:func:`VIDIOC_DECODER_CMD` with > > >>>> ``V4L2_DEC_CMD_START`` > > >>>> + command. > > >>>> + > > >>>> + .. note:: > > >>>> + > > >>>> + No decoded frames are produced during this phase. > > >>>> + > > [snip] > > > >> Also added a note: > > >> To fulfill those requirements, the client may attempt to use > > >> :c:func:`VIDIOC_CREATE_BUFS` to add more buffers. However, due to > > >> hardware limitations, the decoder may not support adding buffers > > >> at this point and the client must be able to handle a failure > > >> using the steps below. > > > > > > I wonder if there could be a way to work around those limitations on the > > > driver side. At the beginning of step 7, the decoder is effectively > > > stopped. 
If the hardware doesn't support adding new buffers on the fly, > > > can't the driver support the VIDIOC_CREATE_BUFS + V4L2_DEC_CMD_START > > > sequence the same way it would support the VIDIOC_STREAMOFF + > > > VIDIOC_REQBUFS(0) + > > > VIDIOC_REQBUFS(n) + VIDIOC_STREAMON ? > > > > I guess that would work. I would only allow it for the case where > > existing buffers are already big enough and just more buffers are > > needed. Otherwise it would lead to some weird cases, such as some old > > buffers already in the CAPTURE queue, blocking the decode of further > > frames. (While it could be handled by the driver returning them with > > an error state, it would only complicate the interface.) > > Good point. I wonder if this could be handled in the framework. If it can't, > or with non trivial support code on the driver side, then I would agree with > you. Otherwise, handling the workaround in the framework would ensure > consistent behaviour across drivers with minimal cost, and simplify the > userspace API, so I think it would be a good thing. I think it should be possible to handle in the framework, but right now we don't have a framework for codecs and it would definitely be a non-trivial piece of code. I'd stick to the restricted behavior for now, since it's easy to lift the restrictions in the future, but if we make it mandatory, the userspace could start relying on it. Best regards, Tomasz
diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644 index 000000000000..f55d34d2f860 --- /dev/null +++ b/Documentation/media/uapi/v4l/dev-decoder.rst @@ -0,0 +1,872 @@ +.. -*- coding: utf-8; mode: rst -*- + +.. _decoder: + +**************************************** +Memory-to-memory Video Decoder Interface +**************************************** + +Input data to a video decoder are buffers containing unprocessed video +stream (e.g. Annex-B H.264/HEVC stream, raw VP8/9 stream). The driver is +expected not to require any additional information from the client to +process these buffers. Output data are raw video frames returned in display +order. + +Performing software parsing, processing etc. of the stream in the driver +in order to support this interface is strongly discouraged. In case such +operations are needed, use of Stateless Video Decoder Interface (in +development) is strongly advised. + +Conventions and notation used in this document +============================================== + +1. The general V4L2 API rules apply if not specified in this document + otherwise. + +2. The meaning of words “must”, “may”, “should”, etc. is as per RFC + 2119. + +3. All steps not marked “optional” are required. + +4. :c:func:`VIDIOC_G_EXT_CTRLS`, :c:func:`VIDIOC_S_EXT_CTRLS` may be used + interchangeably with :c:func:`VIDIOC_G_CTRL`, :c:func:`VIDIOC_S_CTRL`, + unless specified otherwise. + +5. Single-plane API (see spec) and applicable structures may be used + interchangeably with Multi-plane API, unless specified otherwise, + depending on driver capabilities and following the general V4L2 + guidelines. + +6. i = [a..b]: sequence of integers from a to b, inclusive, i.e. i = + [0..2]: i = 0, 1, 2. + +7. For ``OUTPUT`` buffer A, A’ represents a buffer on the ``CAPTURE`` queue + containing data (decoded frame/stream) that resulted from processing + buffer A. 
+ +Glossary +======== + +CAPTURE + the destination buffer queue; the queue of buffers containing decoded + frames; ``V4L2_BUF_TYPE_VIDEO_CAPTURE`` or + ``V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE``; data are captured from the + hardware into ``CAPTURE`` buffers + +client + application client communicating with the driver implementing this API + +coded format + encoded/compressed video bitstream format (e.g. H.264, VP8, etc.); see + also: raw format + +coded height + height for given coded resolution + +coded resolution + stream resolution in pixels aligned to codec and hardware requirements; + typically visible resolution rounded up to full macroblocks; + see also: visible resolution + +coded width + width for given coded resolution + +decode order + the order in which frames are decoded; may differ from display order if + the coded format includes a feature of frame reordering; ``OUTPUT`` buffers + must be queued by the client in decode order + +destination + data resulting from the decode process; ``CAPTURE`` + +display order + the order in which frames must be displayed; ``CAPTURE`` buffers must be + returned by the driver in display order + +DPB + Decoded Picture Buffer; an H.264 term for a buffer that stores a picture + that is encoded or decoded and available for reference in further + decode/encode steps. + +EOS + end of stream + +IDR + a type of keyframe in an H.264-encoded stream, which clears the list of + earlier reference frames (DPBs) + +keyframe + an encoded frame that does not reference frames decoded earlier, i.e. + can be decoded fully on its own. + +OUTPUT + the source buffer queue; the queue of buffers containing encoded + bitstream; ``V4L2_BUF_TYPE_VIDEO_OUTPUT`` or + ``V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE``; the hardware is fed with data + from ``OUTPUT`` buffers + +PPS + Picture Parameter Set; a type of metadata entity in an H.264 bitstream + +raw format + uncompressed format containing raw pixel data (e.g.
YUV, RGB formats) + +resume point + a point in the bitstream from which decoding may start/continue, without + any previous state/data present, e.g.: a keyframe (VP8/VP9) or + SPS/PPS/IDR sequence (H.264); a resume point is required to start decoding + a new stream, or to resume decoding after a seek + +source + data fed to the decoder; ``OUTPUT`` + +SPS + Sequence Parameter Set; a type of metadata entity in an H.264 bitstream + +visible height + height for given visible resolution; display height + +visible resolution + stream resolution of the visible picture, in pixels, to be used for + display purposes; must be smaller than or equal to the coded resolution; + display resolution + +visible width + width for given visible resolution; display width + +Querying capabilities +===================== + +1. To enumerate the set of coded formats supported by the driver, the + client may call :c:func:`VIDIOC_ENUM_FMT` on ``OUTPUT``. + + * The driver must always return the full set of supported formats, + irrespective of the format set on the ``CAPTURE``. + +2. To enumerate the set of supported raw formats, the client may call + :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE``. + + * The driver must return only the formats supported for the format + currently active on ``OUTPUT``. + + * In order to enumerate raw formats supported by a given coded format, + the client must first set that coded format on ``OUTPUT`` and then + enumerate the ``CAPTURE`` queue. + +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported + resolutions for a given format, passing the desired pixel format in + :c:type:`v4l2_frmsizeenum` ``pixel_format``. + + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``OUTPUT`` + must include all possible coded resolutions supported by the decoder + for given coded pixel format.
+ + * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``CAPTURE`` + must include all possible frame buffer resolutions supported by the + decoder for given raw pixel format and coded format currently set on + ``OUTPUT``. + + .. note:: + + The client may derive the supported resolution range for a + combination of coded and raw format by setting width and height of + ``OUTPUT`` format to 0 and calculating the intersection of + resolutions returned from calls to :c:func:`VIDIOC_ENUM_FRAMESIZES` + for the given coded and raw formats. + +4. Supported profiles and levels for given format, if applicable, may be + queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`. + +Initialization +============== + +1. *[optional]* Enumerate supported ``OUTPUT`` formats and resolutions. See + capability enumeration. + +2. Set the coded format on ``OUTPUT`` via :c:func:`VIDIOC_S_FMT` + + * **Required fields:** + + ``type`` + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` + + ``pixelformat`` + a coded pixel format + + ``width``, ``height`` + required only if cannot be parsed from the stream for the given + coded format; optional otherwise - set to zero to ignore + + other fields + follow standard semantics + + * For coded formats including stream resolution information, if width + and height are set to non-zero values, the driver will propagate the + resolution to ``CAPTURE`` and signal a source change event + instantly. However, after the decoder is done parsing the + information embedded in the stream, it will update ``CAPTURE`` + format with new values and signal a source change event again, if + the values do not match. + + .. note:: + + Changing ``OUTPUT`` format may change currently set ``CAPTURE`` + format. The driver will derive a new ``CAPTURE`` format from + ``OUTPUT`` format being set, including resolution, colorimetry + parameters, etc. If the client needs a specific ``CAPTURE`` format, + it must adjust it afterwards. + +3. 
*[optional]* Get minimum number of buffers required for ``OUTPUT`` + queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to + use more buffers than minimum required by hardware/format. + + * **Required fields:** + + ``id`` + set to ``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT`` + + * **Return fields:** + + ``value`` + required number of ``OUTPUT`` buffers for the currently set + format + +4. Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on + ``OUTPUT``. + + * **Required fields:** + + ``count`` + requested number of buffers to allocate; greater than zero + + ``type`` + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT`` + + ``memory`` + follows standard semantics + + ``sizeimage`` + follows standard semantics; the client is free to choose any + suitable size, however, it may be subject to change by the + driver + + * **Return fields:** + + ``count`` + actual number of buffers allocated + + * The driver must adjust count to minimum of required number of + ``OUTPUT`` buffers for given format and count passed. The client must + check this value after the ioctl returns to get the number of + buffers allocated. + + .. note:: + + To allocate more than minimum number of buffers (for pipeline + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to + get minimum number of buffers required by the driver/format, + and pass the obtained value plus the number of additional + buffers needed in count to :c:func:`VIDIOC_REQBUFS`. + +5. Start streaming on ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`. + +6. This step only applies to coded formats that contain resolution + information in the stream. Continue queuing/dequeuing bitstream + buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and + :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning + each buffer to the client until required metadata to configure the + ``CAPTURE`` queue are found. 
This is indicated by the driver sending + a ``V4L2_EVENT_SOURCE_CHANGE`` event with + ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no + requirement to pass enough data for this to occur in the first buffer + and the driver must be able to process any number. + + * If data in a buffer that triggers the event is required to decode + the first frame, the driver must not return it to the client, + but must retain it for further decoding. + + * If the client set width and height of ``OUTPUT`` format to 0, calling + :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM, + until the driver configures ``CAPTURE`` format according to stream + metadata. + + * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and + the event is signaled, the decoding process will not continue until + it is acknowledged by either (re-)starting streaming on ``CAPTURE``, + or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START`` + command. + + .. note:: + + No decoded frames are produced during this phase. + +7. This step only applies to coded formats that contain resolution + information in the stream. + Receive and handle ``V4L2_EVENT_SOURCE_CHANGE`` from the driver + via :c:func:`VIDIOC_DQEVENT`. The driver must send this event once + enough data is obtained from the stream to allocate ``CAPTURE`` + buffers and to begin producing decoded frames. + + * **Required fields:** + + ``type`` + set to ``V4L2_EVENT_SOURCE_CHANGE`` + + * **Return fields:** + + ``u.src_change.changes`` + set to ``V4L2_EVENT_SRC_CH_RESOLUTION`` + + * Any client query issued after the driver queues the event must return + values applying to the just parsed stream, including queue formats, + selection rectangles and controls. + +8. Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for the + destination buffers parsed/decoded from the bitstream. 
+
+   * **Required fields:**
+
+     ``type``
+        a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
+
+   * **Return fields:**
+
+     ``width``, ``height``
+        frame buffer resolution for the decoded frames
+
+     ``pixelformat``
+        pixel format for decoded frames
+
+     ``num_planes`` (for _MPLANE ``type`` only)
+        number of planes for pixelformat
+
+     ``sizeimage``, ``bytesperline``
+        as per standard semantics; matching frame buffer format
+
+   .. note::
+
+      The value of ``pixelformat`` may be any pixel format supported by
+      the driver for the current stream, based on the information parsed
+      from the stream and hardware capabilities. It is suggested that the
+      driver choose the preferred/optimal format for the given
+      configuration. For example, a YUV format may be preferred over an
+      RGB format, if an additional conversion step would otherwise be
+      required.
+
+9. *[optional]* Enumerate ``CAPTURE`` formats via
+   :c:func:`VIDIOC_ENUM_FMT` on the ``CAPTURE`` queue. Once the stream
+   information is parsed and known, the client may use this ioctl to
+   discover which raw formats are supported for the given stream and
+   select one of them via :c:func:`VIDIOC_S_FMT`.
+
+   .. note::
+
+      The driver will return only formats supported for the current stream
+      parsed in this initialization sequence, even if more formats may be
+      supported by the driver in general.
+
+      For example, a driver/hardware may support YUV and RGB formats for
+      resolutions 1920x1088 and lower, but only YUV for higher
+      resolutions (due to hardware limitations). After parsing
+      a resolution of 1920x1088 or lower, :c:func:`VIDIOC_ENUM_FMT` may
+      return a set of YUV and RGB pixel formats, but after parsing
+      a resolution higher than 1920x1088, the driver will not return RGB
+      formats, as they are unsupported for this resolution.
+
+      However, a subsequent resolution change event triggered after
+      discovering a resolution change within the same stream may switch
+      the stream into a lower resolution and :c:func:`VIDIOC_ENUM_FMT`
+      would return RGB formats again in that case.
+ +10. *[optional]* Choose a different ``CAPTURE`` format than suggested via + :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for the + client to choose a different format than selected/suggested by the + driver in :c:func:`VIDIOC_G_FMT`. + + * **Required fields:** + + ``type`` + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` + + ``pixelformat`` + a raw pixel format + + .. note:: + + Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently available + formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful to + find out a set of allowed formats for given configuration, but not + required, if the client can accept the defaults. + +11. *[optional]* Acquire visible resolution via + :c:func:`VIDIOC_G_SELECTION`. + + * **Required fields:** + + ``type`` + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` + + ``target`` + set to ``V4L2_SEL_TGT_COMPOSE`` + + * **Return fields:** + + ``r.left``, ``r.top``, ``r.width``, ``r.height`` + visible rectangle; this must fit within frame buffer resolution + returned by :c:func:`VIDIOC_G_FMT`. 
+
+   * The driver must expose the following selection targets on
+     ``CAPTURE``:
+
+     ``V4L2_SEL_TGT_CROP_BOUNDS``
+        corresponds to coded resolution of the stream
+
+     ``V4L2_SEL_TGT_CROP_DEFAULT``
+        a rectangle covering the part of the frame buffer that contains
+        meaningful picture data (visible area); width and height will be
+        equal to visible resolution of the stream
+
+     ``V4L2_SEL_TGT_CROP``
+        rectangle within coded resolution to be output to ``CAPTURE``;
+        defaults to ``V4L2_SEL_TGT_CROP_DEFAULT``; read-only on hardware
+        without additional compose/scaling capabilities
+
+     ``V4L2_SEL_TGT_COMPOSE_BOUNDS``
+        maximum rectangle within the ``CAPTURE`` buffer into which the
+        cropped frame can be output; equal to ``V4L2_SEL_TGT_CROP``, if
+        the hardware does not support compose/scaling
+
+     ``V4L2_SEL_TGT_COMPOSE_DEFAULT``
+        equal to ``V4L2_SEL_TGT_CROP``
+
+     ``V4L2_SEL_TGT_COMPOSE``
+        rectangle inside the ``CAPTURE`` buffer into which the cropped
+        frame is output; defaults to ``V4L2_SEL_TGT_COMPOSE_DEFAULT``;
+        read-only on hardware without additional compose/scaling
+        capabilities
+
+     ``V4L2_SEL_TGT_COMPOSE_PADDED``
+        rectangle inside the ``CAPTURE`` buffer which is overwritten by
+        the hardware; equal to ``V4L2_SEL_TGT_COMPOSE``, if the hardware
+        does not write padding pixels
+
+12. *[optional]* Get the minimum number of buffers required for the
+    ``CAPTURE`` queue via :c:func:`VIDIOC_G_CTRL`. This is useful if the
+    client intends to use more buffers than the minimum required by
+    hardware/format.
+
+    * **Required fields:**
+
+      ``id``
+         set to ``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``
+
+    * **Return fields:**
+
+      ``value``
+         minimum number of buffers required to decode the stream parsed in
+         this initialization sequence.
+
+    .. note::
+
+       Note that the minimum number of buffers must be at least the number
+       required to successfully decode the current stream. This may for
+       example be the required DPB size for an H.264 stream given the
+       parsed stream configuration (resolution, level).
+
+13.
Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` + on the ``CAPTURE`` queue. + + * **Required fields:** + + ``count`` + requested number of buffers to allocate; greater than zero + + ``type`` + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` + + ``memory`` + follows standard semantics + + * **Return fields:** + + ``count`` + adjusted to allocated number of buffers + + * The driver must adjust count to minimum of required number of + destination buffers for given format and stream configuration and the + count passed. The client must check this value after the ioctl + returns to get the number of buffers allocated. + + .. note:: + + To allocate more than minimum number of buffers (for pipeline + depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to + get minimum number of buffers required, and pass the obtained value + plus the number of additional buffers needed in count to + :c:func:`VIDIOC_REQBUFS`. + +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames. + +Decoding +======== + +This state is reached after a successful initialization sequence. In this +state, client queues and dequeues buffers to both queues via +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard +semantics. + +Both queues operate independently, following standard behavior of V4L2 +buffer queues and memory-to-memory devices. In addition, the order of +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of +queuing coded frames to ``OUTPUT`` queue, due to properties of selected +coded format, e.g. frame reordering. The client must not assume any direct +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than +reported by :c:type:`v4l2_buffer` ``timestamp`` field. + +The contents of source ``OUTPUT`` buffers depend on active coded pixel +format and might be affected by codec-specific extended controls, as stated +in documentation of each format individually. 
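Since the two queues run independently and frames may be reordered, the :c:type:`v4l2_buffer` ``timestamp`` field is the only link between an ``OUTPUT`` buffer and the ``CAPTURE`` buffer(s) decoded from it. A client might match them with a helper like this (an illustrative sketch; the helper and its bookkeeping are hypothetical, not part of the API):

```c
#include <sys/time.h>

/* Find which queued OUTPUT (bitstream) buffer a dequeued CAPTURE buffer
 * originated from, by comparing the timestamp the driver copied from
 * OUTPUT to CAPTURE. Returns the index into 'queued', or -1 if no
 * queued OUTPUT buffer carried that timestamp. */
static int match_output_buffer(const struct timeval *ts,
			       const struct timeval *queued, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (queued[i].tv_sec == ts->tv_sec &&
		    queued[i].tv_usec == ts->tv_usec)
			return i;
	return -1;
}
```

The client would record the timestamp of each bitstream buffer as it is queued to ``OUTPUT``, then look up the timestamp of every buffer dequeued from ``CAPTURE``; a miss simply means the decoded frame corresponds to an ``OUTPUT`` buffer no longer tracked.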
+
+The client must not assume any direct relationship between ``CAPTURE``
+and ``OUTPUT`` buffers and any specific timing of buffers becoming
+available to dequeue. Specifically:
+
+* a buffer queued to ``OUTPUT`` may result in no buffers being produced
+  on ``CAPTURE`` (e.g. if it does not contain encoded data, or if only
+  metadata syntax structures are present in it),
+
+* a buffer queued to ``OUTPUT`` may result in more than one buffer
+  produced on ``CAPTURE`` (if the encoded data contained more than one
+  frame, or if returning a decoded frame allowed the driver to return a
+  frame that preceded it in decode order, but succeeded it in display
+  order),
+
+* a buffer queued to ``OUTPUT`` may result in a buffer being produced on
+  ``CAPTURE`` later in the decode process, and/or after processing further
+  ``OUTPUT`` buffers, or be returned out of order, e.g. if display
+  reordering is used,
+
+* buffers may become available on the ``CAPTURE`` queue without additional
+  buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because
+  ``OUTPUT`` buffers queued in the past may produce decoded frames only at
+  a later time, due to specifics of the decoding process.
+
+Seek
+====
+
+Seek is controlled by the ``OUTPUT`` queue, as it is the source of
+bitstream data. The ``CAPTURE`` queue remains unchanged/unaffected.
+
+1. Stop the ``OUTPUT`` queue to begin the seek sequence via
+   :c:func:`VIDIOC_STREAMOFF`.
+
+   * **Required fields:**
+
+     ``type``
+        a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
+
+   * The driver must drop all the pending ``OUTPUT`` buffers and they are
+     treated as returned to the client (following standard semantics).
+
+2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`.
+
+   * **Required fields:**
+
+     ``type``
+        a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
+
+   * After the seek, the driver must be ready to accept new source
+     bitstream buffers.
+
+3.
Start queuing buffers to the ``OUTPUT`` queue containing stream data
+   after the seek until a suitable resume point is found.
+
+   .. note::
+
+      There is no requirement to begin queuing stream data starting
+      exactly from a resume point (e.g. SPS or a keyframe). The driver
+      must handle any data queued and must keep processing the queued
+      buffers until it finds a suitable resume point. While looking for a
+      resume point, the driver processes ``OUTPUT`` buffers and returns
+      them to the client without producing any decoded frames.
+
+      For hardware known to be mishandling seeks to a non-resume point,
+      e.g. by returning corrupted decoded frames, the driver must be able
+      to handle such seeks without a crash or any fatal decode error.
+
+4. After a resume point is found, the driver will start returning
+   ``CAPTURE`` buffers with decoded frames.
+
+   * There is no precise specification for the ``CAPTURE`` queue of when
+     it will start producing buffers containing decoded data from buffers
+     queued after the seek, as it operates independently
+     from the ``OUTPUT`` queue.
+
+   * The driver may return a number of remaining ``CAPTURE`` buffers
+     containing decoded frames from before the seek after the seek
+     sequence (STREAMOFF-STREAMON) is performed.
+
+   * The driver may also not return all decoded frames for buffers
+     queued, but not decoded, before the seek sequence was initiated. For
+     example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B),
+     STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the
+     following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’,
+     H’}, {A’, G’, H’}, {G’, H’}.
+
+   .. note::
+
+      To achieve instantaneous seek, the client may restart streaming on
+      the ``CAPTURE`` queue to discard decoded, but not yet dequeued
+      buffers.
+
+Pause
+=====
+
+In order to pause, the client should just cease queuing buffers onto the
+``OUTPUT`` queue.
This is different from the general V4L2 API definition of
+pause, which involves calling :c:func:`VIDIOC_STREAMOFF` on the queue.
+Without source bitstream data, there is no data to process and the
+hardware remains idle.
+
+Conversely, using :c:func:`VIDIOC_STREAMOFF` on the ``OUTPUT`` queue
+indicates a seek, which
+
+1. drops all ``OUTPUT`` buffers in flight and
+2. after a subsequent :c:func:`VIDIOC_STREAMON`, will look for and only
+   continue from a resume point.
+
+This is usually undesirable for pause. The STREAMOFF-STREAMON sequence is
+intended for seeking.
+
+Similarly, the ``CAPTURE`` queue should remain streaming, as the
+STREAMOFF-STREAMON sequence on it is intended solely for changing buffer
+sets.
+
+Dynamic resolution change
+=========================
+
+A video decoder implementing this interface must support dynamic
+resolution change for streams that include resolution metadata in the
+bitstream. When the decoder encounters a resolution change in the stream,
+the dynamic resolution change sequence is started.
+
+1. After encountering a resolution change in the stream, the driver must
+   first process and decode all remaining buffers from before the
+   resolution change point.
+
+2. After all buffers containing decoded frames from before the resolution
+   change point are ready to be dequeued on the ``CAPTURE`` queue, the
+   driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change
+   type ``V4L2_EVENT_SRC_CH_RESOLUTION``.
+
+   * The last buffer from before the change must be marked with the
+     :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in
+     the drain sequence. The last buffer might be empty (with
+     :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by the
+     client, since it does not contain any decoded frame.
+
+   * Any client query issued after the driver queues the event must return
+     values applying to the stream after the resolution change, including
+     queue formats, selection rectangles and controls.
+
+   * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
+     the event is signaled, the decoding process will not continue until
+     it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
+     or via :c:func:`VIDIOC_DECODER_CMD` with the ``V4L2_DEC_CMD_START``
+     command.
+
+   .. note::
+
+      Any attempts to dequeue more buffers beyond the buffer marked
+      with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
+      :c:func:`VIDIOC_DQBUF`.
+
+3. The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the new
+   format information. This is identical to calling :c:func:`VIDIOC_G_FMT`
+   after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in the initialization sequence
+   and should be handled similarly.
+
+   .. note::
+
+      It is allowed for the driver not to support the same pixel format as
+      previously used (before the resolution change) for the new
+      resolution. The driver must select a default supported pixel format,
+      return it, if queried using :c:func:`VIDIOC_G_FMT`, and the client
+      must take note of it.
+
+4. The client acquires visible resolution as in the initialization
+   sequence.
+
+5. *[optional]* The client is allowed to enumerate available formats and
+   select a different one than currently chosen (returned via
+   :c:func:`VIDIOC_G_FMT`). This is identical to a corresponding step in
+   the initialization sequence.
+
+6. *[optional]* The client acquires the minimum number of buffers as in
+   the initialization sequence.
+
+7. If all the following conditions are met, the client may resume the
+   decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with the
+   ``V4L2_DEC_CMD_START`` command, as in the case of resuming after the
+   drain sequence:
+
+   * ``sizeimage`` of the new format is less than or equal to the size of
+     currently allocated buffers,
+
+   * the number of buffers currently allocated is greater than or equal to
+     the minimum number of buffers acquired in step 6.
+
+   In such case, the remaining steps do not apply.
+ + However, if the client intends to change the buffer set, to lower + memory usage or for any other reasons, it may be achieved by following + the steps below. + +8. After dequeuing all remaining buffers from the ``CAPTURE`` queue, the + client must call :c:func:`VIDIOC_STREAMOFF` on the ``CAPTURE`` queue. + The ``OUTPUT`` queue must remain streaming (calling STREAMOFF on it + would trigger a seek). + +9. The client frees the buffers on the ``CAPTURE`` queue using + :c:func:`VIDIOC_REQBUFS`. + + * **Required fields:** + + ``count`` + set to 0 + + ``type`` + a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE`` + + ``memory`` + follows standard semantics + +10. The client allocates a new set of buffers for the ``CAPTURE`` queue via + :c:func:`VIDIOC_REQBUFS`. This is identical to a corresponding step in + the initialization sequence. + +11. The client resumes decoding by issuing :c:func:`VIDIOC_STREAMON` on the + ``CAPTURE`` queue. + +During the resolution change sequence, the ``OUTPUT`` queue must remain +streaming. Calling :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue would +initiate a seek. + +The ``OUTPUT`` queue operates separately from the ``CAPTURE`` queue for the +duration of the entire resolution change sequence. It is allowed (and +recommended for best performance and simplicity) for the client to keep +queuing/dequeuing buffers from/to ``OUTPUT`` queue even while processing +this sequence. + +.. note:: + + It is also possible for this sequence to be triggered without a change + in coded resolution, if a different number of ``CAPTURE`` buffers is + required in order to continue decoding the stream or the visible + resolution changes. + +Drain +===== + +To ensure that all queued ``OUTPUT`` buffers have been processed and +related ``CAPTURE`` buffers output to the client, the following drain +sequence may be followed. 
After the drain sequence is complete, the client +has received all decoded frames for all ``OUTPUT`` buffers queued before +the sequence was started. + +1. Begin drain by issuing :c:func:`VIDIOC_DECODER_CMD`. + + * **Required fields:** + + ``cmd`` + set to ``V4L2_DEC_CMD_STOP`` + + ``flags`` + set to 0 + + ``pts`` + set to 0 + +2. The driver must process and decode as normal all ``OUTPUT`` buffers + queued by the client before the :c:func:`VIDIOC_DECODER_CMD` was issued. + Any operations triggered as a result of processing these buffers + (including the initialization and resolution change sequences) must be + processed as normal by both the driver and the client before proceeding + with the drain sequence. + +3. Once all ``OUTPUT`` buffers queued before ``V4L2_DEC_CMD_STOP`` are + processed: + + * If the ``CAPTURE`` queue is streaming, once all decoded frames (if + any) are ready to be dequeued on the ``CAPTURE`` queue, the driver + must send a ``V4L2_EVENT_EOS``. The driver must also set + ``V4L2_BUF_FLAG_LAST`` in :c:type:`v4l2_buffer` ``flags`` field on the + buffer on the ``CAPTURE`` queue containing the last frame (if any) + produced as a result of processing the ``OUTPUT`` buffers queued + before ``V4L2_DEC_CMD_STOP``. If no more frames are left to be + returned at the point of handling ``V4L2_DEC_CMD_STOP``, the driver + must return an empty buffer (with :c:type:`v4l2_buffer` + ``bytesused`` = 0) as the last buffer with ``V4L2_BUF_FLAG_LAST`` set + instead. Any attempts to dequeue more buffers beyond the buffer marked + with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from + :c:func:`VIDIOC_DQBUF`. + + * If the ``CAPTURE`` queue is NOT streaming, no action is necessary for + ``CAPTURE`` queue and the driver must send a ``V4L2_EVENT_EOS`` + immediately after all ``OUTPUT`` buffers in question have been + processed. + +4. 
At this point, decoding is paused and the driver will accept, but not + process any newly queued ``OUTPUT`` buffers until the client issues + ``V4L2_DEC_CMD_START`` or restarts streaming on any queue. + +* Once the drain sequence is initiated, the client needs to drive it to + completion, as described by the above steps, unless it aborts the process + by issuing :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue. The client + is not allowed to issue ``V4L2_DEC_CMD_START`` or ``V4L2_DEC_CMD_STOP`` + again while the drain sequence is in progress and they will fail with + -EBUSY error code if attempted. + +* Restarting streaming on ``OUTPUT`` queue will implicitly end the paused + state and reinitialize the decoder (similarly to the seek sequence). + Restarting ``CAPTURE`` queue will not affect an in-progress drain + sequence. + +* The drivers must also implement :c:func:`VIDIOC_TRY_DECODER_CMD`, as a + way to let the client query the availability of decoder commands. + +End of stream +============= + +If the decoder encounters an end of stream marking in the stream, the +driver must send a ``V4L2_EVENT_EOS`` event to the client after all frames +are decoded and ready to be dequeued on the ``CAPTURE`` queue, with the +:c:type:`v4l2_buffer` ``flags`` set to ``V4L2_BUF_FLAG_LAST``. This +behavior is identical to the drain sequence triggered by the client via +``V4L2_DEC_CMD_STOP``. + +Commit points +============= + +Setting formats and allocating buffers triggers changes in the behavior +of the driver. + +1. Setting format on ``OUTPUT`` queue may change the set of formats + supported/advertised on the ``CAPTURE`` queue. In particular, it also + means that ``CAPTURE`` format may be reset and the client must not + rely on the previously set format being preserved. + +2. Enumerating formats on ``CAPTURE`` queue must only return formats + supported for the ``OUTPUT`` format currently set. + +3. 
Setting/changing format on ``CAPTURE`` queue does not change formats + available on ``OUTPUT`` queue. An attempt to set ``CAPTURE`` format that + is not supported for the currently selected ``OUTPUT`` format must + result in the driver adjusting the requested format to an acceptable + one. + +4. Enumerating formats on ``OUTPUT`` queue always returns the full set of + supported coded formats, irrespective of the current ``CAPTURE`` + format. + +5. After allocating buffers on the ``OUTPUT`` queue, it is not possible to + change format on it. + +To summarize, setting formats and allocation must always start with the +``OUTPUT`` queue and the ``OUTPUT`` queue is the master that governs the +set of supported formats for the ``CAPTURE`` queue. diff --git a/Documentation/media/uapi/v4l/devices.rst b/Documentation/media/uapi/v4l/devices.rst index fb7f8c26cf09..12d43fe711cf 100644 --- a/Documentation/media/uapi/v4l/devices.rst +++ b/Documentation/media/uapi/v4l/devices.rst @@ -15,6 +15,7 @@ Interfaces dev-output dev-osd dev-codec + dev-decoder dev-effect dev-raw-vbi dev-sliced-vbi diff --git a/Documentation/media/uapi/v4l/v4l2.rst b/Documentation/media/uapi/v4l/v4l2.rst index b89e5621ae69..65dc096199ad 100644 --- a/Documentation/media/uapi/v4l/v4l2.rst +++ b/Documentation/media/uapi/v4l/v4l2.rst @@ -53,6 +53,10 @@ Authors, in alphabetical order: - Original author of the V4L2 API and documentation. +- Figa, Tomasz <tfiga@chromium.org> + + - Documented the memory-to-memory decoder interface. + - H Schimek, Michael <mschimek@gmx.at> - Original author of the V4L2 API and documentation. @@ -61,6 +65,10 @@ Authors, in alphabetical order: - Documented the Digital Video timings API. +- Osciak, Pawel <posciak@chromium.org> + + - Documented the memory-to-memory decoder interface. + - Osciak, Pawel <pawel@osciak.com> - Designed and documented the multi-planar API. 
@@ -85,7 +93,7 @@ Authors, in alphabetical order: - Designed and documented the VIDIOC_LOG_STATUS ioctl, the extended control ioctls, major parts of the sliced VBI API, the MPEG encoder and decoder APIs and the DV Timings API. -**Copyright** |copy| 1999-2016: Bill Dirks, Michael H. Schimek, Hans Verkuil, Martin Rubli, Andy Walls, Muralidharan Karicheri, Mauro Carvalho Chehab, Pawel Osciak, Sakari Ailus & Antti Palosaari. +**Copyright** |copy| 1999-2018: Bill Dirks, Michael H. Schimek, Hans Verkuil, Martin Rubli, Andy Walls, Muralidharan Karicheri, Mauro Carvalho Chehab, Pawel Osciak, Sakari Ailus & Antti Palosaari, Tomasz Figa Except when explicitly stated as GPL, programming examples within this part can be used and distributed without restrictions.