Re: [PATCH V2] [media] v4l2: Add AV1 pixel format

From: Hsia-Jun Li
Date: Wed Dec 07 2022 - 02:18:41 EST




On 12/7/22 02:03, Nicolas Dufresne wrote:


Le mardi 29 novembre 2022 à 18:32 +0800, Hsia-Jun Li a écrit :
Hello

I think we need to add an extra event for VP9 and AV1, which support
frame scaling: the frame width and height can differ from those of the
previous frame or the reference frames.

That is more likely to happen with VP9, since VP9 has no sequence
header.

The solution is unlikely to take the form of an event, but yes, to complete VP9
support (and improve AV1 support) a mechanism needs to be designed and specified
to handle inter-frame resolution changes.

I say "improve AV1" because the VP9 bitstream does not signal SVC spatial
streams (the most common use of inter-frame resolution changes). With SVC
streams, the smaller images are always decode-only (never displayed). This can
be at least partially supported as long as the maximum image dimensions are
signalled by the bitstream. This is the case for AV1, but not VP9.

Stateless decoders are not affected, because userspace is aware of frames being
decoded but not displayed. It is also aware that these frames are reference
frames. On a stateful decoder, however, userspace usually does not have this
knowledge. I think one way to solve this would be for drivers to be able to
mark a buffer done with a flag telling userspace that it is not to be displayed.
For the SVC case, the dimensions and stride are irrelevant.

For true inter-frame resolution changes, as VP9 supports (though rarely used),
more API is needed. It was suggested to extend CREATE_BUFS, which allows
allocation with a different FMT, with a DELETE_BUFS ioctl, so that userspace
can smoothly handle the allocation transition.
That would only solve the problem of never-displayed graphics buffers, such as the golden frame or the alternate reference frame.

As for the timestamp tracking problem in V4L2, maybe we could start a new thread or move the discussion to GStreamer:
https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/1619

My idea here is an event attached to a buffer, or simply using the new request support on the CAPTURE side. I know you worry about V4L2 events: they are out of band, and more events could lead to the problem we suffered from with OpenMAX. But if we can guarantee an ordering between events and buffers, it won't be a problem.
For VP9 it might also be required to support super-frames. VP9 super-frames
are the ancestor of the AV1 TU, and only the last frame of a super-frame is
ever displayed. A newly introduced AV1 format might also require complete TUs
rather than frames; this needs strict documentation.
I don't think the temporal unit is a good idea here.
Most hardware can only decode one frame at a time, or even less, such as a single tile (like a slice in the ITU codecs).

Considering the MPEG-TS case,
https://aomediacodec.github.io/av1-mpeg2-ts/
A Decodable Frame Group could be a better idea here; a Temporal Unit would lead to larger delay.

Decoding frames would mean that undisplayed frames and frames of different
sizes get delivered, and we don't have a method to communicate those frame
dimensions and strides at the moment.

Nicolas




On 9/12/22 23:45, Nicolas Dufresne wrote:
Hi Shi,

thanks for the patches, check inline for some comments. Generally speaking, we
don't usually add formats ahead of time unless we have a good rationale to do
so. Should we expect a companion series against the amlogic decoder driver
that enables this?

Le mardi 30 août 2022 à 09:40 +0800, Shi Hao a écrit :
From: "hao.shi" <hao.shi@xxxxxxxxxxx>

Add AV1 compressed pixel format. It is a commonly used format.

Signed-off-by: Hao Shi <hao.shi@xxxxxxxxxxx>
---
.../userspace-api/media/v4l/pixfmt-compressed.rst | 9 +++++++++
drivers/media/v4l2-core/v4l2-ioctl.c | 1 +
include/uapi/linux/videodev2.h | 1 +
3 files changed, 11 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst b/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst
index 506dd3c98884..5bdeeebdf9f5 100644
--- a/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst
+++ b/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst
@@ -232,6 +232,15 @@ Compressed Formats
Metadata associated with the frame to decode is required to be passed
through the ``V4L2_CID_STATELESS_FWHT_PARAMS`` control.
See the :ref:`associated Codec Control ID <codec-stateless-fwht>`.
+ * .. _V4L2-PIX-FMT-AV1:
+
+ - ``V4L2_PIX_FMT_AV1``
+ - 'AV1'
+ - AV1 Access Unit. The decoder expects one Access Unit per buffer.

I believe this is using MPEG LA terminology. Did you mean a Temporal Unit (TU)?
In AV1 a TU represents one displayable picture, just like an AU in H.264 (if
you ignore interlaced video).
I think it should be a complete tile group OBU. From the spec, we have
the term 'frame'.

Currently, AV1 doesn't support interlacing.

+ The encoder generates one Access Unit per buffer. This format is
+ adapted for stateful video decoders. AV1 (AOMedia Video 1) is an
+ open video coding format. It was developed as a successor to VP9
+ by the Alliance for Open Media (AOMedia).

.. raw:: latex

diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
index c314025d977e..fc0f43228546 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -1497,6 +1497,7 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
+ case V4L2_PIX_FMT_AV1: descr = "AV1"; break;
default:
if (fmt->description[0])
return;
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 01e630f2ec78..c5ea9f38d807 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -738,6 +738,7 @@ struct v4l2_pix_format {
#define V4L2_PIX_FMT_FWHT_STATELESS v4l2_fourcc('S', 'F', 'W', 'H') /* Stateless FWHT (vicodec) */
#define V4L2_PIX_FMT_H264_SLICE v4l2_fourcc('S', '2', '6', '4') /* H264 parsed slices */
#define V4L2_PIX_FMT_HEVC_SLICE v4l2_fourcc('S', '2', '6', '5') /* HEVC parsed slices */
+#define V4L2_PIX_FMT_AV1 v4l2_fourcc('A', 'V', '1', '0') /* AV1 */

/* Vendor-specific formats */
#define V4L2_PIX_FMT_CPIA1 v4l2_fourcc('C', 'P', 'I', 'A') /* cpia1 YUV */

base-commit: 568035b01cfb107af8d2e4bd2fb9aea22cf5b868





--
Hsia-Jun(Randy) Li