Re: [RFC v2 6/8] media: uapi: Remove bit_size field from v4l2_ctrl_hevc_slice_params

From: John Cox
Date: Tue Feb 15 2022 - 11:01:20 EST


On Tue, 15 Feb 2022 10:28:55 -0500, you wrote:

>Le mardi 15 février 2022 à 14:50 +0000, John Cox a écrit :
>> On Tue, 15 Feb 2022 15:35:12 +0100, you wrote:
>>
>> >
>> > Le 15/02/2022 à 15:17, John Cox a écrit :
>> > > Hi
>> > >
>> > > > The bit size of the slice could be deduced from the buffer payload
>> > > > so remove bit_size field to avoid duplicated the information.
>> > > I think this is a bad idea. In the future we are (I hope) going to want
>> > > to have an array (variable) of slice headers all referring to the same
>> > > bit buffer. When we do that we will need this field.
>> >
>> > I wonder if that could be considering like another decode mode and so
>> > use an other control ?
>>
>> I, personally, would be in favour of making the slice header control a
>> variable array just as it is. If userland can't cope with multiple
>> entries then just send them one at a time and the code looks exactly
>> like it does at the moment and if the driver can't then set max array
>> entries to 1.
>>
>> Having implemented this in rpi port of ffmpeg and the RPi V4L2 driver I
>> can say with experience that the code and effort overhead is very low.
>>
>> Either way having a multiple slice header control in the UAPI is
>> important for efficiency.
>
>Just to clarify the idea, we would have a single slice controls, always dynamic:
>
>1. For sliced based decoder
>
>The dynamic array slice control is implemented by the driver and its size must
>be 1.

Yes

>2. For frame based decoder that don't care for slices
>
>The dynamic array slice controls is not implement. Userland detects that at
>runtime, similar to the VP9 compressed headers.

If the driver parses all the slice header then that seems plausible

>3. For frame based decoders that needs slices (or driver that supports offset
>and can gain performance with such mode)
>
>The dynamic array slice controls is implemented, and should contain all the
>slices found in the OUTPUT buffer.
>
>So the reason for this bit_size (not sure why its bits though, perhaps someone
>can educate me ?)

RPi doesn't need bits and would be happy with bytes however
slice_segment data isn't byte aligned at the end so its possible that
there might be decoders out there that want an accurate length for that.

> Would be to let the driver offset inside the the single
>OUTPUT/bitstream buffer in case this is not automatically found by the driver
>(or that no start-code is needed). Is that last bit correct ? If so, should we
>change it to an offset rather then a size ? Shall we allow using offesets inside
>larger buffer (e.g. to avoid some memory copies) for the Sliced Base cases ?

I use (in the current structure) data_bit_offset to find the start of
each slice's slice_segment_data within the OUTPUT buffer and bit_size to
find the end. RPi doesn't / can't parse the slice_header and so wants
all of that. Decoders that do parse the header might plausably want
header offsets too and it would facilitate zero copy of the bit buffer.


>> Regards
>>
>> John Cox
>>
>> > > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@xxxxxxxxxxxxx>
>> > > > ---
>> > > > .../userspace-api/media/v4l/ext-ctrls-codec.rst | 3 ---
>> > > > drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 11 ++++-------
>> > > > include/uapi/linux/v4l2-controls.h | 3 +--
>> > > > 3 files changed, 5 insertions(+), 12 deletions(-)
>> > > >
>> > > > diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> > > > index 3296ac3b9fca..c3ae97657fa7 100644
>> > > > --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> > > > +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> > > > @@ -2965,9 +2965,6 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>> > > > :stub-columns: 0
>> > > > :widths: 1 1 2
>> > > >
>> > > > - * - __u32
>> > > > - - ``bit_size``
>> > > > - - Size (in bits) of the current slice data.
>> > > > * - __u32
>> > > > - ``data_bit_offset``
>> > > > - Offset (in bits) to the video data in the current slice data.
>> > > > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
>> > > > index 8ab2d9c6f048..db8c7475eeb8 100644
>> > > > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
>> > > > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
>> > > > @@ -312,8 +312,8 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
>> > > > const struct v4l2_hevc_pred_weight_table *pred_weight_table;
>> > > > unsigned int width_in_ctb_luma, ctb_size_luma;
>> > > > unsigned int log2_max_luma_coding_block_size;
>> > > > + size_t slice_bytes;
>> > > > dma_addr_t src_buf_addr;
>> > > > - dma_addr_t src_buf_end_addr;
>> > > > u32 chroma_log2_weight_denom;
>> > > > u32 output_pic_list_index;
>> > > > u32 pic_order_cnt[2];
>> > > > @@ -370,8 +370,8 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
>> > > >
>> > > > cedrus_write(dev, VE_DEC_H265_BITS_OFFSET, 0);
>> > > >
>> > > > - reg = slice_params->bit_size;
>> > > > - cedrus_write(dev, VE_DEC_H265_BITS_LEN, reg);
>> > > > + slice_bytes = vb2_get_plane_payload(&run->src->vb2_buf, 0);
>> > > > + cedrus_write(dev, VE_DEC_H265_BITS_LEN, slice_bytes);
>> > > I think one of these must be wrong. bit_size is in bits,
>> > > vb2_get_plane_payload is in bytes?
>> >
>> > You are right it should be vb2_get_plane_payload() * 8 to get the size in bits.
>> >
>> > I will change that in v3.
>> >
>> > >
>> > > Regards
>> > >
>> > > John Cox
>> > >
>> > > > /* Source beginning and end addresses. */
>> > > >
>> > > > @@ -384,10 +384,7 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
>> > > >
>> > > > cedrus_write(dev, VE_DEC_H265_BITS_ADDR, reg);
>> > > >
>> > > > - src_buf_end_addr = src_buf_addr +
>> > > > - DIV_ROUND_UP(slice_params->bit_size, 8);
>> > > > -
>> > > > - reg = VE_DEC_H265_BITS_END_ADDR_BASE(src_buf_end_addr);
>> > > > + reg = VE_DEC_H265_BITS_END_ADDR_BASE(src_buf_addr + slice_bytes);
>> > > > cedrus_write(dev, VE_DEC_H265_BITS_END_ADDR, reg);
>> > > >
>> > > > /* Coding tree block address */
>> > > > diff --git a/include/uapi/linux/v4l2-controls.h b/include/uapi/linux/v4l2-controls.h
>> > > > index b1a3dc05f02f..27f5d272dc43 100644
>> > > > --- a/include/uapi/linux/v4l2-controls.h
>> > > > +++ b/include/uapi/linux/v4l2-controls.h
>> > > > @@ -2457,7 +2457,6 @@ struct v4l2_hevc_pred_weight_table {
>> > > > #define V4L2_HEVC_SLICE_PARAMS_FLAG_DEPENDENT_SLICE_SEGMENT (1ULL << 9)
>> > > >
>> > > > struct v4l2_ctrl_hevc_slice_params {
>> > > > - __u32 bit_size;
>> > > > __u32 data_bit_offset;
>> > > >
>> > > > /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
>> > > > @@ -2484,7 +2483,7 @@ struct v4l2_ctrl_hevc_slice_params {
>> > > > /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
>> > > > __u8 pic_struct;
>> > > >
>> > > > - __u8 reserved;
>> > > > + __u8 reserved[5];
>> > > >
>> > > > /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>> > > > __u32 slice_segment_addr;