Re: [RFC PATCH 0/1] Add LTR controls

From: Hans Verkuil
Date: Fri Jun 12 2020 - 05:12:06 EST


Hi Dikshita, Nicolas,

On 11/06/2020 16:22, Nicolas Dufresne wrote:
> Le jeudi 11 juin 2020 Ã 15:55 +0530, Dikshita Agarwal a Ãcrit :
>> LTR (Long Term Reference) frames are the frames that are encoded sometime in the past
>> and stored in the DPB buffer list to be used as reference to encode future frames.
>> One usage of LTR encoding is to reduce error propagation for video transmission
>> in packet lossy networks. For example, encoder may want to specify some key frames as
>> LTR pictures and use them as reference frames for encoding. With extra protection
>> selectively on these LTR frames or synchronization with the receiver of reception of
>> the LTR frames during transmission, decoder can receive reference frames more reliably
>> than other non-reference frames. As a result, transmission error can be effectively
>> restricted within certain frames rather than propagated to future frames.
>>
>> We are introducing below V4l2 Controls for this feature
>> 1. V4L2_CID_MPEG_VIDEO_LTRCOUNT
>> a. This is used to query or configure the number of LTR frames.
>> This is a static control and is controlled by the client.
>> b. The LTR index varies from 0 to the max LTR-1.
>> c. If LTR Count is more than max supported LTR count (max LTR) by driver, it will be rejected.
>> d. Auto Marking : If LTR count is non zero,
>> 1) first LTR count frames would be mark as LTR automatically after
>> every IDR frame (inclusive).
>> 2) For multilayer encoding: first LTR count base layer reference frames starting after
>> every IDR frame (inclusive) in encoding order would be marked as LTR frames by the encoder.
>> 3) Auto marking of LTR due to IDR should consider following conditions:
>> 1. The frame is not already set to be marked as LTR.
>> 2. The frame is part of the base layer in the hierarchical layer case.
>> 3. The number of frames currently marked as LTR is less than the maximum LTR frame index plus 1.
>> e. Encoder needs to handle explicit Mark/Use command when encoder is still doing "auto" marking

I don't follow this, quite possibly due to lack of experience with encoders.

I kind of would expect to see two modes: either automatic where encoders can
mark up to LTR_COUNT frames as long term reference, and userspace just sets
LTR_COUNT and doesn't have to do anything else.

Or it is manual mode where userspace explicitly marks long term reference
frames.

>From the proposal above it looks like you can mix auto and manual modes.

BTW, how do you 'unmark' long term reference frames?

This feature is for stateful encoders, right?

>
> Perhaps we are missing a LONG_TERM_REFERENCE_MODE ? I bet some encoder
> can select by themself long term references and even some encoders may
> not let the user decide.
>
> (not huge han of LTR acronyme, but that could be fine too, assuming you
> add more _).
>
>>
>> 2. V4L2_CID_MPEG_VIDEO_MARKLTRFRAME :
>> a. This signals to mark the current frame as LTR frame. It is a dynamic control and also provide the LTR index to be used.
>> b. the LTR index provided by this control should never exceed the max LTR-1. Else it will be rejected.
>
> The "current" frame seems a bit loose. Perhaps you wanted to use buffer
> flags ? A bit like what we have to signal TOP/BOTTOM fields in
> alternate interlacing.

I was thinking the same thing. Using a control for this doesn't seem right.

>
>>
>> 3. V4L2_CID_MPEG_VIDEO_USELTRFRAME :
>> a. This specifies the LTR frame(s) to be used for encoding the current frame. This is a dynamic control.
>> b. LTR Use Bitmap : this consists of bits [0, 15]. A total of N LSB bits of this field are valid,
>> where N is the maximum number of LTRs supported. All the other bits are invalid and should be rejected.
>> The LSB corresponds to the LTR index 0. Bit N-1 from the LSB corresponds to the LTR index max LTR-1.

How would userspace know this? Esp. with auto marking since userspace would have
to predict how auto marking works (if I understand this correctly).

For which HW encoder is this meant?

>
> Note, I haven't captured very well the userspace control flow, perhaps
> this could be enhanced through writing some documentation.
>
> As per all other generic encoder controls, we need to make sure it will
> be usable and flexible enough for multiple HW blocks, as it can be
> tedious to extend later otherwise. It is important that along with this
> RFC you provide some comparisons with with other HW / SW APIs in order
> to help justify the design decisions. I also think there should be
> link made V4L2_CID_MPEG_VIDEO_GOP_* , number of B-Frames etc.

I agree with Nicolas.

Regards,

Hans

>
> regards,
> Nicolas
>
>>
>> Dikshita Agarwal (1):
>> media: v4l2-ctrls: add control for ltr
>>
>> drivers/media/v4l2-core/v4l2-ctrls.c | 6 ++++++
>> include/uapi/linux/v4l2-controls.h | 4 ++++
>> 2 files changed, 10 insertions(+)
>>
>