Re: [RFC PATCH v6 09/11] media: uapi: Add audio rate controls support

From: Hans Verkuil
Date: Wed Oct 18 2023 - 09:10:02 EST


On 18/10/2023 14:52, Shengjiu Wang wrote:
> On Wed, Oct 18, 2023 at 3:58 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
>>
>> On 18/10/2023 09:40, Shengjiu Wang wrote:
>>> On Wed, Oct 18, 2023 at 3:31 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
>>>>
>>>> On 18/10/2023 09:23, Shengjiu Wang wrote:
>>>>> On Wed, Oct 18, 2023 at 10:27 AM Shengjiu Wang <shengjiu.wang@xxxxxxxxx> wrote:
>>>>>>
>>>>>> On Tue, Oct 17, 2023 at 9:37 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On 17/10/2023 15:11, Shengjiu Wang wrote:
>>>>>>>> On Mon, Oct 16, 2023 at 9:16 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>> Hi Shengjiu,
>>>>>>>>>
>>>>>>>>> On 13/10/2023 10:31, Shengjiu Wang wrote:
>>>>>>>>>> Fixed point controls are used by the user to configure
>>>>>>>>>> the audio sample rate to driver.
>>>>>>>>>>
>>>>>>>>>> Add V4L2_CID_ASRC_SOURCE_RATE and V4L2_CID_ASRC_DEST_RATE
>>>>>>>>>> new IDs for ASRC rate control.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Shengjiu Wang <shengjiu.wang@xxxxxxx>
>>>>>>>>>> ---
>>>>>>>>>> .../userspace-api/media/v4l/common.rst | 1 +
>>>>>>>>>> .../media/v4l/ext-ctrls-fixed-point.rst | 36 +++++++++++++++++++
>>>>>>>>>> .../media/v4l/vidioc-g-ext-ctrls.rst | 4 +++
>>>>>>>>>> .../media/v4l/vidioc-queryctrl.rst | 7 ++++
>>>>>>>>>> .../media/videodev2.h.rst.exceptions | 1 +
>>>>>>>>>> drivers/media/v4l2-core/v4l2-ctrls-core.c | 5 +++
>>>>>>>>>> drivers/media/v4l2-core/v4l2-ctrls-defs.c | 4 +++
>>>>>>>>>> include/media/v4l2-ctrls.h | 2 ++
>>>>>>>>>> include/uapi/linux/v4l2-controls.h | 13 +++++++
>>>>>>>>>> include/uapi/linux/videodev2.h | 3 ++
>>>>>>>>>> 10 files changed, 76 insertions(+)
>>>>>>>>>> create mode 100644 Documentation/userspace-api/media/v4l/ext-ctrls-fixed-point.rst
>>>>>>>>>>
>>>>>>>>>> diff --git a/Documentation/userspace-api/media/v4l/common.rst b/Documentation/userspace-api/media/v4l/common.rst
>>>>>>>>>> index ea0435182e44..35707edffb13 100644
>>>>>>>>>> --- a/Documentation/userspace-api/media/v4l/common.rst
>>>>>>>>>> +++ b/Documentation/userspace-api/media/v4l/common.rst
>>>>>>>>>> @@ -52,6 +52,7 @@ applicable to all devices.
>>>>>>>>>> ext-ctrls-fm-rx
>>>>>>>>>> ext-ctrls-detect
>>>>>>>>>> ext-ctrls-colorimetry
>>>>>>>>>> + ext-ctrls-fixed-point
>>>>>>>>>
>>>>>>>>> Rename this to ext-ctrls-audio-m2m.
>>>>>>>>>
>>>>>>>>>> fourcc
>>>>>>>>>> format
>>>>>>>>>> planar-apis
>>>>>>>>>> diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-fixed-point.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-fixed-point.rst
>>>>>>>>>> new file mode 100644
>>>>>>>>>> index 000000000000..2ef6e250580c
>>>>>>>>>> --- /dev/null
>>>>>>>>>> +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-fixed-point.rst
>>>>>>>>>> @@ -0,0 +1,36 @@
>>>>>>>>>> +.. SPDX-License-Identifier: GFDL-1.1-no-invariants-or-later
>>>>>>>>>> +
>>>>>>>>>> +.. _fixed-point-controls:
>>>>>>>>>> +
>>>>>>>>>> +***************************
>>>>>>>>>> +Fixed Point Control Reference
>>>>>>>>>
>>>>>>>>> This is for audio controls. "Fixed Point" is just the type, and it doesn't make
>>>>>>>>> sense to group fixed point controls. But it does make sense to group the audio
>>>>>>>>> controls.
>>>>>>>>>
>>>>>>>>> V4L2 controls can be grouped into classes. Basically it is a way to put controls
>>>>>>>>> into categories, and for each category there is also a control that gives a
>>>>>>>>> description of the class (see 2.15.15 in
>>>>>>>>> https://linuxtv.org/downloads/v4l-dvb-apis-new/driver-api/v4l2-controls.html#introduction)
>>>>>>>>>
>>>>>>>>> If you use e.g. 'v4l2-ctl -l' to list all the controls, then you will see that
>>>>>>>>> they are grouped based on what class of control they are.
>>>>>>>>>
>>>>>>>>> So I think it would be a good idea to create a new control class for M2M audio controls,
>>>>>>>>> instead of just adding them to the catch-all 'User Controls' class.
>>>>>>>>>
>>>>>>>>> Search e.g. for V4L2_CTRL_CLASS_COLORIMETRY and V4L2_CID_COLORIMETRY_CLASS to see how
>>>>>>>>> it is done.
>>>>>>>>>
>>>>>>>>> M2M_AUDIO would probably be a good name for the class.
>>>>>>>>>
>>>>>>>>>> +***************************
>>>>>>>>>> +
>>>>>>>>>> +These controls are intended to support an asynchronous sample
>>>>>>>>>> +rate converter.
>>>>>>>>>
>>>>>>>>> Add ' (ASRC).' at the end to indicate the common abbreviation for
>>>>>>>>> that.
>>>>>>>>>
>>>>>>>>>> +
>>>>>>>>>> +.. _v4l2-audio-asrc:
>>>>>>>>>> +
>>>>>>>>>> +``V4L2_CID_ASRC_SOURCE_RATE``
>>>>>>>>>> + sets the resampler source rate.
>>>>>>>>>> +
>>>>>>>>>> +``V4L2_CID_ASRC_DEST_RATE``
>>>>>>>>>> + sets the resampler destination rate.
>>>>>>>>>
>>>>>>>>> Document the unit (Hz) for these two controls.
>>>>>>>>>
>>>>>>>>>> +
>>>>>>>>>> +.. c:type:: v4l2_ctrl_fixed_point
>>>>>>>>>> +
>>>>>>>>>> +.. cssclass:: longtable
>>>>>>>>>> +
>>>>>>>>>> +.. tabularcolumns:: |p{1.5cm}|p{5.8cm}|p{10.0cm}|
>>>>>>>>>> +
>>>>>>>>>> +.. flat-table:: struct v4l2_ctrl_fixed_point
>>>>>>>>>> + :header-rows: 0
>>>>>>>>>> + :stub-columns: 0
>>>>>>>>>> + :widths: 1 1 2
>>>>>>>>>> +
>>>>>>>>>> + * - __u32
>>>>>>>>>
>>>>>>>>> Hmm, shouldn't this be __s32?
>>>>>>>>>
>>>>>>>>>> + - ``integer``
>>>>>>>>>> + - integer part of fixed point value.
>>>>>>>>>> + * - __s32
>>>>>>>>>
>>>>>>>>> and this __u32?
>>>>>>>>>
>>>>>>>>> You want to be able to use this generic type as a signed value.
>>>>>>>>>
>>>>>>>>>> + - ``fractional``
>>>>>>>>>> + - fractional part of fixed point value, which is Q31.
>>>>>>>>>> diff --git a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
>>>>>>>>>> index f9f73530a6be..1811dabf5c74 100644
>>>>>>>>>> --- a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
>>>>>>>>>> +++ b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
>>>>>>>>>> @@ -295,6 +295,10 @@ still cause this situation.
>>>>>>>>>> - ``p_av1_film_grain``
>>>>>>>>>> - A pointer to a struct :c:type:`v4l2_ctrl_av1_film_grain`. Valid if this control is
>>>>>>>>>> of type ``V4L2_CTRL_TYPE_AV1_FILM_GRAIN``.
>>>>>>>>>> + * - struct :c:type:`v4l2_ctrl_fixed_point` *
>>>>>>>>>> + - ``p_fixed_point``
>>>>>>>>>> + - A pointer to a struct :c:type:`v4l2_ctrl_fixed_point`. Valid if this control is
>>>>>>>>>> + of type ``V4L2_CTRL_TYPE_FIXED_POINT``.
>>>>>>>>>> * - void *
>>>>>>>>>> - ``ptr``
>>>>>>>>>> - A pointer to a compound type which can be an N-dimensional array
>>>>>>>>>> diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
>>>>>>>>>> index 4d38acafe8e1..9285f4f39eed 100644
>>>>>>>>>> --- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
>>>>>>>>>> +++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
>>>>>>>>>> @@ -549,6 +549,13 @@ See also the examples in :ref:`control`.
>>>>>>>>>> - n/a
>>>>>>>>>> - A struct :c:type:`v4l2_ctrl_av1_film_grain`, containing AV1 Film Grain
>>>>>>>>>> parameters for stateless video decoders.
>>>>>>>>>> + * - ``V4L2_CTRL_TYPE_FIXED_POINT``
>>>>>>>>>> + - n/a
>>>>>>>>>> + - n/a
>>>>>>>>>> + - n/a
>>>>>>>>>> + - A struct :c:type:`v4l2_ctrl_fixed_point`, containing parameter which has
>>>>>>>>>> + integer part and fractional part, i.e. audio sample rate.
>>>>>>>>>> +
>>>>>>>>>>
>>>>>>>>>> .. raw:: latex
>>>>>>>>>>
>>>>>>>>>> diff --git a/Documentation/userspace-api/media/videodev2.h.rst.exceptions b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
>>>>>>>>>> index e61152bb80d1..2faa5a2015eb 100644
>>>>>>>>>> --- a/Documentation/userspace-api/media/videodev2.h.rst.exceptions
>>>>>>>>>> +++ b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
>>>>>>>>>> @@ -167,6 +167,7 @@ replace symbol V4L2_CTRL_TYPE_AV1_SEQUENCE :c:type:`v4l2_ctrl_type`
>>>>>>>>>> replace symbol V4L2_CTRL_TYPE_AV1_TILE_GROUP_ENTRY :c:type:`v4l2_ctrl_type`
>>>>>>>>>> replace symbol V4L2_CTRL_TYPE_AV1_FRAME :c:type:`v4l2_ctrl_type`
>>>>>>>>>> replace symbol V4L2_CTRL_TYPE_AV1_FILM_GRAIN :c:type:`v4l2_ctrl_type`
>>>>>>>>>> +replace symbol V4L2_CTRL_TYPE_FIXED_POINT :c:type:`v4l2_ctrl_type`
>>>>>>>>>>
>>>>>>>>>> # V4L2 capability defines
>>>>>>>>>> replace define V4L2_CAP_VIDEO_CAPTURE device-capabilities
>>>>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-ctrls-core.c b/drivers/media/v4l2-core/v4l2-ctrls-core.c
>>>>>>>>>> index a662fb60f73f..7a616ac91059 100644
>>>>>>>>>> --- a/drivers/media/v4l2-core/v4l2-ctrls-core.c
>>>>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-ctrls-core.c
>>>>>>>>>> @@ -1168,6 +1168,8 @@ static int std_validate_compound(const struct v4l2_ctrl *ctrl, u32 idx,
>>>>>>>>>> if (!area->width || !area->height)
>>>>>>>>>> return -EINVAL;
>>>>>>>>>> break;
>>>>>>>>>> + case V4L2_CTRL_TYPE_FIXED_POINT:
>>>>>>>>>> + break;
>>>>>>>>>
>>>>>>>>> Hmm, this would need this patch 'v4l2-ctrls: add support for V4L2_CTRL_WHICH_MIN/MAX_VAL':
>>>>>>>>>
>>>>>>>>> https://patchwork.linuxtv.org/project/linux-media/patch/20231010022136.1504015-7-yunkec@xxxxxxxxxx/
>>>>>>>>>
>>>>>>>>> since min and max values are perfectly fine for a fixed point value.
>>>>>>>>>
>>>>>>>>> Even a step value (currently not supported in that patch) would make sense.
>>>>>>>>>
>>>>>>>>> But I wonder if we couldn't simplify this: instead of creating a v4l2_ctrl_fixed_point,
>>>>>>>>> why not represent the fixed point value as a Q31.32? Then the standard
>>>>>>>>> minimum/maximum/step values can be used, and it acts like a regular V4L2_CTRL_TYPE_INTEGER64.
>>>>>>>>>
>>>>>>>>> Except that both userspace and drivers need to multiply it with 2^-32 to get the actual
>>>>>>>>> value.
>>>>>>>>>
>>>>>>>>> So in enum v4l2_ctrl_type add:
>>>>>>>>>
>>>>>>>>> V4L2_CTRL_TYPE_FIXED_POINT = 10,
>>>>>>>>>
>>>>>>>>> (10, because it is no longer a compound type).
>>>>>>>>
>>>>>>>> It seems we don't need V4L2_CTRL_TYPE_FIXED_POINT; can we just use V4L2_CTRL_TYPE_INTEGER64?
>>>>>>>>
>>>>>>>> The reason I use the 'integer' and 'fractional' is that I want
>>>>>>>> 'integer' to be the normal sample
>>>>>>>> rate, for example 48kHz. The 'fractional' is the difference with
>>>>>>>> normal sample rate.
>>>>>>>>
>>>>>>>> For example, if the rate is 47998.12345, then integer = 48000 and fractional = -1.87655.
>>>>>>>>
>>>>>>>> So if we use s64 for the rate, then the driver needs to convert the rate to
>>>>>>>> the closest normal
>>>>>>>> sample rate + fractional.
>>>>>>>
>>>>>>> That wasn't what the documentation said :-)
>>>>>>>
>>>>>>> So this is really two controls: one for the 'normal sample rate' (whatever 'normal'
>>>>>>> means in this context) and the offset to the actual sample rate.
>>>>>>>
>>>>>>> Presumably the 'normal' sample rate is set once, while the offset changes
>>>>>>> regularly.
>>>>>>>
>>>>>>> But why do you need the 'normal' sample rate? With audio resampling I assume
>>>>>>> you resample from one rate to another, so why do you need a third 'normal'
>>>>>>> rate?
>>>>>>>
>>>>>>
>>>>>> 'Normal' rate is used to select the prefilter table.
>>>>>>
>>>>>
>>>>> Currently I think we may define
>>>>> V4L2_CID_M2M_AUDIO_SOURCE_RATE
>>>>> V4L2_CID_M2M_AUDIO_DEST_RATE
>>>>
>>>> That makes sense.
>>>>
>>>>> V4L2_CID_M2M_AUDIO_ASRC_RATIO_MOD
>>>>
>>>> OK, can you document this control? Just write it down in the reply, I just want
>>>> to understand how the integer value you set here is used.
>>>>
>>>
>>> It is a Q31 value. It is equal to:
>>> in_rate_new / out_rate_new - in_rate_old / out_rate_old
>>
>> So that's not an integer. Also, Q31 is limited to -1...1, and I think
>> that's too limiting.
>>
>> For this having a Q31.32 fixed point type still makes a lot of sense.
>>
>> I still feel this is an overly complicated API.
>>
>> See more below...
>>
>>>
>>> Best regards
>>> Wang shengjiu
>>>
>>>> Regards,
>>>>
>>>> Hans
>>>>
>>>>>
>>>>> All of them can be V4L2_CTRL_TYPE_INTEGER.
>>>>>
>>>>> RATIO_MOD was defined in the very first version.
>>>>> I think it is better to let users calculate this value.
>>>>>
>>>>> The reason is:
>>>>> if we define the offsets for the source rate and the dest rate
>>>>> separately in the driver, then when the source rate offset is set,
>>>>> the driver doesn't know whether it needs to wait for the dest rate
>>>>> offset before calculating the ratio_mod.
>>
>> Ah, in order to update the ratio mod userspace needs to set both source and
>> dest rate at the same time to avoid race conditions.
>>
>> That is perfectly possible in the V4L2 control framework. See:
>>
>> https://linuxtv.org/downloads/v4l-dvb-apis-new/driver-api/v4l2-controls.html#control-clusters
>>
>> In practice, isn't it likely that you would fix either the source or
>> destination rate, and let the other rate fluctuate? It kind of feels weird
>> to me that both source AND destination rates can fluctuate over time.
>>
> Right, the source and dest rates needn't change at the same time.
>
>> In any case, with a control cluster it doesn't really matter, you can set
>> one rate or both rates, and it will be handled atomically.
>>
>> I feel that the RATIO_MOD control is too hardware specific. This is something
>> that should be hidden in the driver.
>>
>
> I will use:
>
> V4L2_CID_M2M_AUDIO_SOURCE_RATE
> V4L2_CID_M2M_AUDIO_DEST_RATE
> V4L2_CID_M2M_AUDIO_SOURCE_RATE_OFFSET
> V4L2_CID_M2M_AUDIO_DEST_RATE_OFFSET
>
> 'OFFSET' is V4L2_CTRL_TYPE_FIXED_POINT, which is Q31.32.

So now I come back to my original question: why do you need both
the rate and the offset? Isn't it enough to set just the rates,
as long as that is in fixed point format?
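
To make "fixed point format" concrete, here is a minimal userspace sketch of
how a rate in Hz could be packed into a signed 64-bit Q31.32 value and
unpacked again. The helper names are hypothetical and not part of any V4L2
header, and the use of double is an assumption that only holds in userspace
(a driver would stay with integer arithmetic).

```c
#include <stdint.h>

/* 2^32: the scale factor between a Q31.32 value and the real value. */
#define Q31_32_ONE 4294967296.0

/*
 * Hypothetical helper: convert a rate in Hz to Q31.32, rounding to the
 * nearest representable value. Userspace only, since it uses double.
 */
static int64_t rate_to_q31_32(double hz)
{
	double scaled = hz * Q31_32_ONE;

	return (int64_t)(scaled < 0 ? scaled - 0.5 : scaled + 0.5);
}

/* Hypothetical helper: convert a Q31.32 value back to a rate in Hz. */
static double q31_32_to_rate(int64_t q)
{
	return (double)q / Q31_32_ONE;
}
```

With this encoding 47998.12345 Hz is a single s64 whose upper 32 bits hold
47998, so the standard minimum/maximum/step handling of a 64-bit integer
control would apply to it unchanged.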

Why does the driver need both the 'ideal' rate + the offset?

I'm not opposed to this, I'm just trying to understand whether this
makes sense.

Can't you take e.g. the source and dest rate as starting points
when you start streaming? And every time userspace updates one or both
of these rates you calculate the ratio_mod compared to the previous rates?
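
Roughly what I have in mind, as a sketch only: the struct and function names
are made up for illustration, both rates are assumed to be positive Q31.32
values, and the 128-bit intermediate relies on the GCC/Clang __int128
extension (a real driver would use whatever 64-bit multiply/divide helpers
are appropriate).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-context state; not taken from any existing driver. */
struct asrc_rates {
	int64_t src_q32;	/* current source rate, Q31.32 Hz */
	int64_t dst_q32;	/* current destination rate, Q31.32 Hz */
};

/* Return src / dst as a Q31.32 value. Assumes dst is non-zero. */
static int64_t ratio_q32(int64_t src_q32, int64_t dst_q32)
{
	assert(dst_q32 != 0);
	/* Scale the numerator by 2^32 so the quotient is again Q31.32. */
	return (int64_t)(((__int128)src_q32 * ((__int128)1 << 32)) / dst_q32);
}

/*
 * Store the new rates and return the change in ratio relative to the
 * previous ones: in_new / out_new - in_old / out_old, as Q31.32.
 */
static int64_t update_rates(struct asrc_rates *cur,
			    int64_t new_src_q32, int64_t new_dst_q32)
{
	int64_t old_ratio = ratio_q32(cur->src_q32, cur->dst_q32);
	int64_t new_ratio = ratio_q32(new_src_q32, new_dst_q32);

	cur->src_q32 = new_src_q32;
	cur->dst_q32 = new_dst_q32;
	return new_ratio - old_ratio;
}
```

The rates in effect when streaming starts would seed struct asrc_rates, and
each later control update would go through update_rates(), so userspace never
has to compute a ratio_mod itself.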

Or is there a reason why you need the ideal rates as well? E.g. 48000 or
44100, etc.

Regards,

Hans