Re: [PATCH 1/8] soundwire: fix enumeration completion

From: Pierre-Louis Bossart
Date: Wed Jul 05 2023 - 10:45:19 EST




On 7/5/23 16:30, Johan Hovold wrote:
> On Wed, Jul 05, 2023 at 02:53:17PM +0200, Pierre-Louis Bossart wrote:
>> On 7/5/23 14:30, Johan Hovold wrote:
>>> The soundwire subsystem uses two completion structures that allow
>>> drivers to wait for soundwire device to become enumerated on the bus and
>>> initialised by their drivers, respectively.
>>>
>>> The code implementing the signalling is currently broken as it does not
>>> signal all current and future waiters and also uses the wrong
>>> reinitialisation function, which can potentially lead to memory
>>> corruption if there are still waiters on the queue.
>>
>> That change sounds good, but I am not following the two paragraphs below.
>>
>>> Not signalling future waiters specifically breaks sound card probe
>>> deferrals as codec drivers can not tell that the soundwire device is
>>> already attached when being reprobed.
>>
>> What makes you say that? There is a test in the probe and the codec
>> driver will absolutely be notified, see bus_type.c
>>
>> if (drv->ops && drv->ops->update_status) {
>> ret = drv->ops->update_status(slave, slave->status);
>> if (ret < 0)
>> dev_warn(dev, "%s: update_status failed with status %d\n", __func__,
>> ret);
>> }
>
> I'm talking about signalling the codec driver using the soundwire device
> via the completion structs. Unless the underling device is detached and
> reattached, trying to wait for completion a second time will currently
> timeout instead of returning immediately.
>
> This affects codecs like rt5682, which wait for completion in component
> probe (see rt5682_probe()).
>
>>> Some codec runtime PM
>>> implementations suffer from similar problems as waiting for enumeration
>>> during resume can also timeout despite the device already having been
>>> enumerated.
>>
>> I am not following this either. Are you saying the wait_for_completion()
>> times out because of the init_completion/reinit_completion confusion, or
>> something else.
>
> It times out because the completion counter is not saturated unless you
> use complete_all().
>
> Drivers that wait unconditionally in resume, will time out the second
> time they are runtime resumed unless the underlying device has been
> detached and reattached in the meantime (e.g. wsa881x_runtime_resume()).

Makes sense. The default on Intel platforms is to reset the bus in all
resume cases, that forces the attachment so we never saw the issue.

For this patch:

Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@xxxxxxxxxxxxxxx>