Re: [PATCH v2 0/4] soundwire: qcom: stablity fixes

From: Johan Hovold
Date: Thu Jun 08 2023 - 08:40:07 EST


On Thu, Jun 08, 2023 at 12:11:45PM +0200, Johan Hovold wrote:
> On Wed, Jun 07, 2023 at 10:36:40AM +0100, Srinivas Kandagatla wrote:

> No, not yet, but I just triggered the above once more after not having
> seen it with my latest -rc5 branch for a while (e.g. 20 reboots?):
>
> [ 11.430131] qcom-soundwire 3210000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
> [ 11.431741] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:4 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw])
> [ 11.431933] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:3 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw])
> [ 11.435406] qcom-soundwire 3330000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
> [ 11.449286] qcom-soundwire 3250000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
> [ 11.450632] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2
> [ 11.453155] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2
> [ 11.456511] wsa883x-codec sdw:0:0217:0202:00:2: WSA883X Version 1_1, Variant: WSA8835_V2
> [ 11.562623] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23
> [ 11.585766] snd-sc8280xp sound: ASoC: adding FE link failed
> [ 11.585872] snd-sc8280xp sound: ASoC: topology: could not load header: -517
> [ 11.586021] qcom-apm gprsvc:service:2:1: tplg component load failed-517
> [ 11.586100] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22
> [ 11.586530] snd-sc8280xp sound: ASoC: failed to instantiate card -22
> [ 11.591831] snd-sc8280xp: probe of sound failed with error -22
>
> I don't think I've ever seen it before dropping the runtime PM patch as
> you did in v2, and I hit it twice fairly quickly after dropping it. And
> now again.
>
> I'm not saying that the runtime PM patch is necessarily correct, and
> perhaps it is just changes in timing that lead to the above, but we
> definitely have a bug here.

I searched my notes and realised that I have also seen this once with
the runtime PM patch applied. So the fact that I happened to see it more
often after dropping it is likely due to changes in timing.

Looking at the above log it seems like we hit a probe deferral somewhere
(the -517, i.e. -EPROBE_DEFER, from the topology load) as some resource
is not available yet, and this is eventually turned into a hard failure
(the -22, i.e. -EINVAL, from snd_soc_component_probe) that breaks audio
as the error is propagated up the stack.
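
To illustrate the kind of pattern I suspect, here is a rough userspace
sketch. It is not the actual ASoC or topology code: all demo_* names are
made up for illustration, and EPROBE_DEFER is defined locally because it
is a kernel-internal errno (517) that userspace headers do not provide.
It only shows how collapsing every failure into -EINVAL can hide a
deferral from a caller that would otherwise have retried.

```c
#include <errno.h>

/* Kernel-internal errno, not in userspace <errno.h>; defined here for the sketch. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical resource lookup: not ready on the first attempt, ready afterwards. */
static int demo_get_resource(int attempt)
{
	return attempt == 0 ? -EPROBE_DEFER : 0;
}

/* Lossy layer: any failure, including a deferral, is reported as -EINVAL. */
static int demo_load_lossy(int attempt)
{
	int ret = demo_get_resource(attempt);

	if (ret < 0)
		return -EINVAL;
	return 0;
}

/* Preserving layer: the original error code is passed up unchanged. */
static int demo_load_preserving(int attempt)
{
	return demo_get_resource(attempt);
}

/*
 * Hypothetical stand-in for a core that retries deferred probes:
 * keep calling load() while it asks for deferral, stop on anything else.
 */
static int demo_probe(int (*load)(int), int max_attempts)
{
	int ret = -EPROBE_DEFER;
	int attempt;

	for (attempt = 0; attempt < max_attempts && ret == -EPROBE_DEFER; attempt++)
		ret = load(attempt);

	return ret;
}
```

With these definitions, demo_probe(demo_load_lossy, 3) gives up with
-EINVAL on the first attempt, while demo_probe(demo_load_preserving, 3)
sees the deferral, retries, and succeeds on the second attempt.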

Johan