Re: [PATCH] arm64: dts: qcom: sdm845-db845c: Move LVS regulator nodes up

From: Krzysztof Kozlowski
Date: Thu Jun 15 2023 - 11:04:01 EST


On 15/06/2023 15:47, Amit Pundir wrote:
> On Thu, 15 Jun 2023 at 00:38, Amit Pundir <amit.pundir@xxxxxxxxxx> wrote:
>>
>> On Thu, 15 Jun 2023 at 00:17, Krzysztof Kozlowski
>> <krzysztof.kozlowski@xxxxxxxxxx> wrote:
>>>
>>> On 14/06/2023 20:18, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> On 02.06.23 18:12, Amit Pundir wrote:
>>>>> Move lvs1 and lvs2 regulator nodes up in the rpmh-regulators
>>>>> list to work around a boot regression uncovered by the upstream
>>>>> commit ad44ac082fdf ("regulator: qcom-rpmh: Revert "regulator:
>>>>> qcom-rpmh: Use PROBE_FORCE_SYNCHRONOUS"").
>>>>>
>>>>> Without this fix, DB845c fails to boot at times because one of the
>>>>> lvs1 or lvs2 regulators fails to turn ON in time.
>>>>
>>>> /me waves friendly
>>>>
>>>> FWIW, as it's not obvious: this...
>>>>
>>>>> Link: https://lore.kernel.org/all/CAMi1Hd1avQDcDQf137m2auz2znov4XL8YGrLZsw5edb-NtRJRw@xxxxxxxxxxxxxx/
>>>>
>>>> ...is a report about a regression. One that we could still solve before
>>>> 6.4 is out. One I'll likely point Linus to, unless a fix comes into
>>>> sight.
>>>>
>>>> When I noticed the reluctant replies to this patch I earlier today asked
>>>> in the thread with the report what the plan forward was:
>>>> https://lore.kernel.org/all/CAD%3DFV%3DV-h4EUKHCM9UivsFHRsJPY5sAiwXV3a1hUX9DUMkkxdg@xxxxxxxxxxxxxx/
>>>>
>>>> Doug there replied:
>>>>
>>>> ```
>>>> Of the two proposals made (the revert vs. the reordering of the dts),
>>>> the reordering of the dts seems better. It only affects the one buggy
>>>> board (rather than preventing us from moving to async probe for everyone)
>>>> and it also has a chance of actually fixing something (changing the
>>>> order that regulators probe in rpmh-regulator might legitimately work
>>>> around the problem). That being said, just like the revert the dts
>>>> reordering is still just papering over the problem and is fragile /
>>>> not guaranteed to work forever.
>>>> ```
>>>>
>>>> Papering over obviously is not good, but does anyone have a better idea
>>>> to fix this? Or is "not fixing" for some reason a viable option here?
>>>>
>>>
>>> I understand there is a regression, although the kernel is not mainline
>>> (hash df7443a96851 is unknown) and the only solutions so far paper over
>>> the problem. Reverting the commit is a temporary workaround. Moving nodes
>>> in DTS is not acceptable because it hides the actual problem and only
>>> solves this one particular observed symptom, while the actual issue is
>>> still there. It would be nice to be able to reproduce it on real mainline
>>> with a normal operating system (not AOSP), with or without a ramdisk,
>>> whatever. So far no one did it, right?
>>
>> No, I did not try a non-AOSP system yet. I'll try it tomorrow, if that
>> helps. With mainline hash.
>
> Hi, here is the crash report on db845c running vanilla v6.4-rc6 with a
> debian build https://bugs.linaro.org/attachment.cgi?id=1142
>
> And fwiw here is the db845c crash log with AOSP running vanilla
> v6.4-rc6 https://bugs.linaro.org/attachment.cgi?id=1141
>
> Regards,
> Amit Pundir
>
> PS: rootfs in this bug report doesn't matter much because I'm loading
> all the kernel modules from a ramdisk and in the case of a crash the
> UFS doesn't probe anyway.

I just tried current linux-next with defconfig (I could not find your
config here, in your previous mail thread, or in Bugzilla), and also
with REGULATOR_QCOM_RPMH built as a module.

I also tried v6.4-rc6, again with defconfig, with REGULATOR_QCOM_RPMH
both at its default setting and built as a module.

All the cases work on my RB3 - no warnings reported.

If you do not use defconfig, then please mention the differences from it
in every report (preferred) or at least attach your config.
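
As a rough illustration of what "mention the differences" could look
like, here is a minimal sketch that compares two kernel config files and
lists every symbol whose value differs. The helper names (parse_config,
diff_configs) and both config fragments are hypothetical and made up for
this example; they are not taken from any real defconfig or from the
reporter's setup. The in-tree scripts/diffconfig helper does a similar
job on real .config files.

```python
def parse_config(text):
    """Map each CONFIG_* symbol to its value ('y', 'm', 'n', or a literal)."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_FOO is not set".
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            opts[line[2:-len(" is not set")]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            opts[name] = value
    return opts


def diff_configs(base, mine):
    """Return sorted (symbol, base_value, my_value) for symbols that differ."""
    changes = []
    for name in sorted(set(base) | set(mine)):
        old, new = base.get(name), mine.get(name)
        if old != new:
            changes.append((name, old, new))
    return changes


# Hypothetical fragments, for illustration only.
DEFCONFIG = """\
CONFIG_REGULATOR_QCOM_RPMH=y
# CONFIG_EXAMPLE_OPTION is not set
"""
MINE = """\
CONFIG_REGULATOR_QCOM_RPMH=m
CONFIG_EXAMPLE_OPTION=y
"""

for name, old, new in diff_configs(parse_config(DEFCONFIG), parse_config(MINE)):
    print(f"{name}: {old} -> {new}")
```

A short list like this in the report is usually enough to tell whether
the tested configuration matches the one that fails.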



Best regards,
Krzysztof