Re: [PATCH] arm64: dts: qcom: sdm845-db845c: Move LVS regulator nodes up

From: Amit Pundir
Date: Thu Jun 15 2023 - 12:16:24 EST


On Thu, 15 Jun 2023 at 21:39, Amit Pundir <amit.pundir@xxxxxxxxxx> wrote:
>
> On Thu, 15 Jun 2023 at 20:33, Krzysztof Kozlowski
> <krzysztof.kozlowski@xxxxxxxxxx> wrote:
> >
> > On 15/06/2023 15:47, Amit Pundir wrote:
> > > On Thu, 15 Jun 2023 at 00:38, Amit Pundir <amit.pundir@xxxxxxxxxx> wrote:
> > >>
> > >> On Thu, 15 Jun 2023 at 00:17, Krzysztof Kozlowski
> > >> <krzysztof.kozlowski@xxxxxxxxxx> wrote:
> > >>>
> > >>> On 14/06/2023 20:18, Linux regression tracking (Thorsten Leemhuis) wrote:
> > >>>> On 02.06.23 18:12, Amit Pundir wrote:
> > >>>>> Move lvs1 and lvs2 regulator nodes up in the rpmh-regulators
> > >>>>> list to work around a boot regression uncovered by the upstream
> > >>>>> commit ad44ac082fdf ("regulator: qcom-rpmh: Revert "regulator:
> > >>>>> qcom-rpmh: Use PROBE_FORCE_SYNCHRONOUS"").
> > >>>>>
> > >>>>> Without this fix, DB845c fails to boot at times because one of the
> > >>>>> lvs1 or lvs2 regulators fails to turn on in time.
> > >>>>
> > >>>> /me waves friendly
> > >>>>
> > >>>> FWIW, as it's not obvious: this...
> > >>>>
> > >>>>> Link: https://lore.kernel.org/all/CAMi1Hd1avQDcDQf137m2auz2znov4XL8YGrLZsw5edb-NtRJRw@xxxxxxxxxxxxxx/
> > >>>>
> > >>>> ...is a report about a regression. One that we could still solve before
> > >>>> 6.4 is out. One I'll likely point Linus to, unless a fix comes into
> > >>>> sight.
> > >>>>
> > >>>> When I noticed the reluctant replies to this patch I earlier today asked
> > >>>> in the thread with the report what the plan forward was:
> > >>>> https://lore.kernel.org/all/CAD%3DFV%3DV-h4EUKHCM9UivsFHRsJPY5sAiwXV3a1hUX9DUMkkxdg@xxxxxxxxxxxxxx/
> > >>>>
> > >>>> Doug there replied:
> > >>>>
> > >>>> ```
> > >>>> Of the two proposals made (the revert vs. the reordering of the dts),
> > >>>> the reordering of the dts seems better. It only affects the one buggy
> > >>>> board (rather than preventing us from moving to async probe for everyone)
> > >>>> and it also has a chance of actually fixing something (changing the
> > >>>> order that regulators probe in rpmh-regulator might legitimately work
> > >>>> around the problem). That being said, just like the revert the dts
> > >>>> reordering is still just papering over the problem and is fragile /
> > >>>> not guaranteed to work forever.
> > >>>> ```
> > >>>>
> > >>>> Papering over obviously is not good, but does anyone have a better idea to fix
> > >>>> this? Or is "not fixing" for some reason a viable option here?
> > >>>>
> > >>>
> > >>> I understand there is a regression, although kernel is not mainline
> > >>> (hash df7443a96851 is unknown) and the only solutions were papering the
> > >>> problem. Reverting commit is a temporary workaround. Moving nodes in DTS
> > >>> is not acceptable because it hides actual problem and only solves this
> > >>> one particular observed problem, while actual issue is still there. It
> > >>> would be nice to be able to reproduce it on real mainline with normal
> > >>> operating system (not AOSP) - with ramdisk/without/whatever. So far no
> > >>> one did it, right?
> > >>
> > >> No, I did not try non-AOSP system yet. I'll try it tomorrow, if that
> > >> helps. With mainline hash.
> > >
> > > Hi, here is the crash report on db845c running vanilla v6.4-rc6 with a
> > > debian build https://bugs.linaro.org/attachment.cgi?id=1142
> > >
> > > And fwiw here is the db845c crash log with AOSP running vanilla
> > > v6.4-rc6 https://bugs.linaro.org/attachment.cgi?id=1141
> > >
> > > Regards,
> > > Amit Pundir
> > >
> > > PS: rootfs in this bug report doesn't matter much because I'm loading
> > > all the kernel modules from a ramdisk and in the case of a crash the
> > > UFS doesn't probe anyway.
> >
> > I just tried current next with defconfig (I could not find your config,
> > neither here, nor in your previous mail thread nor in bugzilla). Also
> > with REGULATOR_QCOM_RPMH as module.
> >
> > I tried also v6.4-rc6 - also defconfig with default and module
> > REGULATOR_QCOM_RPMH.
> >
> > All the cases work on my RB3 - no warnings reported.
> >
> > If you do not use defconfig, then in all reports please mention the
> > differences (the best) or at least attach it.
>
> Argh.. Sorry about that. Big mistake on my side. I did want to
> upload my defconfig but forgot. The defconfig plays a key role because, as
> I mentioned in one of my previous emails, it is a timing/race bug, and
> if I make any significant change to my defconfig (e.g. enable ftrace,
> or as little as add a printk in the qcom_rpmh_regulator code) then I
> can't reproduce this bug. So, needless to say, I can't reproduce
> this bug with the default arm64 defconfig.
>
> Please find my custom (but upstream) defconfig here
> https://bugs.linaro.org/attachment.cgi?id=1143 and prebuilt binaries
> here https://people.linaro.org/~amit.pundir/db845c-userdebug/rpmh_bug/.
> "fastboot flash boot ./boot.img-6.4-rc6 reboot" and/or a few (<5)
> reboots should be enough to trigger the crash.
>
> I have downloaded the initrd from here
> https://snapshots.linaro.org/96boards/dragonboard845c/linaro/debian/569/initrd.img-5.15.0-qcomlt-arm64
> but edited ramdisk/init to run "load_module" function early in the
> boot and ramdisk/conf/initramfs.conf has "MODULES=list" instead of
> "MODULES=most", where all the kernel modules are listed at
> /etc/initramfs-tools/modules.

Sorry, it is ramdisk/conf/modules, not ramdisk/etc/initramfs-tools/modules.

>
> Regards,
> Amit Pundir