Re: [PATCH] Revert "arm64: dts: qcom: sm8250: Add cpuidle states"

From: Amit Pundir
Date: Tue Oct 18 2022 - 10:10:27 EST


On Tue, 18 Oct 2022 at 16:00, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>
> On Mon, 17 Oct 2022 at 22:17, Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
> >
> > On Mon, Oct 17, 2022 at 10:10:05PM +0530, Amit Pundir wrote:
> > > This reverts commit 32bc936d732171d48c9c8f96c4fa25ac3ed7e1c7.
> > >
> > > This patch was part of a patch series to add APSS RSC to
> > > Cluster power domain
> > > https://patchwork.kernel.org/project/linux-pm/cover/1641749107-31979-1-git-send-email-quic_mkshah@xxxxxxxxxxx/
> > > but the rest of the patches in this series got NACKed and didn't land.
> > >
> > > These cpuidle states made RB5 (sm8250) highly unstable and I run into
> > > following crash every now and then:
> > >
> > > [ T1] vreg_l11c_3p3: failed to enable: -ETIMEDOUT
> > > [ T1] qcom-rpmh-regulator 18200000.rsc:pm8150l-rpmh-regulators: ldo11: devm_regulator_register() failed, ret=-110
> > > [ T1] qcom-rpmh-regulator: probe of 18200000.rsc:pm8150l-rpmh-regulators failed with error -110
> > >
> > > I reported this breakage earlier this year as well:
> > > https://lore.kernel.org/all/CAMi1Hd2Sngya_2m2odkjq4fdV8OiiXsFMEX1bb807cWMC7H-sg@xxxxxxxxxxxxxx/
> > > I can confirm that if I cherry-pick the rest of the patches from the
> > > series then I can't reproduce this crash, but I'm not sure when the rest
> > > of the patches are going to land though.
>
> I have been talking to Maulik (offlist) about re-posting the series,
> but apparently she has been too busy to move this forward.
>
> I assume a better option, than reverting, is to get the above series
> merged. If I recall, there were only a few minor comments from me on
> the genpd patch [1]. That said, let me help out and refresh the
> series, I will do it asap!
>
> > >
> > > Signed-off-by: Amit Pundir <amit.pundir@xxxxxxxxxx>
> > > ---
> > > arch/arm64/boot/dts/qcom/sm8250.dtsi | 105 ---------------------------
> > > 1 file changed, 105 deletions(-)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi b/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > > index a5b62cadb129..a2c15da1a450 100644
> > > --- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > > @@ -101,8 +101,6 @@ CPU0: cpu@0 {
> > > capacity-dmips-mhz = <448>;
> > > dynamic-power-coefficient = <205>;
> > > next-level-cache = <&L2_0>;
> > > - power-domains = <&CPU_PD0>;
> > > - power-domain-names = "psci";
> > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > operating-points-v2 = <&cpu0_opp_table>;
> > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > @@ -125,8 +123,6 @@ CPU1: cpu@100 {
> > > capacity-dmips-mhz = <448>;
> > > dynamic-power-coefficient = <205>;
> > > next-level-cache = <&L2_100>;
> > > - power-domains = <&CPU_PD1>;
> > > - power-domain-names = "psci";
> > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > operating-points-v2 = <&cpu0_opp_table>;
> > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > @@ -146,8 +142,6 @@ CPU2: cpu@200 {
> > > capacity-dmips-mhz = <448>;
> > > dynamic-power-coefficient = <205>;
> > > next-level-cache = <&L2_200>;
> > > - power-domains = <&CPU_PD2>;
> > > - power-domain-names = "psci";
> > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > operating-points-v2 = <&cpu0_opp_table>;
> > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > @@ -167,8 +161,6 @@ CPU3: cpu@300 {
> > > capacity-dmips-mhz = <448>;
> > > dynamic-power-coefficient = <205>;
> > > next-level-cache = <&L2_300>;
> > > - power-domains = <&CPU_PD3>;
> > > - power-domain-names = "psci";
> > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > operating-points-v2 = <&cpu0_opp_table>;
> > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > @@ -188,8 +180,6 @@ CPU4: cpu@400 {
> > > capacity-dmips-mhz = <1024>;
> > > dynamic-power-coefficient = <379>;
> > > next-level-cache = <&L2_400>;
> > > - power-domains = <&CPU_PD4>;
> > > - power-domain-names = "psci";
> > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > operating-points-v2 = <&cpu4_opp_table>;
> > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > @@ -209,8 +199,6 @@ CPU5: cpu@500 {
> > > capacity-dmips-mhz = <1024>;
> > > dynamic-power-coefficient = <379>;
> > > next-level-cache = <&L2_500>;
> > > - power-domains = <&CPU_PD5>;
> > > - power-domain-names = "psci";
> > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > operating-points-v2 = <&cpu4_opp_table>;
> > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > @@ -231,8 +219,6 @@ CPU6: cpu@600 {
> > > capacity-dmips-mhz = <1024>;
> > > dynamic-power-coefficient = <379>;
> > > next-level-cache = <&L2_600>;
> > > - power-domains = <&CPU_PD6>;
> > > - power-domain-names = "psci";
> > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > operating-points-v2 = <&cpu4_opp_table>;
> > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > @@ -252,8 +238,6 @@ CPU7: cpu@700 {
> > > capacity-dmips-mhz = <1024>;
> > > dynamic-power-coefficient = <444>;
> > > next-level-cache = <&L2_700>;
> > > - power-domains = <&CPU_PD7>;
> > > - power-domain-names = "psci";
> > > qcom,freq-domain = <&cpufreq_hw 2>;
> > > operating-points-v2 = <&cpu7_opp_table>;
> > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > @@ -300,42 +284,6 @@ core7 {
> > > };
> > > };
> > > };
> > > -
> > > - idle-states {
> > > - entry-method = "psci";
> > > -
> > > - LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 {
> > > - compatible = "arm,idle-state";
> > > - idle-state-name = "silver-rail-power-collapse";
> > > - arm,psci-suspend-param = <0x40000004>;
> > > - entry-latency-us = <360>;
> > > - exit-latency-us = <531>;
> > > - min-residency-us = <3934>;
> > > - local-timer-stop;
> >
> > If this is temporary fix for some broke firmware or setup, I suggest to
> > just add status = "disabled" for these states. Also worth checking if keeping
> > the cpu states is okay and only cluster state is the issue or everything
> > needs to be disabled. That way it would avoid the churn when re-enabling it.
>
> That's a good option, unless we can get the other series (that fixes
> this issue) merged soon. As stated, I will help to re-spin it and then
> we can take it from there.

Hi Ulf, I just verified over multiple reboots that disabling the
cpuidle states, as suggested by Sudeep, does the trick and I no longer
see the crash.

Do you suggest we wait for the re-spin of the other series or should I
go ahead and submit that RB5 workaround for the time being?

Regards,
Amit Pundir



>
> >
> > --
> > Regards,
> > Sudeep
>
> Kind regards
> Uffe