Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices

From: Ulf Hansson
Date: Thu Oct 19 2023 - 10:13:37 EST


On Thu, 19 Oct 2023 at 15:05, Stephan Gerhold <stephan@xxxxxxxxxxx> wrote:
>
> On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
> > > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > > <stephan.gerhold@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > > The genpd core caches performance state votes from devices that are
> > > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > > runtime PM performance state handling"). They get applied once the
> > > > device becomes active again.
> > > >
> > > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > > devices that use runtime PM only to control the enable and performance
> > > > state for the attached power domain.
> > > >
> > > > However, at the moment nothing ever resumes the virtual devices created
> > > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > > means that performance state votes made during cpufreq scaling get
> > > > always cached and never applied to the hardware.
> > > >
> > > > Fix this by enabling the devices after attaching them and use
> > > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > > going to suspend. Since it supplies the CPU we can never turn it off
> > > > from Linux. There are other mechanisms to turn it off when needed,
> > > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> > >
> > > I believe we discussed using dev_pm_syscore_device() for the previous
> > > version. It's not intended to be used for things like the above.
> > >
>
> Sorry, looks like we still had a misunderstanding in the conclusion of
> the previous discussion. :')
>
> > > Moreover, I was under the impression that it wasn't really needed. In
> > > fact, I would think that this actually breaks things for system
> > > suspend/resume, as in this case the cpr driver's genpd
> > > ->power_on|off() callbacks are no longer getting called due this,
> > > which means that the cpr state machine isn't going to be restored
> > > properly. Or did I get this wrong?
> >
>
> We strictly need the RPMPDs to be always-on, also across system suspend
> [1]. The RPM firmware will drop the votes internally as soon as the
> CPU(s) have entered deep cpuidle. We can't do this from Linux, because
> we need the CPU to continue running until it was shut down cleanly.
>
> For CPR, we strictly need the backing regulator to be always-on, also
> across system suspend. Typically the hardware will turn off the
> regulator as soon as the CPU(s) enter deep cpuidle. Similarly, we can't
> do this from Linux, because we need the CPU to continue running until it
> was shut down cleanly.
>
> My understanding was that we're going to pause the CPR state machine
> using the system suspend/resume callbacks on the driver, instead of
> using the genpd->power_on|off() callbacks [2]. I can submit a separate
> patch for this.

If we are going to do 1) as described below, this looks to me that
it's going to be needed.

How will otherwise the cpr state machine be saved/restored during
system suspend/resume? Note that, beyond 1), the genpd's
->power_on|off() callbacks are no longer going to be called during
system suspend/resume.

In a way this also means that the cpr genpd provider might as well
also have GENPD_FLAG_ALWAYS_ON set for it.

>
> I didn't prioritize this because QCS404 (as the only current user of
> CPR) doesn't have proper deep cpuidle/power management set up yet. It's
> not entirely clear to me if there is any advantage (or perhaps even
> disadvantage) if we pause the CPR state machine while the shared L2
> cache is still being actively powered by the CPR power rail during
> system suspend. I suspect this is a configuration that was never
> considered in the hardware design.

I see.

>
> Given the strict requirement for the RPMPDs, I only see two options:
>
> 1. Have an always-on consumer that prevents the power domains to be
> powered off during system suspend. This is what this patch tries to
> achieve.
>
> Or:
>
> 2. Come up with a way to register the RPMPDs used by the CPU with
> GENPD_FLAG_ALWAYS_ON. This would also be doable, but isn't as
> straightfoward as "regulator-always-on" in the DT because the rpmpd
> DT node represents multiple genpds in a single DT node [3].

Yes, it sounds like it may be easier to do 1).

>
> What do you think? Do you see some other solution perhaps? I hope we can
> clear up the misunderstanding. :-)

Yes, thanks!

>
> [1]: https://lore.kernel.org/linux-arm-msm/ZQGqfMigCFZP_HLA@xxxxxxxxxxx/
> [2]: https://lore.kernel.org/linux-arm-msm/CAPDyKFoiup8KNv=1LFGKDdDLA1pHsdJUgTTWMdgxnikEmReXzg@xxxxxxxxxxxxxx/
> [3]: https://lore.kernel.org/linux-arm-msm/ZSg-XtwMxg3_fWxc@xxxxxxxxxxx/
>
> > BTW, if you really need something like the above, the proper way to do
> > it would instead be to call device_set_awake_path() for the device.
> >
> > This informs genpd that the device needs to stay powered-on during
> > system suspend (assuming that GENPD_FLAG_ACTIVE_WAKEUP has been set
> > for it), hence it will keep the corresponding PM domain powered-on
> > too.
> >
>
> Thanks, I can try if this works as alternative to the
> dev_pm_syscore_device()!

Yes, please. We don't want to abuse the dev_pm_syscore_device() thingy.

>
> I will wait for your thoughts on the above before accidentally going
> into the wrong direction again. :-)

No worries, we are moving forward. :-)

Kind regards
Uffe