Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices

From: Ulf Hansson
Date: Thu Oct 19 2023 - 11:20:36 EST


On Thu, 19 Oct 2023 at 16:49, Stephan Gerhold <stephan@xxxxxxxxxxx> wrote:
>
> On Thu, Oct 19, 2023 at 04:12:56PM +0200, Ulf Hansson wrote:
> > On Thu, 19 Oct 2023 at 15:05, Stephan Gerhold <stephan@xxxxxxxxxxx> wrote:
> > > On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > > > On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
> > > > > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > > > > <stephan.gerhold@xxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > The genpd core caches performance state votes from devices that are
> > > > > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > > > > runtime PM performance state handling"). They get applied once the
> > > > > > device becomes active again.
> > > > > >
> > > > > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > > > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > > > > devices that use runtime PM only to control the enable and performance
> > > > > > state for the attached power domain.
> > > > > >
> > > > > > However, at the moment nothing ever resumes the virtual devices created
> > > > > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > > > > means that performance state votes made during cpufreq scaling get
> > > > > > always cached and never applied to the hardware.
> > > > > >
> > > > > > Fix this by enabling the devices after attaching them and use
> > > > > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > > > > going to suspend. Since it supplies the CPU we can never turn it off
> > > > > > from Linux. There are other mechanisms to turn it off when needed,
> > > > > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> > > > >
> > > > > I believe we discussed using dev_pm_syscore_device() for the previous
> > > > > version. It's not intended to be used for things like the above.
> > > > >
> > >
> > > Sorry, looks like we still had a misunderstanding in the conclusion of
> > > the previous discussion. :')
> > >
> > > > > Moreover, I was under the impression that it wasn't really needed. In
> > > > > fact, I would think that this actually breaks things for system
> > > > > suspend/resume, as in this case the cpr driver's genpd
> > > > > ->power_on|off() callbacks are no longer getting called due this,
> > > > > which means that the cpr state machine isn't going to be restored
> > > > > properly. Or did I get this wrong?
> > > >
> > >
> > > We strictly need the RPMPDs to be always-on, also across system suspend
> > > [1]. The RPM firmware will drop the votes internally as soon as the
> > > CPU(s) have entered deep cpuidle. We can't do this from Linux, because
> > > we need the CPU to continue running until it was shut down cleanly.
> > >
> > > For CPR, we strictly need the backing regulator to be always-on, also
> > > across system suspend. Typically the hardware will turn off the
> > > regulator as soon as the CPU(s) enter deep cpuidle. Similarly, we can't
> > > do this from Linux, because we need the CPU to continue running until it
> > > was shut down cleanly.
> > >
> > > My understanding was that we're going to pause the CPR state machine
> > > using the system suspend/resume callbacks on the driver, instead of
> > > using the genpd->power_on|off() callbacks [2]. I can submit a separate
> > > patch for this.
> >
> > If we are going to do 1) as described below, this looks to me that
> > it's going to be needed.
> >
>
> Yep.
>
> > How will otherwise the cpr state machine be saved/restored during
> > system suspend/resume? Note that, beyond 1), the genpd's
> > ->power_on|off() callbacks are no longer going to be called during
> > system suspend/resume.
> >
>
> (Side note: I think "save/restore" might be the wrong words for
> suspend/resume of CPR. Looking at the code most of the configuration
> appears to be preserved across suspend/resume. Nothing is saved, it
> literally just disables the state machine during suspend and re-enables
> it during resume.
>
> I'm not entirely sure what's the reason for doing this. Perhaps the
> main goal is just to prevent the CPR state machine from getting stuck
> or sending pointless IRQs that won't be handled while Linux is
> suspended.)

If only the latter, that is a very good reason too. Drivers should
take care of their devices to make sure they are not triggering
spurious irqs during system suspend.

>
> > In a way this also means that the cpr genpd provider might as well
> > also have GENPD_FLAG_ALWAYS_ON set for it.
>
> Conceptually I would consider CPR to be a generic power domain provider
> that could supply any kind of device. I know at least of CPUs and GPUs.
> We need "always-on" only for the CPU, but not necessarily for other
> devices.
>
> For a GPU, the Linux driver (running on the CPU) can stop the GPU, wait
> for completion and then invoke the ->power_off() callback of CPR. In
> that case it is also safe to disable the backing regulator from Linux.
> (I briefly mentioned this already in the previous discussion I think.)
>
> We could set GENPD_FLAG_ALWAYS_ON for the CPR compatibles where we know
> that they are only used to supply CPUs, but if we're going to do (1)
> anyway there might not be much of an advantage for the extra complexity.

Okay, fair enough. Let's just stick with 1) and skip using
GENPD_FLAG_ALWAYS_ON bit for the cpr genpd provider.

>
> >
> > >
> > > I didn't prioritize this because QCS404 (as the only current user of
> > > CPR) doesn't have proper deep cpuidle/power management set up yet. It's
> > > not entirely clear to me if there is any advantage (or perhaps even
> > > disadvantage) if we pause the CPR state machine while the shared L2
> > > cache is still being actively powered by the CPR power rail during
> > > system suspend. I suspect this is a configuration that was never
> > > considered in the hardware design.
> >
> > I see.
> >
> > >
> > > Given the strict requirement for the RPMPDs, I only see two options:
> > >
> > > 1. Have an always-on consumer that prevents the power domains to be
> > > powered off during system suspend. This is what this patch tries to
> > > achieve.
> > >
> > > Or:
> > >
> > > 2. Come up with a way to register the RPMPDs used by the CPU with
> > > GENPD_FLAG_ALWAYS_ON. This would also be doable, but isn't as
> > > straightfoward as "regulator-always-on" in the DT because the rpmpd
> > > DT node represents multiple genpds in a single DT node [3].
> >
> > Yes, it sounds like it may be easier to do 1).
> >
> > >
> > > What do you think? Do you see some other solution perhaps? I hope we can
> > > clear up the misunderstanding. :-)
> >
> > Yes, thanks!
> >
> > >
> > > [1]: https://lore.kernel.org/linux-arm-msm/ZQGqfMigCFZP_HLA@xxxxxxxxxxx/
> > > [2]: https://lore.kernel.org/linux-arm-msm/CAPDyKFoiup8KNv=1LFGKDdDLA1pHsdJUgTTWMdgxnikEmReXzg@xxxxxxxxxxxxxx/
> > > [3]: https://lore.kernel.org/linux-arm-msm/ZSg-XtwMxg3_fWxc@xxxxxxxxxxx/
> > >
> > > > BTW, if you really need something like the above, the proper way to do
> > > > it would instead be to call device_set_awake_path() for the device.
> > > >
> > > > This informs genpd that the device needs to stay powered-on during
> > > > system suspend (assuming that GENPD_FLAG_ACTIVE_WAKEUP has been set
> > > > for it), hence it will keep the corresponding PM domain powered-on
> > > > too.
> > > >
> > >
> > > Thanks, I can try if this works as alternative to the
> > > dev_pm_syscore_device()!
> >
> > Yes, please. We don't want to abuse the dev_pm_syscore_device() thingy.
> >
>
> Could you clarify the idea behind GENPD_FLAG_ACTIVE_WAKEUP? Would I set
> it conditionally for all RPMPDs or just the ones consumed by the CPU?
> How does the genpd *provider* know if one of its *consumer* devices
> needs to have its power domain kept on for wakeup?

We are thinking of the GENPD_FLAG_ACTIVE_WAKEUP as a platform
configuration type of flag for the genpd in question. The consumer
driver shouldn't need to know about the details of what is happening
on the PM domain level - only whether it needs its device to remain
powered-on during system suspend or not.

I suspect that the GENPD_FLAG_ACTIVE_WAKEUP is probably okay to set
for most genpds, but there may be some exceptions.

Kind regards
Uffe