Re: [PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode

From: Rafael J. Wysocki
Date: Fri Oct 23 2020 - 07:58:24 EST


On Fri, Oct 23, 2020 at 8:10 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
>
> On 22-10-20, 13:57, Rafael J. Wysocki wrote:
> > Index: linux-pm/drivers/cpufreq/cpufreq.c
> > ===================================================================
> > --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> > +++ linux-pm/drivers/cpufreq/cpufreq.c
> > @@ -2182,6 +2182,9 @@ int __cpufreq_driver_target(struct cpufr
> > pr_debug("target for CPU %u: %u kHz, relation %u, requested %u kHz\n",
> > policy->cpu, target_freq, relation, old_target_freq);
> >
> > + if (cpufreq_driver->target)
> > + return cpufreq_driver->target(policy, target_freq, relation);
> > +
> > /*
> > * This might look like a redundant call as we are checking it again
> > * after finding index. But it is left intentionally for cases where
> > @@ -2194,9 +2197,6 @@ int __cpufreq_driver_target(struct cpufr
> > /* Save last value to restore later on errors */
> > policy->restore_freq = policy->cur;
> >
> > - if (cpufreq_driver->target)
> > - return cpufreq_driver->target(policy, target_freq, relation);
> > -
> > if (!cpufreq_driver->target_index)
> > return -EINVAL;
>
> From what I understood, you want your driver to get notified about
> policy->min/max changes and right now they are making it work from
> within the target() callback.

Not exactly.

The driver needs to update some internal upper and lower boundary
values along with the target freq and skipping the update when target
freq doesn't change prevents it from doing that.

The policy min and max changes are communicated to the driver via the
governor ->limits() callback and that can only call
__cpufreq_driver_target() then or defer the update to the next
->fast_switch() invocation. Either way, the driver needs to have a
chance to carry out the full update even if the target frequency
doesn't change, but the policy min or max limits may have changed.

> Your commit log talks about policy->max and powersave combination,

Yes, it does, because that's a very clearly visible symptom.

> I think the same will be true in case of policy->min and performance ?

It might in theory, but it is not in practice, because the HWP min is
set to the target freq (and the target freq is already clamped between
the policy min and max).

But generally speaking you are right, this would be a problem for any
driver having to update some internal upper and lower boundary
settings along with the target freq.

> And also with any other governor (like schedutil) if the target_freq doesn't change for a while.

Well, yes.

A change of one of the limits that doesn't cause the target to change
may be missed in general.

> And IMHO, this change is more like a band-aid which is going to remove
> the check of target != cur for all target() type drivers (which aren't
> many) and it feels like a penalty on them (which is also there for
> intel-cpufreq without hwp), and that we will get into the same problem
> for target_index() drivers as well if they want to do something
> similar in future, i.e. skip checking for same-freq.
>
> Maybe adding a new flag for the cpufreq-driver for force-updates would
> be a better solution ? Which will make this very much driver
> dependent.

Fair enough, I'll add a driver flag for that.

Also I split the patch into the core part and the driver-specific part
for clarity.

Thanks!