Re: problem in changing from active to passive mode

From: Rafael J. Wysocki
Date: Thu Oct 28 2021 - 14:17:15 EST


On Thu, Oct 28, 2021 at 7:57 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
>
> On Thu, Oct 28, 2021 at 7:29 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> >
> > On Thu, Oct 28, 2021 at 7:10 PM Julia Lawall <julia.lawall@xxxxxxxx> wrote:
> > >
> > > > Now, for your graph 3, are you saying this pseudo
> > > > code of the process is repeatable?:
> > > >
> > > > Power up the system, booting kernel 5.9
> > > > switch to passive/schedutil.
> > > > wait X minutes for system to settle
> > > > do benchmark, result ~13 seconds
> > > > re-boot to kernel 5.15-RC
> > > > switch to passive/schedutil.
> > > > wait X minutes for system to settle
> > > > do benchmark, result ~40 seconds
> > > > re-boot to kernel 5.9
> > > > switch to passive/schedutil.
> > > > wait X minutes for system to settle
> > > > do benchmark, result ~28 seconds
> > >
> > > In the first boot of 5.9, the des (desired?) field of the HWP_REQUEST
> > > register is 0 and in the second boot (after booting 5.15 and entering
> > > passive mode) it is 10. I don't know though if this is a bug or a
> > > feature...
> >
> > It looks like a bug.
> >
> > I think that the desired value is not cleared on driver exit which
> > should happen. Let me see if I can do a quick patch for that.
>
> Please check the behavior with the attached patch applied.

Well, actually, the previous patch won't do anything, because the
desired perf field is already cleared in this function before the MSR
is written, so please try the one attached to this message instead.
---
 drivers/cpufreq/intel_pstate.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -1005,9 +1005,12 @@ static void intel_pstate_hwp_offline(str
 		 */
 		value &= ~GENMASK_ULL(31, 24);
 		value |= HWP_ENERGY_PERF_PREFERENCE(cpu->epp_cached);
-		WRITE_ONCE(cpu->hwp_req_cached, value);
 	}
 
+	/* Clear the desired perf field in the cached HWP request value. */
+	value &= ~HWP_DESIRED_PERF(~0L);
+	WRITE_ONCE(cpu->hwp_req_cached, value);
+
 	value &= ~GENMASK_ULL(31, 0);
 	min_perf = HWP_LOWEST_PERF(READ_ONCE(cpu->hwp_cap_cached));
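
For anyone following along, here is a rough userspace sketch of what the
mask in the added lines does to a cached request value. It is not kernel
code. The DEMO_* macros and demo_* helpers are hypothetical names made up
for this illustration, and the field layout (min in bits 7:0, max in
15:8, desired in 23:16, EPP in 31:24) is my assumption about the
HWP_REQUEST register, so please check it against the SDM and msr-index.h
before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical local stand-ins for the kernel's HWP request helpers.
 * Assumed layout: min 7:0, max 15:8, desired 23:16, EPP 31:24.
 */
#define DEMO_HWP_MIN_PERF(x)     ((uint64_t)(x) & 0xff)
#define DEMO_HWP_MAX_PERF(x)     (((uint64_t)(x) & 0xff) << 8)
#define DEMO_HWP_DESIRED_PERF(x) (((uint64_t)(x) & 0xff) << 16)
#define DEMO_HWP_EPP(x)          (((uint64_t)(x) & 0xff) << 24)

/* Same shape as "value &= ~HWP_DESIRED_PERF(~0L)" in the patch. */
static uint64_t demo_clear_desired(uint64_t req)
{
	return req & ~DEMO_HWP_DESIRED_PERF(~0L);
}

/* Extract the desired perf field from a request value. */
static unsigned int demo_get_desired(uint64_t req)
{
	return (unsigned int)((req >> 16) & 0xff);
}

/* Build a request with desired = 10 and check that only that field changes. */
static void demo_check(void)
{
	uint64_t req = DEMO_HWP_MIN_PERF(1) | DEMO_HWP_MAX_PERF(24) |
		       DEMO_HWP_DESIRED_PERF(10) | DEMO_HWP_EPP(128);
	uint64_t cleared = demo_clear_desired(req);

	assert(demo_get_desired(req) == 10);
	assert(demo_get_desired(cleared) == 0);
	/* Putting the desired value back must restore the original request. */
	assert((cleared | DEMO_HWP_DESIRED_PERF(10)) == req);
}
```

Calling demo_check() from a small main() is enough to try it. The point
is only that the mask zeroes the desired field and leaves min, max and
EPP alone, so a nonzero desired value such as the 10 reported above
would no longer survive in the cached request.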