Re: [PATCH v2 2/2] cpufreq: intel_pstate: Conditional frequency invariant accounting

From: Rafael J. Wysocki
Date: Thu Oct 03 2019 - 14:05:46 EST


On Wednesday, October 2, 2019 2:29:26 PM CEST Giovanni Gherdovich wrote:
> From: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>
>
> intel_pstate has two operating modes: active and passive. In "active"
> mode, the in-built scaling governor is used and in "passive" mode,
> the driver can be used with any governor like "schedutil". In "active"
> mode the utilization values from schedutil are not used, and there is
> a requirement from high performance computing use cases, not to read
> any APERF/MPERF MSRs.

Well, this isn't quite convincing.

In particular, I don't see why the "don't read APERF/MPERF MSRs" argument
applies *only* to intel_pstate in the "active" mode. What about intel_pstate
in the "passive" mode combined with the "performance" governor? Or any other
governor different from "schedutil" for that matter?

And what about acpi_cpufreq combined with any governor different from
"schedutil"?

Scale invariance is not really needed in any of those cases right now AFAICS,
or is it?

So is the real concern that intel_pstate in the "active" mode reads the MPERF
and APERF MSRs by itself, which kind of duplicates what the scale invariance
code does and is therefore redundant?
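
For reference, here is a minimal sketch of the kind of arithmetic the scale
invariance code does with the APERF and MPERF deltas. It is an illustration
only, not the code from the patch: the helper name freq_scale(), the
CAPACITY_* constants and the max_ratio parameter (max frequency divided by
base frequency, in fixed point with 10 fractional bits) are assumptions made
for this example, and overflow checks are left out for brevity.

```c
#include <stdint.h>

#define CAPACITY_SHIFT 10
#define CAPACITY_SCALE (1u << CAPACITY_SHIFT)	/* 1024: running at max frequency */

/*
 * Hypothetical helper (not a kernel API): turn APERF/MPERF deltas sampled
 * over the same interval into a scale factor in the range [0, 1024].
 *
 * d_aperf / d_mperf is the average frequency relative to the base frequency.
 * Dividing by max_ratio (max_freq / base_freq, fixed point with
 * CAPACITY_SHIFT fractional bits) makes it relative to the max frequency.
 */
static uint64_t freq_scale(uint64_t d_aperf, uint64_t d_mperf,
			   uint64_t max_ratio)
{
	uint64_t scale;

	/* No usable sample: assume the CPU was running at max frequency. */
	if (d_mperf == 0 || max_ratio == 0)
		return CAPACITY_SCALE;

	/* Overflow of the shift and of the product is not checked here. */
	scale = (d_aperf << (2 * CAPACITY_SHIFT)) / (d_mperf * max_ratio);

	/* Clamp, so that the result never exceeds full capacity. */
	return scale > CAPACITY_SCALE ? CAPACITY_SCALE : scale;
}
```

With max_ratio = 1536 (max frequency 1.5 times the base frequency), equal
APERF and MPERF deltas give about two thirds of 1024, and an APERF delta 1.5
times the MPERF delta gives 1024. The point is that both counters have to be
sampled for this, which is the same data intel_pstate in the "active" mode
already reads on its own.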