RE: sched/cpufreq: Rework schedutil governor performance estimation - Regression bisected

From: Doug Smythies
Date: Sun Feb 11 2024 - 11:44:07 EST


On 2024.02.11 05:36 Vincent wrote:
> On Sat, 10 Feb 2024 at 00:16, Doug Smythies <dsmythies@xxxxxxxxx> wrote:
>> On 2024.02.09.14:11 Vincent wrote:
>>> On Fri, 9 Feb 2024 at 22:38, Doug Smythies <dsmythies@xxxxxxxxx> wrote:
>>>>
>>>> I noticed a regression in the 6.8rc series kernels. Bisecting the kernel pointed to:
>>>>
>>>> # first bad commit: [9c0b4bb7f6303c9c4e2e34984c46f5a86478f84d]
>>>> sched/cpufreq: Rework schedutil governor performance estimation
>>>>
>>>> There was previous bisection and suggestion of reversion,
>>>> but I guess it wasn't done in the end. [1]
>>>
>>> This has been fixed with
>>> https://lore.kernel.org/all/170539970061.398.16662091173685476681.tip-bot2@tip-bot2/
>>
>> Okay, thanks. I didn't find that one.
>>
>>>> The regression: reduced maximum CPU frequency is ignored.

Perhaps I should have said "sometimes ignored".
With a maximum CPU frequency for all CPUs set to 2.4 GHz and
a 100% load on CPU 5, its frequency was sampled 1000 times:
28.6% of samples were 2.4 GHz.
71.4% of samples were 4.8 GHz (the max turbo frequency)
The results are highly non-repeatable, for example another sample:
32.8% of samples were 2.4 GHz.
76.2% of samples were 4.8 GHz

Another interesting side note: If load is added to the other CPUs,
the set maximum CPU frequency is enforced.

>>
>>> This seems to be something new.
>>> schedutil doesn't impact the max_freq and it's up to cpufreq driver
>>> select the final freq which should stay within the limits
>>
>> Okay. All I know is this is the commit that caused the regression.
>
> Could you check if the fix solved your problem ?

Given the tags for that commit:

$ git tag --contains e37617c8e53a
v6.8-rc1
v6.8-rc2
v6.8-rc3

It does not solve issue I have raised herein, as it exists in v6.8-rc1 but not v6.7

>> I do not know why, but I do wonder if there could any relationship with
>> the old, never fixed, problem of incorrect stale frequencies reported
>> under the same operating conditions. See the V2 note:
>> https://lore.kernel.org/all/001d01d9d3a7$71736f50$545a4df0$@telus.net/
>
> IIUC the problem is that policy->cur is not used by intel_cpufreq and
> stays set to the last old/init value.

Yes, exactly.

> Do I get it right that this is only informative ?

I don't know, that is what I was wondering. I do not know if the two issues
are related or not.

> Normally cpufreq governor checks the new limits and updates current
> freq if necessary except when fast switch is enabled.

>> where I haven't been able to figure out a solution.

>>>> Conditions:
>>>> CPU frequency scaling driver: intel_cpufreq (a.k.a intel_pstate in passive mode)
>>>> CPU frequency scaling governor: schedutil
>>>> HWP (HardWare Pstate) control (a.k.a. Intel_speedshift): Enabled
>>>> Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
>>>>
>>>> I did not check any other conditions, i.e. HWP disabled or the acpi-cpufreq driver.

Changing from HWP enabled to HWP disabled, it works properly.

..

>>>> [1] https://lore.kernel.org/all/CAKfTPtDCQuJjpi6=zjeWPcLeP+ZY5Dw7XDrZ-LpXqEAAUbXLhA@xxxxxxxxxxxxxx/