Re: [PATCH 1/1] sched: Consider CPU contention in frequency & load-balance busiest CPU selection

From: Qais Yousef
Date: Mon May 15 2023 - 15:18:45 EST


On 05/11/23 17:25, Dietmar Eggemann wrote:
> On 04/05/2023 17:23, Qais Yousef wrote:
> > On 05/03/23 19:13, Dietmar Eggemann wrote:
> >> On 29/04/2023 16:58, Peter Zijlstra wrote:
> >>> On Thu, Apr 06, 2023 at 05:50:30PM +0200, Dietmar Eggemann wrote:
>
> [...]
>
> >>> But why, and how does it affect? That is, isn't this Changelog a wee bit
> >>> sparse?
> >>
> >> Absolutely.
> >>
> >> I have compelling test data based on JankbenchX on Pixel6 for
> >> sugov_get_util() case I will share with v2.
> >
> > I am actually still concerned about whether this is a global win. This higher
> > contention can potentially lead to higher power usage. Not every high
> > contention is worth reacting to faster. The blanket 25% headroom in
> > map_util_perf() is already
> > problematic. And Jankbench is not a true representative of a gaming workload
> > which is what started this whole discussion. It'd be good if mediatek can
> > confirm this helps their case. Or for us to find a way to run something more
> > representative. The original ask was to be selective about being more reactive
> > for specific scenarios/workloads.
>
> I contacted MTK at the beginning of March this year and specifically asked them
> to see whether this patch helps their gaming use-cases or not.
> Unfortunately I haven't heard back from them.

Hmm, I'm not sure if gfxbench would be a good benchmark to try here.

>
> I'm actually happy to have compelling Jankbench (which is _the_ UI

I'm glad you're getting these good numbers. But I am still worried this
might not be sufficient. My worry here is how this could impact thermal and
power in all other cases. You're assuming any contention is worth a boost.
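
To make that concern concrete, here is a rough userspace sketch, not the patch
itself. map_util_perf() uses the same util + 25% arithmetic as the kernel
helper mentioned above. Everything else is an assumption for illustration: I am
assuming the boost takes max(util_avg, runnable_avg) capped at CPU capacity,
and the names boosted_util() and freq_util() are made up.

```c
/* Same arithmetic as the kernel's map_util_perf(): add 25% headroom. */
static unsigned long map_util_perf(unsigned long util)
{
	return util + (util >> 2);
}

/*
 * Hypothetical helper (name made up for this sketch): assume the boosted
 * utilization is max(util_avg, runnable_avg), capped at CPU capacity.
 */
static unsigned long boosted_util(unsigned long util_avg,
				  unsigned long runnable_avg,
				  unsigned long capacity)
{
	unsigned long util = util_avg > runnable_avg ? util_avg : runnable_avg;

	return util < capacity ? util : capacity;
}

/*
 * Hypothetical helper: the value a frequency request would be based on,
 * i.e. boosted utilization plus headroom, clamped to CPU capacity.
 */
static unsigned long freq_util(unsigned long util_avg,
			       unsigned long runnable_avg,
			       unsigned long capacity)
{
	unsigned long util;

	util = map_util_perf(boosted_util(util_avg, runnable_avg, capacity));

	return util < capacity ? util : capacity;
}
```

Under these assumptions, util_avg = 200 and runnable_avg = 600 on a
capacity-1024 CPU request 750 instead of 250, so contention alone triples the
request regardless of whether the waiting tasks are latency sensitive.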

> benchmark app) numbers on a recent mobile device (Pixel6) with v5.18
> mainline based kernel including schedutil. And I'm able to remove a lot
> of extra product-oriented features, like up/down frequency transition
> rate-limits or ADPF (Android Dynamic Performance Framework) 'CPU
> performance hints' feature. Bridging product and mainline world for
> mobile isn't easy as we all know.
>
> ---
>
> Testcase is Jankbench (all subtests, 10 iterations) on Pixel6 (Android
> 12) with mainline v5.18 kernel and forward ported task scheduler
> patches, uclamp has been deactivated to disable ADPF's 'CPU performance
> hints'.
>
> Max_frame_duration:
> +-----------------+------------+
> | kernel          | value [ms] |
> +-----------------+------------+
> | base            | 163.061513 |
> | runnable        | 157.821346 |
> +-----------------+------------+
>
> Mean_frame_duration:
> +-----------------+------------+----------+
> | kernel          | value [ms] | diff [%] |
> +-----------------+------------+----------+
> | base            | 18.0       | 0.0      |
> | runnable        | 12.5       | -30.64   |
> +-----------------+------------+----------+
>
> Jank percentage (Jank deadline 16ms):
> +-----------------+------------+----------+
> | kernel          | value [%]  | diff [%] |
> +-----------------+------------+----------+
> | base            | 3.6        | 0.0      |
> | runnable        | 0.8        | -76.59   |
> +-----------------+------------+----------+
>
> Power usage [mW] (total - all CPUs):
> +-----------------+------------+----------+
> | kernel          | value [mW] | diff [%] |
> +-----------------+------------+----------+
> | base            | 129.5      | 0.0      |
> | runnable        | 129.3      | -0.15    |
> +-----------------+------------+----------+
>
> ---
>
> I assume that the MTK folks will also profit from the fact that CPU
> frequency can ramp up faster with this 'runnable boosting', especially
> when activity starts from an (almost) idle little CPU. Seeing their test
> results here would be nice though.

My worry is that this is another performance-first optimization that
disregards the potential negative impact on power and thermal.

>
> > If we can't make this selective we need more
> > data it won't hurt general power consumption. I plan to help with that, but my
> > focus now is on other areas first, namely getting uclamp_max usable in
> > production.
>
> This is the stalled discussion under
> https://lkml.kernel.org/r/20230205224318.2035646-1-qyousef@xxxxxxxxxxx I
> assume?
>
> IIRC, the open question was whether EAS CPU selection should be performed in
> case there is no spare CPU capacity (due to uclamp capping) left.
>
> [...]
>
>