Re: [RFT][PATCH v2 0/3] cpuidle: teo: Do not check timers unconditionally every time

From: Doug Smythies
Date: Wed Aug 09 2023 - 21:08:38 EST


Hi Rafael,

Please bear with me. As you know I have many tests
that search over a wide range of operating conditions
looking for areas to focus on in more detail.

On Tue, Aug 8, 2023 at 3:40 PM Doug Smythies <dsmythies@xxxxxxxxx> wrote:
> On Tue, Aug 8, 2023 at 9:43 AM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > On Tue, Aug 8, 2023 at 5:22 PM Doug Smythies <dsmythies@xxxxxxxxx> wrote:
> > > On 2023.08.03 14:33 Rafael wrote:
> > > > On Thu, Aug 3, 2023 at 11:12 PM Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> > > >>
> > > >> Hi Folks,
> > > >>
> > > >> This is the second iteration of:
> > > >>
> > > >> https://lore.kernel.org/linux-pm/4511619.LvFx2qVVIh@kreacher/
> > > >>
> > > >> with an additional patch.
> > > >>
> > > >> There are some small modifications of patch [1/3] and the new
> > > >> patch causes governor statistics to play a role in deciding whether
> > > >> or not to stop the scheduler tick.
> > > >>
> > > >> Testing would be much appreciated!
> > > >
> > > > For convenience, this series is now available in the following git branch:
> > > >
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> > > > pm-cpuidle-teo
> > >
> > > Hi Rafael,
> > >
> > > Thank you for the git branch link.
> > >
> > > I did some testing:
>
>
> ... deleted ...
>
> > > Test 2: 6 core ping pong sweep:
> > >
> > > Pass a token between 6 CPUs on 6 different cores.
> > > Do a variable amount of work at each stop.
> > >
> > > Purpose: To utilize the midrange idle states
> > > and observe the transitions between use of
> > > different idle states.
> > >
> > > Results: There is some instability in the results
> > > in the early stages.
> > > For unknown reasons, the rjw governor sometimes works
> > > slower and at lower power. The condition is not 100%
> > > repeatable.
> > >
> > > Overall teo completed the test fastest (54.9 minutes)
> > > Followed by menu (56.2 minutes), then rjw (56.7 minutes),
> > > then ladder (58.4 minutes). teo is faster throughout the
> > > latter stages of the test, but at the cost of more power.
> > > The differences seem to be in the transition from idle
> > > state 1 to idle state 2 usage.
>
> The magnitude of the differences in the later stages is significant.
>
> ... deleted ...
>
> > Thanks a lot for doing this work, much appreciated!
> >
> > > Conclusions: Overall, I am not seeing a compelling reason to
> > > proceed with this patch set.
> >
> > On the other hand, if there is a separate compelling reason to do
> > that, it doesn't appear to lead to a major regression.
>
> Agreed.
>
> Just for additional information, a 6 core dwell test was run.
> The test conditions were cherry picked for dramatic effect:
>
> teo: average: 1162.13 uSec/loop ; Std dev: 0.38
> rjw: average: 1266.45 uSec/loop ; Std dev: 6.53 ; +9%
>
> teo: average: 29.98 watts
> rjw: average: 30.30 watts
> (the same within thermal experimental error)
>
> Details (power and idle stats over the 45 minute test period):
> http://smythies.com/~doug/linux/idle/teo-util2/6-13568-147097/perf/

Okay, so while the occasional selection of a deeper idle state
might be detrimental to latency sensitive workflows such as the
one above, it is an overwhelming benefit to periodic workflows:

Test 8: low load periodic workflow.

There is an enormous range of work/sleep frequencies and loads
to pick from. There was no cherry picking for this test.

The only criterion is that the periodic fixed packet of work is
completed before the start of the next period.

Test 8 A: 1 load at about 3% and 347 Hz work/sleep frequency:
teo average processor package power: 16.38 watts
rjw average processor package power: 4.29 watts
or 73.8% improvement!!!!!

Test 8 B: 2 loads at about 3% and 347 Hz work/sleep frequency:
teo average processor package power: 18.35 watts
rjw average processor package power: 6.67 watts
or 63.7% improvement!!!!!
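
For anyone checking the arithmetic: the percentages above are the
reduction in average processor package power relative to teo.
A minimal sketch in Python follows. The function name and the
dictionary layout are only for illustration and are not part of
any test tool; the wattages are copied from the Test 8 results.

```python
def percent_reduction(baseline: float, candidate: float) -> float:
    """Return how much lower candidate is than baseline, in percent."""
    return (baseline - candidate) / baseline * 100.0


# Average processor package power in watts, copied from Test 8 A and 8 B.
results = {
    "Test 8 A": {"teo": 16.38, "rjw": 4.29},
    "Test 8 B": {"teo": 18.35, "rjw": 6.67},
}

for name, watts in results.items():
    saving = percent_reduction(watts["teo"], watts["rjw"])
    print(f"{name}: rjw uses {saving:.1f}% less power than teo")
```

This gives 73.8% for Test 8 A and 63.7% for Test 8 B, matching
the figures quoted above.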