Re: [RFC][PATCH 0/5] select_idle_sibling() wreckage

From: Mel Gorman
Date: Mon Jan 04 2021 - 10:41:43 EST


On Wed, Dec 23, 2020 at 02:23:41PM +0100, Vincent Guittot wrote:
> > Tests are still running on my side but early results shows perf
> > regression for hackbench
>
> Few more results before being off:
> On small embedded system, the problem seems to be mainly a matter of
> setting the right number of loops.
>
> On large smt system, The system on which I usually run my tests if
> off for now so i haven't been able to finalize tests yet but the
> problem might be that we don't loop all core anymore with this
> patchset compare to current algorithm
>

Tests ran over the holidays and are available at http://www.skynet.ie/~mel/postings/peterz-20210104/dashboard.html

I am trawling through the data but, by and large, the two main
observations I've had so far are:

1. The last patch seems the most problematic and the most likely to make
a large change, particularly to hackbench. For example:
http://www.skynet.ie/~mel/postings/peterz-20210104/scheduler-unbound/bing2/index.html#hackbench-thread-pipes

The idle cpu cutoff is reasonably effective even though it triggers a
lot of false positives, meaning that it may be better to treat that in
isolation.

2. The cost accounting one had variable impact. Generally it was small
gains and losses, but tbench is an exception: low client counts saw
variable impact, with some big losses, although EPYC1 is a
counter-example (toto in the dashboard).

The second issue might be responsible for the first issue, not sure.
However, it does not surprise me that proper accounting would have an
impact on the SMT depth search, and it likely needs tweaking.

--
Mel Gorman
SUSE Labs