Re: [PATCH 2/2 v2] sched: use load_avg for selecting idlest group

From: Vincent Guittot
Date: Thu Dec 08 2016 - 09:33:55 EST


On 8 December 2016 at 15:09, Matt Fleming <matt@xxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 05 Dec, at 01:35:46PM, Matt Fleming wrote:
>> On Mon, 05 Dec, at 10:27:36AM, Vincent Guittot wrote:
>> >
>> > Hi Matt,
>> >
>> > Thanks for the results.
>> >
>> > During the review, it has been pointed out by Morten that the test condition
>> > (100*this_avg_load < imbalance_scale*min_avg_load) makes more sense than
>> > (100*min_avg_load > imbalance_scale*this_avg_load). But i see lower
>> > performances with this change. Coud you run tests with the change below on
>> > top of the patchset ?
>> >
>> > ---
>> > kernel/sched/fair.c | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> > index e8d1ae7..0129fbb 100644
>> > --- a/kernel/sched/fair.c
>> > +++ b/kernel/sched/fair.c
>> > @@ -5514,7 +5514,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>> > if (!idlest ||
>> > (min_runnable_load > (this_runnable_load + imbalance)) ||
>> > ((this_runnable_load < (min_runnable_load + imbalance)) &&
>> > - (100*min_avg_load > imbalance_scale*this_avg_load)))
>> > + (100*this_avg_load < imbalance_scale*min_avg_load)))
>> > return NULL;
>> > return idlest;
>> > }
>>
>> Queued for testing.
>
> Most of the machines didn't notice the difference with this new patch.
> However, I did come across one test that showed a negative change,

OK so you don't see perf impact between the 2 test condition except
for the machine below
So IMHO, we should use the new test condition which tries to keep task
local if there is no other CPU that is obviously less loaded.

I'm going to prepare a new version of the patchset and apply all comments

Thanks
Vincent

>
>
> hackbench-thread-pipes
> 4.9.0-rc6 4.9.0-rc6 4.9.0-rc6 4.9.0-rc6
> tip-sched fix-fig-for-fork fix-sig fix-fig-for-fork-v2
> Amean 1 0.1266 ( 0.00%) 0.1269 ( -0.23%) 0.1287 ( -1.69%) 0.1357 ( -7.22%)
> Amean 4 0.4989 ( 0.00%) 0.5174 ( -3.72%) 0.5251 ( -5.27%) 0.5117 ( -2.58%)
> Amean 7 0.8510 ( 0.00%) 0.8517 ( -0.08%) 0.8964 ( -5.34%) 0.8801 ( -3.42%)
> Amean 12 1.0699 ( 0.00%) 1.0484 ( 2.00%) 1.0147 ( 5.15%) 1.0759 ( -0.56%)
> Amean 21 1.2816 ( 0.00%) 1.2140 ( 5.27%) 1.1879 ( 7.31%) 1.2414 ( 3.13%)
> Amean 30 1.4440 ( 0.00%) 1.4003 ( 3.03%) 1.3969 ( 3.26%) 1.4057 ( 2.65%)
> Amean 48 1.5773 ( 0.00%) 1.5983 ( -1.33%) 1.3984 ( 11.34%) 1.5624 ( 0.94%)
> Amean 79 2.2343 ( 0.00%) 2.3066 ( -3.24%) 2.0053 ( 10.25%) 2.2630 ( -1.29%)
> Amean 96 2.6736 ( 0.00%) 2.4313 ( 9.06%) 2.4181 ( 9.55%) 2.4717 ( 7.55%)
>
> 4.9.0-rc6 4.9.0-rc6 4.9.0-rc6 4.9.0-rc6
> tip-schedfix-fig-for-fork fix-sigfix-fig-for-fork-v2
> User 129.53 128.64 127.70 131.00
> System 1784.54 1744.21 1654.08 1744.00
> Elapsed 92.07 90.44 86.95 91.00
>
> Looking at the 48 and 79 groups rows for mean there's a noticeable
> drop off in performance of ~10%, which should be outside of the noise
> for this test. This is a 2 socket, 4 NUMA node (yes, really), 24 cpus
> AMD opteron circa 2010.
>
> Given the age of this machine, I don't think it's worth holding up the
> patch.