Re: [PATCH v2 1/2] sched/fair: Fix how load gets propagated from cfs_rq to its sched_entity

From: Peter Zijlstra
Date: Thu May 04 2017 - 06:58:16 EST


On Thu, May 04, 2017 at 10:49:51AM +0100, Dietmar Eggemann wrote:
> On 04/05/17 07:21, Peter Zijlstra wrote:
> > On Thu, May 04, 2017 at 07:51:29AM +0200, Peter Zijlstra wrote:
> >
> >> Urgh, and my numbers were so pretty :/
> >
> > Just to clarify on how to run schbench, I limited to a single socket (as
> > that is what you have) and set -t to the number of cores in the socket
> > (not the number of threads).
> >
> > Furthermore, my machine is _idle_, if I don't do anything, it doesn't do
> > _anything_.
> >
>
> I can't recreate this problem running 'numactl -N 0 ./schbench -m 2 -t
> 10 -s 10000 -c 15000 -r 30' on my E5-2690 v2 (IVB-EP, 2 sockets, 10
> cores / socket, 2 threads / core)
>
> I tried tip/sched/core comparing running in 'cpu:/' and 'cpu:/foo' and

I'm running tip/master (I think, possibly with the numa topology fixes
in, which should be no-op on the EP).

Also, I run Debian sysvinit, so nobody is creating cgroups I don't know about.

> using your patch on top with all the combinations of {NO_}FUDGE,
> {NO_}FUDGE2 with prop_type=shares_avg or prop_type_runnable.
>
> Were you able to see the issue on tip/sched/core w/o your patch on your
> machine?

I see the 99.5th percentile shoot up when I run it in a cgroup.
With FUDGE2 it's all good again, same as when not using cgroups.
But yes, last time I played with schbench (when prodding at
select_idle_sibling) the thing was finicky too; I never quite got the same
numbers Chris did. But in the end we found something that worked
at both ends.