Re: [PATCH v1] sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal

From: Phil Auld
Date: Mon Nov 09 2020 - 10:24:31 EST


Hi,

On Fri, Nov 06, 2020 at 04:00:10PM +0000 Mel Gorman wrote:
> On Fri, Nov 06, 2020 at 02:33:56PM +0100, Vincent Guittot wrote:
> > On Fri, 6 Nov 2020 at 13:03, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Nov 04, 2020 at 09:42:05AM +0000, Mel Gorman wrote:
> > > > While it's possible that some other factor masked the impact of the patch,
> > > > the fact it's neutral for two workloads in 5.10-rc2 is suspicious as it
> > > > indicates that if the patch was implemented against 5.10-rc2, it would
> > > > likely not have been merged. I've queued the tests on the remaining
> > > > machines to see if something more conclusive falls out.
> > > >
> > >
> > > It's not as conclusive as I would like. fork_test generally benefits
> > > across the board but I do not put much weight in that.
> > >
> > > Otherwise, it's workload and machine-specific.
> > >
> > > schbench: (wakeup latency sensitive), all machines benefitted from the
> > > revert at the low utilisation except one 2-socket haswell machine
> > > which showed higher variability when the machine was fully
> > > utilised.
> >
> > There is a pending patch that should improve this bench:
> > https://lore.kernel.org/patchwork/patch/1330614/
> >
>
> Ok, I've slotted this one in with a bunch of other stuff I wanted to run
> over the weekend. That particular patch was on my radar anyway. It just
> got bumped up the schedule a little bit.
>


We've run some of our perf tests against various kernels in this thread.
By default RHEL configs run with the performance governor.


For 5.8 to 5.9 we can confirm Mel's results, but mostly in microbenchmarks.
We see microbenchmark hits with fork, exec and unmap. Real workloads showed
no difference between the two except for the EPYC first generation (Naples)
servers. On those systems NAS and SPECjvm2008 show a drop of about 10% but
with very high variance.


With the spread llc patch from Vincent on 5.9 we saw no performance change
in our benchmarks.


Comparing 5.9 with and without Julia's patch showed no real performance change.
The only difference was an increase in hackbench latency on the same EPYC
first gen servers.


As I mentioned earlier in the thread, we have all the 5.9 patches in this area
in our development distro kernel (plus a handful from 5.10-rc) and don't see
the same effect we see here between 5.8 and 5.9 caused by this patch. But
there are other variables there. We've queued up a comparison between that
kernel and one with just the patch in question reverted. That may tell us
if there is an effect that is otherwise being masked.


Jirka - feel free to correct me if I mis-summarized your results :)

Cheers,
Phil
