Re: [PATCH 0/4] Fix ebizzy performance regression due to X86 TLB range flush v2

From: Mel Gorman
Date: Tue Dec 17 2013 - 12:54:50 EST


On Tue, Dec 17, 2013 at 03:42:14PM +0100, Ingo Molnar wrote:
>
> * Mel Gorman <mgorman@xxxxxxx> wrote:
>
> > [...]
> >
> > At that point it'll be time to look at profiles and see where we are
> > actually spending time because the possibilities of finding things
> > to fix through bisection will be exhausted.
>
> Yeah.
>
> One (heavy handed but effective) trick that can be used in such a
> situation is to just revert everything that is causing problems, and
> continue reverting until we get back to a v3.4 baseline performance.
>

Very tempted but the potential timeframe here is very large and the number
of patches could be considerable. Some patches cause a lot of noise. For
example, one patch enabled ACPI cpufreq driver loading which looks like
a regression during that window but it's a side-effect that gets fixed
later. It'll take time to identify all the patches that potentially cause
problems.

> Once such a 'clean' tree (or queue of patches) is achieved, that can be
> used as a measurement base and the individual features can be
> re-applied again, one by one, with measurement and analysis becoming a
> lot easier.
>

Ordinarily I would agree with you but would prefer a shorter window for
that type of strategy.

> > > Also it appears the Ebizzy numbers ought to be stable enough now
> > > to make the range-TLB-flush measurements more precise?
> >
> > Right now, the tlbflush microbenchmark figures look awful on the
> > 8-core machine when the tlbflush shift patch and the schedule domain
> > fix are both applied.
>
> I think that further strengthens the case for the 'clean base' approach
> I outlined above - but it's your call obviously ...
>

I'll keep it as plan b if it cannot be fixed with a direct approach.

> Thanks again for going through all this. Tracking multi-commit
> performance regressions across 1.5 years worth of commits is generally
> very hard. Does your testing effort come from enterprise Linux QA
> testing, or did you run into this problem accidentally?
>

It does not come from enterprise Linux QA testing but it's motivated by
it. I want to catch as many "obvious" performance bugs before they do as
it saves time and stress in the long run. To assist that, I set up continual
performance regression testing and ebizzy was included in the first report
I opened. It makes me worry what the rest of the reports contain.

--
Mel Gorman
SUSE Labs