Re: [PATCH 0/3] TLB flush multiple pages per IPI v5

From: Mel Gorman
Date: Tue Jun 09 2015 - 09:05:53 EST


On Tue, Jun 09, 2015 at 02:43:28PM +0200, Ingo Molnar wrote:
>
> * Mel Gorman <mgorman@xxxxxxx> wrote:
>
> > > Sorry, I don't buy this, at all.
> > >
> > > Please measure this, the code would become a lot simpler, as I'm not convinced
> > > that we need pfn (or struct page) or even range based flushing.
> >
> > The code will be simpler and the cost of reclaim will be lower and that is the
> > direct case but shows nothing about the indirect cost. The mapped reader will
> > benefit as it is not reusing the TLB entries and will look artificially very
> > good. It'll be very difficult for even experienced users to determine that a
> > slowdown during kswapd activity is due to increased TLB misses incurred by the
> > full flush.
>
> If so then the converse is true just as much: if you were to introduce fine-grained
> flushing today, you couldn't justify it because you claim it's very hard to
> measure!
>

I'm claiming the *INDIRECT COST* is impossible to measure as part of this
series because it depends on the workload and exact CPU used. The direct
cost is measurable and can be quantified.

> Really, in such cases we should IMHO fall back to the simplest approach, and
> iterate from there.
>
> I cited very real numbers about the direct costs of TLB flushes, and plausible
> speculation about why the indirect costs are low on the architecture you are trying
> to modify here.
>
> I think since it is you who wants to introduce additional complexity into the x86
> MM code the burden is on you to provide proof that the complexity of pfn (or
> struct page) tracking is worth it.
>

I'm taking a situation whereby IPIs are sent like crazy with interrupt
storms and replacing it with something that is a lot more efficient and that
minimises the number of potential surprises. I'm stating that the benefit
of PFN tracking is unknowable in the general case because it depends on the
workload, timing and the exact CPU used, so any example provided can be NAKed
with a counter-example such as a trivial sequential reader that shows no
benefit. The series as posted is approximately in line with current behaviour,
minimising the chances of surprise regressions from excessive TLB flushes.

You are actively blocking a measurable improvement and forcing it to be
replaced with something whose full impact is unquantifiable. Any regressions
in this area due to increased TLB misses could take several kernel releases
to surface as the issue will be so difficult to detect.

I'm going to implement the approach you are forcing because there is an
x86 part of the patch and you are the maintainer that could indefinitely
NAK it. However, I'm extremely pissed about being forced to introduce
these indirect unpredictable costs because I know the alternative is you
dragging this out for weeks with no satisfactory conclusion in an argument
that I cannot prove in the general case.

--
Mel Gorman
SUSE Labs