Re: [PATCH 0/3] TLB flush multiple pages per IPI v5

From: Ingo Molnar
Date: Tue Jun 09 2015 - 08:43:43 EST



* Mel Gorman <mgorman@xxxxxxx> wrote:

> > Sorry, I don't buy this, at all.
> >
> > Please measure this, the code would become a lot simpler, as I'm not convinced
> > that we need pfn (or struct page) or even range based flushing.
>
> The code will be simpler and the cost of reclaim will be lower and that is the
> direct case but shows nothing about the indirect cost. The mapped reader will
> benefit as it is not reusing the TLB entries and will look artificially very
> good. It'll be very difficult for even experienced users to determine that a
> slowdown during kswapd activity is due to increased TLB misses incurred by the
> full flush.

If so then the converse is true just as much: if you were to introduce fine-grained
flushing today, you couldn't justify it because you claim it's very hard to
measure!

Really, in such cases we should IMHO fall back to the simplest approach, and
iterate from there.

I cited very real numbers about the direct costs of TLB flushes, and plausible
speculation about why the indirect costs are low on the architecture you are trying
to modify here.

I think, since it is you who wants to introduce additional complexity into the x86
MM code, the burden is on you to provide proof that the complexity of pfn (or
struct page) tracking is worth it.

> > I.e. please first implement the simplest remote batching variant, then
> > complicate it if the numbers warrant it. Not the other way around. It's not
> > like the VM code needs the extra complexity!
>
> The simplest remote batching variant is a much more drastic change from what we
> do today and an unpredictable one. If we were to take that direction, it goes
> against the notion of making incremental changes. Even if we ultimately ended up
> with your proposal, it would make sense to separate it from this series by at
> least one release for bisection purposes. That way we get:
>
> Current: Send one IPI per page to unmap, active TLB entries preserved
> This series: Send one IPI per BATCH_TLBFLUSH_SIZE pages to unmap, active TLB entries preserved
> Your proposal: Send one IPI, flush everything, active TLB entries must refill

Not quite, my take on it is:

Current: Simplest method: send one IPI per page to unmap, active TLB
entries preserved. Remote TLB flushing cost is so high that it
probably moots any secondary effects of TLB preservation.

This series: Send one IPI per BATCH_TLBFLUSH_SIZE pages to unmap, add complex
tracking of pfns with expensive flushing, active TLB entries
preserved. The cost of the more complex flushing is probably
higher than the refill cost, based on the numbers I gave.

My proposal: Send one IPI per BATCH_TLBFLUSH_SIZE pages to unmap that flushes
everything. TLB entries not preserved but this is expected to be
more than offset by the reduction in remote flushing costs and the
simplicity of the flushing scheme. It can still be extended into
your proposed pfn tracking scheme later, if the numbers warrant it.
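
To make the difference between the three variants above concrete, here is a
rough counting sketch. It is not kernel code and it models no timings: it only
counts IPIs sent and flush operations performed on each remote CPU. The
function names are hypothetical, and the batch size of 32 is an assumed value
for illustration, since the actual value of BATCH_TLBFLUSH_SIZE is not stated
in this thread.

```python
def ipis_per_page(nr_pages):
    """Current behaviour: one IPI for every page that is unmapped."""
    return nr_pages


def ipis_batched(nr_pages, batch_size):
    """Both batched variants: one IPI per batch_size pages, rounded up."""
    return (nr_pages + batch_size - 1) // batch_size


def remote_flush_ops(nr_pages, batch_size, full_flush):
    """Flush operations executed on each remote CPU.

    full_flush=True  models a full TLB flush per IPI (no pfn tracking).
    full_flush=False models one targeted flush per tracked pfn.
    """
    if full_flush:
        return ipis_batched(nr_pages, batch_size)
    return nr_pages


if __name__ == "__main__":
    nr_pages = 1000      # pages unmapped during reclaim (illustrative)
    batch_size = 32      # ASSUMED stand-in for BATCH_TLBFLUSH_SIZE

    print("current:      IPIs =", ipis_per_page(nr_pages))
    print("pfn tracking: IPIs =", ipis_batched(nr_pages, batch_size),
          "flush ops =", remote_flush_ops(nr_pages, batch_size, False))
    print("full flush:   IPIs =", ipis_batched(nr_pages, batch_size),
          "flush ops =", remote_flush_ops(nr_pages, batch_size, True))
```

Both batched variants send the same number of IPIs; they differ in how much
work each IPI triggers remotely and in whether active TLB entries survive.
Whether the refill cost outweighs the saved flush work is the part that has to
be measured, and this sketch says nothing about it.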

Btw., have you measured the full TLB flush variant as well? If so, mind sharing
the numbers?

Thanks,

Ingo