Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

From: Rik van Riel
Date: Mon Jul 23 2018 - 08:26:40 EST


On Fri, 2018-07-20 at 10:30 +0200, Peter Zijlstra wrote:
> On Thu, Jul 19, 2018 at 10:04:09AM -0700, Andy Lutomirski wrote:
> > I added some more arch maintainers. The idea here is that, on x86
> > at least, task->active_mm and all its refcounting is pure
> > overhead. When a process exits, __mmput() gets called, but the
> > core kernel has a longstanding "optimization" in which other tasks
> > (kernel threads and idle tasks) may have ->active_mm pointing at
> > this mm. This is nasty, complicated, and hurts performance on
> > large systems, since it requires extra atomic operations whenever
> > a CPU switches between real user threads and idle/kernel threads.
> >
> > It's also almost completely worthless on x86 at least, since
> > __mmput() frees pagetables, and that operation *already* forces a
> > remote TLB flush, so we might as well zap all the active_mm
> > references at the same time.
>
> So I disagree that active_mm is complicated (the code is less than
> ideal but that is actually fixable). And aside from the process
> exit case, it does avoid CR3 writes when switching between user and
> kernel threads (which can be far more often than exit if you have
> longer running tasks).
>
> Now agreed, recent x86 work has made that less important.
>
> And I of course also agree that not doing those refcount atomics is
> better.

It might be cleaner to keep the ->active_mm pointer
in place for now (at least in the first patch), even
on architectures where we end up dropping the refcounting.

That way the code is more similar everywhere, and
we just get rid of the expensive instructions.
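
As a rough illustration of the idea, here is a userspace toy model, not
kernel code. All names (struct mm, struct task, lazy_grab, lazy_drop,
skip_lazy_refcount, count_atomics) are made up for the example, and the
skip flag stands in for an assumed per-architecture switch. It also
ignores the exit-time side, where something still has to clear stale
lazy references.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real ones. */
struct mm {
	atomic_int count;		/* models the lazy-reference count */
};

struct task {
	struct mm *mm;			/* NULL for kernel/idle threads */
	struct mm *active_mm;		/* mm whose page tables are loaded */
};

/* Assumption: set on architectures that do not need lazy refcounting. */
static bool skip_lazy_refcount;
static int atomic_ops;			/* number of modeled atomic RMW ops */

static void lazy_grab(struct mm *mm)
{
	if (skip_lazy_refcount)
		return;
	atomic_fetch_add(&mm->count, 1);
	atomic_ops++;
}

static void lazy_drop(struct mm *mm)
{
	if (skip_lazy_refcount)
		return;
	atomic_fetch_sub(&mm->count, 1);
	atomic_ops++;
}

/* The ->active_mm bookkeeping is identical whether or not we refcount. */
static void switch_to(struct task *prev, struct task *next)
{
	if (!next->mm) {
		/* Kernel thread: borrow the previous address space. */
		next->active_mm = prev->active_mm;
		lazy_grab(next->active_mm);
	} else {
		next->active_mm = next->mm;
	}

	if (!prev->mm) {
		/* Kernel thread going away: give the borrowed mm back. */
		struct mm *borrowed = prev->active_mm;

		prev->active_mm = NULL;
		lazy_drop(borrowed);
	}
}

/* Run one user -> kthread -> user round trip, return the atomics used. */
static int count_atomics(bool skip)
{
	struct mm m;
	struct task user = { .mm = &m, .active_mm = &m };
	struct task kthread = { .mm = NULL, .active_mm = NULL };

	atomic_init(&m.count, 1);
	skip_lazy_refcount = skip;
	atomic_ops = 0;

	switch_to(&user, &kthread);
	assert(kthread.active_mm == &m);	/* pointer kept either way */
	switch_to(&kthread, &user);
	assert(user.active_mm == &m);
	assert(atomic_load(&m.count) == 1);	/* count is balanced */

	return atomic_ops;
}
```

In this model count_atomics(false) uses two atomics for the round trip
and count_atomics(true) uses none, while ->active_mm takes the same
values in both cases.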

Let me try coding this up...

--
All Rights Reversed.
