Re: [RFC PATCH] mm/rmap: do not call mmu_notifier_invalidate_page() v3

From: Linus Torvalds
Date: Tue Aug 29 2017 - 15:58:52 EST


On Tue, Aug 29, 2017 at 12:16 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> And then you can check if something actually happened by catching the
> *ATOMIC* call to mmu_notifier_invalidate_page(), setting a flag, and
> then doing something blocking at mmu_notifier_invalidate_range_end()
> time.
>
> Maybe.

Note that now I have looked more at the users, I think we actually
just want to get rid of mmu_notifier_invalidate_page() entirely in
favor of just calling mmu_notifier_invalidate_range_start()/end().

Nobody seems to want an atomic version of
mmu_notifier_invalidate_page(), they are perfectly happy just getting
those range_start/end() call instead.

HOWEVER.

There do seem to be places (eg powernv/npu-dma.c, iommu/amd_iommu_v2.c
and ommu/intel-svm.c) that want to get the "invalidate_page()" or
"invalidate_range()" calls, but do *not* catch the begin/end() ones.
The "range" calls were for atomic cases, and the "page" call was for
the few places that weren't (but should have been). They seem to do
the same things.

So just switching from mmu_notifier_invalidate_page() to the
"invalidate_range_start()/end()" pair instead could break those cases.

But the mmu_notifier_invalidate_range() call has always been atomic,
afaik. It's called from the ptep_clear_flush_notify(), which is
called while holdin gthe ptl lock as far as I can tell.

So to handle the powernv/npu-dma.c, iommu/amd_iommu_v2.c and
ommu/intel-svm.c correctly, _and_ get he KVM case right, we probably
need to:

- replace the existing mmu_notifier_invalidate_page() call with
mmu_notifier_invalidate_range(), and make sure it's inside the locked
region (ie fs/dax.c too - actually move it inside the lock)

- surround the locked region with those
mmu_notifier_invalidate_range_start()/end() calls.

- get rid of mmu_notifier_invalidate_page() entirely, it had bad
semantics anyway.

and from all I can tell that should work for everybody.

But maybe I'm missing something.

Linus