Re: [PATCH 02/13] mm/rmap: update to new mmu_notifier semantic

From: Andrea Arcangeli
Date: Wed Aug 30 2017 - 16:55:49 EST


On Wed, Aug 30, 2017 at 11:40:08AM -0700, Nadav Amit wrote:
> The mmu_notifier users would have to be aware that invalidations may be
> deferred. If they perform their ``invalidations'' unconditionally, it may be
> ok. If the notifier users avoid invalidations based on the PTE in the
> secondary page-table, it can be a problem.

invalidate_page was always deferred post PT lock release.

Running ->invalidate_range after PT lock release is not a new thing:
we're still back to square one, trying to find out whether the
invalidate_page callout after PT lock release has always been broken
here or not.

> On another note, you may want to consider combining the secondary page-table
> mechanisms with the existing TLB-flush mechanisms. Right now, it is
> partially done: tlb_flush_mmu_tlbonly(), for example, calls
> mmu_notifier_invalidate_range(). However, tlb_gather_mmu() does not call
> mmu_notifier_invalidate_range_start().

If you implement ->invalidate_range_start you don't care about tlb
gather at all and you must not implement ->invalidate_range.

> This can also prevent all kinds of inconsistencies, and potential bugs. For
> instance, clear_refs_write() calls mmu_notifier_invalidate_range_start/end()
> but in between there is no call for mmu_notifier_invalidate_range().

It's done in mmu_notifier_invalidate_range_end, which is again fully
equivalent except that it runs after PT lock release.
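
To make that concrete, here is a rough userspace sketch of the dispatch,
not the kernel code itself. The struct layout, the model_* helpers,
flush_cb and range_only_ops are simplified stand-ins invented for
illustration; only the three callback names come from the real
mmu_notifier_ops. It assumes the _end side also calls ->invalidate_range,
so a user that only implements ->invalidate_range still gets flushed by a
bare start/end pair, just after the PT lock has been dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: mimics the shape of mmu_notifier_ops. */
struct notifier;

struct notifier_ops {
	void (*invalidate_range_start)(struct notifier *n,
				       unsigned long start, unsigned long end);
	void (*invalidate_range_end)(struct notifier *n,
				     unsigned long start, unsigned long end);
	void (*invalidate_range)(struct notifier *n,
				 unsigned long start, unsigned long end);
};

struct notifier {
	const struct notifier_ops *ops;
	int flushes;		/* secondary TLB flushes seen so far */
	struct notifier *next;
};

/* Model of the _start side: only ->invalidate_range_start runs. */
static void model_invalidate_range_start(struct notifier *list,
					 unsigned long start, unsigned long end)
{
	assert(start < end);
	for (struct notifier *n = list; n; n = n->next)
		if (n->ops->invalidate_range_start)
			n->ops->invalidate_range_start(n, start, end);
}

/*
 * Model of the _end side: ->invalidate_range is called here too, so a
 * user that registered only ->invalidate_range is covered as well.
 */
static void model_invalidate_range_end(struct notifier *list,
				       unsigned long start, unsigned long end)
{
	assert(start < end);
	for (struct notifier *n = list; n; n = n->next) {
		if (n->ops->invalidate_range)
			n->ops->invalidate_range(n, start, end);
		if (n->ops->invalidate_range_end)
			n->ops->invalidate_range_end(n, start, end);
	}
}

/* Hypothetical callback: count one secondary TLB flush. */
static void flush_cb(struct notifier *n,
		     unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	n->flushes++;
}

/* A user that implements ->invalidate_range and nothing else. */
static const struct notifier_ops range_only_ops = {
	.invalidate_range = flush_cb,
};

/* Start/end pair with no explicit invalidate_range call in between. */
static int flushes_seen_by_range_only_user(void)
{
	struct notifier n = { .ops = &range_only_ops };

	model_invalidate_range_start(&n, 0x1000, 0x2000);
	/* PTEs would be modified here, under the PT lock. */
	model_invalidate_range_end(&n, 0x1000, 0x2000);
	return n.flushes;
}
```

In this model flushes_seen_by_range_only_user() counts exactly one flush,
and it comes from the _end call, i.e. after the point where the PT lock
would have been released.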