Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

From: Peter Zijlstra
Date: Tue Jan 12 2021 - 11:21:44 EST


On Tue, Jan 05, 2021 at 01:03:48PM -0500, Andrea Arcangeli wrote:
> On Tue, Jan 05, 2021 at 04:37:27PM +0100, Peter Zijlstra wrote:
> > (your other email clarified this point; the COW needs to copy while
> > holding the PTL and we need TLBI under PTL if we're to change this)
>
> The COW doesn't need to hold the PT lock, the TLBI broadcast doesn't
> need to be delivered under PT lock either.
>
> Simply there need to be a TLBI broadcast before the copy. The patch I
> sent here https://lkml.kernel.org/r/X+QLr1WmGXMs33Ld@xxxxxxxxxx that
> needs to be cleaned up with some abstraction and better commentary
> also misses a smp_mb() in the case flush_tlb_page is not called, but
> that's a small detail.

That's horrific crap. All of that tlb-pending stuff is batshit, and this
makes it worse.

> > And I'm thinking the speculative page fault series steps right into all
> > this, it fundamentally avoids mmap_sem and entirely relies on the PTL.
>
> I thought about that but that only applies to some kind of "anon" page
> fault.

That must be something new; it used to handle all faults. I specifically
spend quite a bit of time getting the file crud right (which Linus
initially fingered for being horrible broken).

SPF fundamentally elides the mmap_sem, which Linus said must serialize
faults.

> Here the problem isn't just the page fault, the problem is not to
> regress clear_refs to block on page fault I/O, and all

IIRC we do the actual reads without any locks held, just like
VM_FAULT_RETRY does today. You take the fault, find you need IO, drop
locks, do IO, retake fault.

> MAP_PRIVATE/MAP_SHARED filebacked faults bitting the disk to read
> /usr/ will still prevent clear_refs from running (and the other way
> around) if it has to take the mmap_sem for writing.
>
> I don't look at the speculative page fault for a while but last I
> checked there was nothing there that can tame the above major
> regression from CPU speed to disk I/O speed that would be inflicted on
> both clear_refs on huge mm and on uffd-wp.

All of the clear_refs nonsense is immaterial to SPF. Also, who again
cares about clear_refs? Why is it important?