Re: [RFC PATCH] mm: hold PTL from the first PTE while reclaiming a large folio

From: Ryan Roberts
Date: Tue Mar 05 2024 - 04:08:43 EST


On 05/03/2024 08:56, Barry Song wrote:
> are writing pte to zero(break) before writing a new value(make). while

As an aside, "break-before-make" as defined in the Arm architecture would also
require a TLBI, which usually isn't done for these
write-0-modify-prots-write-back operations. Arm doesn't require
"break-before-make" in these situations, so it's legal (as long as only certain
bits are changed). To my understanding, the purpose of doing this is to avoid
races with HW access/dirty flag updates; if the MMU wants to set either flag and
finds the PTE is 0 (invalid), it will raise an exception, and the fault handler
will then be queued waiting for the PTL.

So I don't think you really mean break-before-make here.
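
To make the write-0-modify-prots-write-back idea concrete, here is a minimal
userspace sketch. It is a toy model, not kernel code: the bit layout, the
pte_t typedef and all toy_* names are assumptions made up for illustration,
and the "hardware" update is just a compare-and-swap loop that refuses to
touch an invalid entry.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PTE bit layout, for illustration only. */
#define PTE_VALID (UINT64_C(1) << 0)
#define PTE_WRITE (UINT64_C(1) << 1)
#define PTE_DIRTY (UINT64_C(1) << 2)

typedef _Atomic uint64_t pte_t;

/* Software, step 1: atomically fetch the old value and leave 0 behind. */
static uint64_t toy_modify_prot_start(pte_t *ptep)
{
	return atomic_exchange(ptep, 0);
}

/* Software, step 2: write back the modified value. */
static void toy_modify_prot_commit(pte_t *ptep, uint64_t newval)
{
	atomic_store(ptep, newval);
}

/*
 * Model of a hardware dirty-flag update. It only succeeds while the entry
 * is valid; returning false stands for "take a fault instead".
 */
static bool toy_hw_set_dirty(pte_t *ptep)
{
	uint64_t old = atomic_load(ptep);

	while (old & PTE_VALID) {
		/* On failure, old is refreshed and the validity check reruns. */
		if (atomic_compare_exchange_weak(ptep, &old, old | PTE_DIRTY))
			return true;
	}
	return false;
}
```

Between toy_modify_prot_start() and toy_modify_prot_commit(), a call to
toy_hw_set_dirty() returns false, so a dirty flag cannot be set behind the
back of the copy that software is modifying. No TLB invalidation is modelled,
which is the difference from break-before-make.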

> this behavior is within PTL in another thread, page_vma_mapped_walk()
> of the try_to_unmap_one() thread won't take PTL till it meets a present PTE.
> for example, if another thread is modifying nr_pages PTEs under PTL,
> but we don't hold PTL, we might skip one or two PTEs at the beginning of
> a large folio.
> For a large folio, after try_to_unmap_one(), we may end up with PTE0 and PTE1
> untouched but PTE2~nr_pages-1 set to swap entries.
>
> by holding PTL from PTE0 for large folios, we won't get these intermediate
> values. By the time we get PTL, the other threads have finished.
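
The scenario described in the quoted text can be walked through with a small
single-threaded model. This is only a sketch under stated assumptions: the
table, the TOY_* values and every toy_* function are hypothetical, and
"taking the lock" is modelled simply as "the writer has finished and restored
its entries".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_PTES     4
#define TOY_PRESENT UINT64_C(1)
#define TOY_SWAP    UINT64_C(2)

struct toy_table {
	uint64_t pte[NR_PTES];
	uint64_t saved[NR_PTES];
	size_t in_flight;	/* entries [0, in_flight) are transiently 0 */
};

/* Writer (holding the lock in the real scenario) zeroes the first k entries. */
static void toy_writer_begin(struct toy_table *t, size_t k)
{
	for (size_t i = 0; i < k; i++) {
		t->saved[i] = t->pte[i];
		t->pte[i] = 0;
	}
	t->in_flight = k;
}

/* Writer writes the entries back and, conceptually, drops the lock. */
static void toy_writer_finish(struct toy_table *t)
{
	for (size_t i = 0; i < t->in_flight; i++)
		t->pte[i] = t->saved[i];
	t->in_flight = 0;
}

/* Unmap walk; returns how many entries were turned into swap entries. */
static size_t toy_unmap_walk(struct toy_table *t, bool lock_from_first)
{
	bool locked = false;
	size_t unmapped = 0;

	if (lock_from_first) {
		toy_writer_finish(t);	/* lock acquired: writer is done */
		locked = true;
	}

	for (size_t i = 0; i < NR_PTES; i++) {
		if (!locked) {
			if (t->pte[i] == 0)
				continue;	/* skipped without the lock */
			toy_writer_finish(t);	/* first present entry: lock now */
			locked = true;
		}
		if (t->pte[i] == TOY_PRESENT) {
			t->pte[i] = TOY_SWAP;
			unmapped++;
		}
	}
	return unmapped;
}
```

With the writer paused after toy_writer_begin(&t, 2), a walk with
lock_from_first == false leaves entries 0 and 1 present and converts only
entries 2 and 3, whereas lock_from_first == true converts all four.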