Re: [RFC PATCH] mm: hold PTL from the first PTE while reclaiming a large folio

From: Barry Song
Date: Tue Mar 05 2024 - 04:11:57 EST


On Tue, Mar 5, 2024 at 10:08 PM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
>
> On 05/03/2024 08:56, Barry Song wrote:
> > are writing the PTE to zero (break) before writing a new value (make). While
>
> As an aside, "break-before-make" as defined in the Arm architecture would also
> require a TLBI, which usually isn't done for these
> write-0-modify-prots-write-back operations. Arm doesn't require
> "break-before-make" in these situations, so it's legal (as long as only certain
> bits are changed). To my understanding, the purpose of doing this is to avoid
> races with HW access/dirty flag updates; if the MMU wants to set either flag
> and finds the PTE is 0 (invalid), it will cause an exception, which will be
> queued waiting for the PTL.
>
> So I don't think you really mean break-before-make here.

I agree I used too strong a term. I will change it to something lighter in v2.
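
For reference, the write-0, modify, write-back sequence described above can
be modelled in user space roughly as below. This is only a toy sketch: the
toy_* names, the bit layout and the "hardware" helper are all made up for
illustration and are not kernel or Arm definitions. It only shows that while
the entry is 0 (invalid), a hardware-style flag update cannot land, so the
write-back cannot overwrite a concurrent flag update.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define TOY_PTE_VALID (1ull << 0)
#define TOY_PTE_WRITE (1ull << 1)
#define TOY_PTE_YOUNG (1ull << 2) /* stands in for the access flag */

/* Step 1: atomically fetch the old entry and leave 0 (invalid) behind. */
uint64_t toy_modify_prot_start(_Atomic uint64_t *ptep)
{
	return atomic_exchange(ptep, 0);
}

/* Step 2: write the modified entry back. */
void toy_modify_prot_commit(_Atomic uint64_t *ptep, uint64_t newval)
{
	atomic_store(ptep, newval);
}

/*
 * Models the MMU trying to set the access flag. It only succeeds on a
 * valid entry; on an invalid one it returns false, which stands in for
 * "take a fault and wait for the PTL".
 */
bool toy_hw_set_young(_Atomic uint64_t *ptep)
{
	uint64_t old = atomic_load(ptep);

	while (old & TOY_PTE_VALID) {
		/* On failure, 'old' is reloaded and the validity check repeats. */
		if (atomic_compare_exchange_weak(ptep, &old, old | TOY_PTE_YOUNG))
			return true;
	}
	return false;
}
```

If toy_hw_set_young() is called between toy_modify_prot_start() and
toy_modify_prot_commit(), it returns false and leaves the entry at 0; once
the new value is committed, the same call sets the flag as usual.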

>
> > this behavior is within the PTL in another thread, page_vma_mapped_walk()
> > in the try_to_unmap_one() thread won't take the PTL until it meets a
> > present PTE. For example, if another thread is modifying nr_pages PTEs
> > under the PTL, but we don't hold the PTL, we might skip one or two PTEs
> > at the beginning of a large folio.
> > For a large folio, after try_to_unmap_one(), we may end up with PTE0 and
> > PTE1 untouched while PTE2~nr_pages-1 are set to swap entries.
> >
> > By holding the PTL from PTE0 for large folios, we won't see these
> > intermediate values. By the time we get the PTL, the other threads have
> > finished.
>
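
To make the interleaving concrete, here is a toy, single-threaded user-space
model of the two orderings. It is a sketch under assumptions, not kernel
code: the toy_* names and the three-state PTE are invented for illustration,
and the "race" is replayed deterministically by running the walker either
between the other thread's clear and write-back steps (no PTL held at the
start) or after both steps (PTL held from PTE0).

```c
#include <assert.h>
#include <stddef.h>

/* Toy PTE states; these are NOT the kernel's pte_t encodings. */
enum toy_pte { TOY_NONE = 0, TOY_PRESENT, TOY_SWAP };

#define TOY_NR_PAGES 8

/* Mark all PTEs of the toy large folio as present. */
void toy_init(enum toy_pte *ptes, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		ptes[i] = TOY_PRESENT;
}

/* Other thread, step 1 (under its PTL): transiently zero the first n PTEs. */
void toy_modifier_clear(enum toy_pte *ptes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		ptes[i] = TOY_NONE;
}

/* Other thread, step 2 (still under its PTL): write the new values back. */
void toy_modifier_restore(enum toy_pte *ptes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		ptes[i] = TOY_PRESENT;
}

/* Unmap walker: skips non-present PTEs, turns present ones into swap entries. */
size_t toy_unmap_walk(enum toy_pte *ptes, size_t nr)
{
	size_t unmapped = 0;

	for (size_t i = 0; i < nr; i++) {
		if (ptes[i] != TOY_PRESENT)
			continue; /* looks "none", so it is skipped */
		ptes[i] = TOY_SWAP;
		unmapped++;
	}
	return unmapped;
}

/* No PTL at the start: the walker can observe the zeroed PTEs. */
size_t toy_unmap_lockless_start(enum toy_pte *ptes, size_t nr, size_t in_flight)
{
	assert(in_flight <= nr);
	toy_modifier_clear(ptes, in_flight);
	size_t unmapped = toy_unmap_walk(ptes, nr);
	toy_modifier_restore(ptes, in_flight);
	return unmapped;
}

/* PTL held from PTE0: the walker only runs once the other thread is done. */
size_t toy_unmap_locked_from_first(enum toy_pte *ptes, size_t nr, size_t in_flight)
{
	assert(in_flight <= nr);
	toy_modifier_clear(ptes, in_flight);
	toy_modifier_restore(ptes, in_flight);
	return toy_unmap_walk(ptes, nr);
}
```

Starting from TOY_NR_PAGES present entries with the first two in flight,
toy_unmap_lockless_start() converts only six entries and leaves PTE0 and PTE1
present, while toy_unmap_locked_from_first() converts all eight.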

Thanks
Barry