Re: [PATCH v1 9/9] mm/memory: optimize unmap/zap with PTE-mapped THP

From: David Hildenbrand
Date: Wed Jan 31 2024 - 05:25:38 EST



+
+#ifndef clear_full_ptes
+/**
+ * clear_full_ptes - Clear PTEs that map consecutive pages of the same folio.

I know it's implied from "pages of the same folio" (and even more so for the
above variant due to the mention of access/dirty), but I wonder if it's useful to
explicitly state that "all ptes being cleared are present at the time of the call"?

"Clear PTEs" -> "Clear present PTEs" ?

That should make it clearer.
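
I.e., the kerneldoc summary line would then read something like (just a
sketch of the suggested wording, not the final patch text):

  * clear_full_ptes - Clear present PTEs that map consecutive pages of the
  *                   same folio.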

[...]

if (!delay_rmap) {
- folio_remove_rmap_pte(folio, page, vma);
+ folio_remove_rmap_ptes(folio, page, nr, vma);
+
+ /* Only sanity-check the first page in a batch. */
if (unlikely(page_mapcount(page) < 0))
print_bad_pte(vma, addr, ptent, page);

Is there a case for either removing this altogether or moving it into
folio_remove_rmap_ptes()? It seems odd to only check some pages.


I really wanted to avoid another nasty loop here.

In my thinking, for 4k folios, or when zapping subpages of large folios, we still perform the exact same checks. Only when batching do we skip them. So if there is some problem, there are still ways to get it triggered. And these problems are rarely ever seen.
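
For illustration, checking every subpage of the batch at this call site would
mean roughly the following extra loop (a sketch only, not part of the patch;
the iterator and the per-page address/ptent adjustment are made up for the
example):

	for (i = 0; i < nr; i++) {
		/* ptent would also have to be advanced for each page. */
		if (unlikely(page_mapcount(page + i) < 0))
			print_bad_pte(vma, addr + i * PAGE_SIZE, ptent,
				      page + i);
	}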

folio_remove_rmap_ptes() feels like the better place -- especially because the delayed-rmap handling is effectively unchecked. But in there, we cannot "print_bad_pte()".

[background: if we had a total mapcount -- IOW, a cheap folio_mapcount() -- I'd check here that the total mapcount does not underflow, instead of checking per subpage]
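
That would turn the per-subpage check into a single, batch-wide sanity check,
roughly along the lines of (sketch only, assuming a cheap folio_mapcount()):

	/* Sanity-check the folio-wide mapcount once for the whole batch. */
	if (unlikely(folio_mapcount(folio) < 0))
		print_bad_pte(vma, addr, ptent, page);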


}
- if (unlikely(__tlb_remove_page(tlb, page, delay_rmap))) {
+ if (unlikely(__tlb_remove_folio_pages(tlb, page, nr, delay_rmap))) {
*force_flush = true;
*force_break = true;
}
}
-static inline void zap_present_pte(struct mmu_gather *tlb,
+/*
+ * Zap or skip one present PTE, trying to batch-process subsequent PTEs that map

Zap or skip *at least* one... ?

Ack
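
I.e., something like (sketch of the wording; the quoted comment is truncated
above):

 * Zap or skip at least one present PTE, trying to batch-process subsequent
 * PTEs that map ...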

--
Cheers,

David / dhildenb