Re: [RFC PATCH] mm: support large folio numa balancing

From: David Hildenbrand
Date: Wed Nov 15 2023 - 05:47:22 EST


On 15.11.23 11:46, David Hildenbrand wrote:
On 13.11.23 11:45, Baolin Wang wrote:
Currently, file pages already support large folios, and support for
anonymous pages is also under discussion[1]. Moreover, the NUMA balancing
code was converted to use folios by a previous series[2], and the
migrate_pages() function already supports large folio migration.

So I see no reason to keep excluding large folios from NUMA balancing.

[1] https://lkml.org/lkml/2023/9/29/342
[2] https://lore.kernel.org/all/20230921074417.24004-4-wangkefeng.wang@xxxxxxxxxx/T/#md9d10fe34587229a72801f0d731f7457ab3f4a6e
Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
---

I'll note that another piece is missing, and I'd be curious how you
tested your patch set or what I am missing. (no anonymous pages?)

change_pte_range() contains:

		if (prot_numa) {
			...
			/* Also skip shared copy-on-write pages */
			if (is_cow_mapping(vma->vm_flags) &&
			    folio_ref_count(folio) != 1)
				continue;

So we'll never end up mapping an anon PTE-mapped THP prot-none (well, unless a
single PTE remains) and consequently never trigger NUMA hinting faults.

Now, that check has some history [1], but the original problem has since been
sorted out; still, we should keep Linus' original feedback in mind.

For pte-mapped THP, we might want to do something like the following
(completely untested):

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 81991102f785..c4e6b9032e40 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -129,7 +129,8 @@ static long change_pte_range(struct mmu_gather *tlb,
 			/* Also skip shared copy-on-write pages */
 			if (is_cow_mapping(vma->vm_flags) &&
-			    folio_ref_count(folio) != 1)
+			    (folio_maybe_dma_pinned(folio) ||
+			     folio_estimated_sharers(folio) != 1))

Actually, > 1 might be better: if the first subpage is not mapped,
folio_estimated_sharers() returns 0 and "!= 1" would still skip the folio;
it's a mess.

--
Cheers,

David / dhildenb