Re: [RFC PATCH] mm: support large folio numa balancing

From: Baolin Wang
Date: Sun Nov 19 2023 - 22:28:07 EST




On 11/15/2023 6:47 PM, David Hildenbrand wrote:
On 15.11.23 11:46, David Hildenbrand wrote:
On 13.11.23 11:45, Baolin Wang wrote:
Currently, file pages already support large folios, and support for
anonymous pages is also under discussion[1]. Moreover, the NUMA balancing
code was converted to use folios by a previous patch series[2], and the
migrate_pages() function already supports large folio migration.

So I see no reason to continue restricting NUMA balancing for
large folios.

[1] https://lkml.org/lkml/2023/9/29/342
[2] https://lore.kernel.org/all/20230921074417.24004-4-wangkefeng.wang@xxxxxxxxxx/T/#md9d10fe34587229a72801f0d731f7457ab3f4a6e
Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
---

I'll note that another piece is missing, and I'd be curious how you
tested your patch set or what I am missing. (no anonymous pages?)

I tested it with file-backed large folios (order = 4) created on an XFS filesystem.

change_pte_range() contains:

if (prot_numa) {
    ...
    /* Also skip shared copy-on-write pages */
    if (is_cow_mapping(vma->vm_flags) &&
        folio_ref_count(folio) != 1)
        continue;

So we'll never end up mapping an anon PTE-mapped THP prot-none (well, unless a
single PTE remains) and consequently never trigger NUMA hinting faults.

Now, that change has some history [1], but the original problem has been
sorted out in the meantime. Still, we should keep Linus' original feedback in mind.

For pte-mapped THP, we might want to do something like the following
(completely untested):

Thanks for pointing that out. I have not tried PTE-mapped THP yet and will look into it in detail.

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 81991102f785..c4e6b9032e40 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -129,7 +129,8 @@ static long change_pte_range(struct mmu_gather *tlb,
                                 /* Also skip shared copy-on-write pages */
                                 if (is_cow_mapping(vma->vm_flags) &&
-                                    folio_ref_count(folio) != 1)
+                                    (folio_maybe_dma_pinned(folio) ||
+                                     folio_estimated_sharers(folio) != 1))
                                         continue;

Actually, > 1 might be better if the first subpage is not mapped; it's a mess.