Re: [PATCH v1 2/5] mm/rmap: introduce and use hugetlb_remove_rmap()

From: David Hildenbrand
Date: Tue Nov 28 2023 - 12:42:59 EST


On 28.11.23 18:13, Peter Xu wrote:
On Tue, Nov 28, 2023 at 05:39:35PM +0100, David Hildenbrand wrote:
Quoting from the cover letter:

"We have hugetlb special-casing/checks in the callers in all cases either
way already in place: it doesn't make too much sense to call generic-looking
functions that end up doing hugetlb specific things from hugetlb
special-cases."

I'll take this one as an example: I think one goal (of my understanding of
the mm community) is to make the generic looking functions keep being
generic, dropping any function named as "*hugetlb*" if possible one day
within that generic implementation. I said that in my previous reply.

Yes, and I am one of the people asking for that. However, only where it makes sense (e.g., page table walking, GUP as you said), and only when it is actually unified.

I don't think that rmap handling or fault handling will ever be completely unified to that extreme, and it might also not be desirable. Just like we have separate paths for anon and file in areas where they are reasonably different.

What doesn't make sense is using patterns like:

page_remove_rmap(subpage, vma, folio_test_hugetlb(folio));

and then, inside page_remove_rmap(), having an initial folio_test_hugetlb() check that does something completely different.

So each and every caller of page_remove_rmap(), including those that know for certain they are not dealing with a hugetlb folio, has to run through that check.
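
To make the pattern concrete, here is a minimal userspace sketch. It is not kernel code: struct mock_folio, its fields and all mock_* functions are hypothetical stand-ins, and the accounting is heavily simplified. It only contrasts a "unified" function that starts with a hugetlb check against two dedicated functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace mock, NOT kernel code: all names and fields are hypothetical. */
struct mock_folio {
	bool is_hugetlb;
	int mapcount;       /* number of mappings of this folio */
	int nr_mapped_stat; /* stands in for the file/anon statistics */
};

/* "Unified" flavour: every caller runs through the hugetlb check first. */
void mock_page_remove_rmap(struct mock_folio *folio, bool compound)
{
	if (folio->is_hugetlb) {
		/* Completely different path: no statistics are touched. */
		folio->mapcount--;
		return;
	}
	(void)compound; /* PTE vs. PMD handling is elided in this sketch */
	if (--folio->mapcount == 0)
		folio->nr_mapped_stat--;
}

/* Dedicated flavour: hugetlb callers use their own function ... */
void mock_hugetlb_remove_rmap(struct mock_folio *folio)
{
	assert(folio->is_hugetlb);
	folio->mapcount--;
}

/* ... and everybody else skips the hugetlb check entirely. */
void mock_remove_rmap_no_hugetlb(struct mock_folio *folio)
{
	assert(!folio->is_hugetlb);
	if (--folio->mapcount == 0)
		folio->nr_mapped_stat--;
}
```

In the dedicated flavour, the type check degrades into a sanity assertion instead of being a branch that selects between two unrelated code paths.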

Then, we have functions like page_add_file_rmap() that look like they can be used for hugetlb, but hugetlb is smart enough and only calls page_dup_file_rmap(), because it doesn't want to touch any file/anon counters. And to handle that we would have to add folio_test_hugetlb() inside there, which has the same problem as above: trying to unify without unifying.

Once we're in the area of folio_add_file_rmap_range(), it all stops making sense, because there is no way we could possibly partially map a hugetlb folio today. (And if we can in the future, we might still want separate handling, because most callers know which pages they are dealing with; see below.)
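
The add vs. dup difference can be sketched like this. Again a userspace mock with hypothetical names (struct counted_folio, mock_nr_file_mapped, the mock_*_file_rmap functions); the real functions do considerably more. The only point is that the dup path bumps the mapcount and leaves the statistics alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace mock, NOT kernel code: names and fields are hypothetical. */
struct counted_folio {
	bool is_hugetlb;
	int mapcount;
};

/* Stands in for a global "file mapped" statistic. */
int mock_nr_file_mapped;

/* Models the "ordinary" add path: mapcount plus statistics. */
void mock_add_file_rmap(struct counted_folio *folio)
{
	assert(!folio->is_hugetlb);
	/* Only the first mapping of the folio changes the statistic. */
	if (folio->mapcount++ == 0)
		mock_nr_file_mapped++;
}

/* Models the dup path that hugetlb relies on: mapcount only. */
void mock_dup_file_rmap(struct counted_folio *folio)
{
	folio->mapcount++;
}
```

Making mock_add_file_rmap() accept hugetlb folios would mean replacing the assertion with a branch that skips the statistics, which is exactly the kind of check described above.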

Last but not least, it's all inconsistent right now with hugetlb_add_anon_rmap()/hugetlb_add_new_anon_rmap() being there because they differ reasonably well from the "ordinary" counterparts.


Having that "*hugetlb*" code already in the code base may or may not be a
good reason to move it further up the stack.

If you see a path forward in the foreseeable future where we would have code that doesn't special-case hugetlb in rmap calling code already, I'd be interested in that.

hugetlb.c knows that it's dealing with hugetlb pages.

huge_memory.c knows that it's dealing with PMD-mapped thp.

memory.c knows that it's dealing with PTE-mapped thp or small folios.

Only migrate.c and rmap.c (e.g., try_to_unmap()) handle different types. But there is plenty of hugetlb special-casing in there that I don't really see going away.
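
As a rough illustration of what that means for the few mixed-type callers: they branch once on the folio type and then call the dedicated helper. This is a userspace mock with hypothetical names (enum typed_kind, struct typed_folio, the typed_* functions), not how the unmap path is actually structured.

```c
#include <assert.h>

/* Userspace mock, NOT kernel code: all names are hypothetical. */
enum typed_kind { KIND_SMALL_OR_PTE_THP, KIND_PMD_THP, KIND_HUGETLB };

struct typed_folio {
	enum typed_kind kind;
	int mapcount;
	int hugetlb_calls; /* counts how often the hugetlb helper ran */
};

/* Helper for callers that know they are dealing with hugetlb. */
void typed_hugetlb_remove_rmap(struct typed_folio *folio)
{
	assert(folio->kind == KIND_HUGETLB);
	folio->mapcount--;
	folio->hugetlb_calls++;
}

/* Helper for callers that know they are NOT dealing with hugetlb. */
void typed_remove_rmap(struct typed_folio *folio)
{
	assert(folio->kind != KIND_HUGETLB);
	folio->mapcount--;
}

/* Models a mixed-type caller: the type decision is made once, here. */
void typed_unmap_one(struct typed_folio *folio)
{
	if (folio->kind == KIND_HUGETLB)
		typed_hugetlb_remove_rmap(folio);
	else
		typed_remove_rmap(folio);
}
```

Callers that already know the type (the hugetlb.c, huge_memory.c and memory.c cases above) would call one of the two helpers directly and never see the branch.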


Strong feelings? No, I don't have any. I'm not knowledgeable enough for that.

I'm sure you are, so I'm trusting your judgment :)

I think that going in the other direction and, e.g., removing hugetlb_add_anon_rmap() / hugetlb_add_new_anon_rmap() would be a unification that is not really reasonable. It would only make things slower and the individual functions more complicated.

--
Cheers,

David / dhildenb