Re: [PATCH v4 12/17] mm: remember exclusively mapped anonymous pages with PG_anon_exclusive

From: David Hildenbrand
Date: Tue Dec 06 2022 - 03:45:06 EST

> Hi David, sorry for the late response, and a possibly inconsequential question. :)

Better late than never! Thanks for the review, regardless of when it happens :)


> <snip>

>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 7a71ed679853..5add8bbd47cd 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -4772,7 +4772,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>> is_hugetlb_entry_hwpoisoned(entry))) {
>> swp_entry_t swp_entry = pte_to_swp_entry(entry);
>> - if (is_writable_migration_entry(swp_entry) && cow) {
>> + if (!is_readable_migration_entry(swp_entry) && cow) {
>> /*
>> * COW mappings require pages in both
>> * parent and child to be set to read.
>> @@ -5172,6 +5172,8 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
>> set_huge_ptep_writable(vma, haddr, ptep);
>> return 0;
>> }
>> + VM_BUG_ON_PAGE(PageAnon(old_page) && PageAnonExclusive(old_page),
>> + old_page);
>> /*
>> * If the process that created a MAP_PRIVATE mapping is about to
>> @@ -6169,12 +6171,17 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
>> }
>> if (unlikely(is_hugetlb_entry_migration(pte))) {
>> swp_entry_t entry = pte_to_swp_entry(pte);
>> + struct page *page = pfn_swap_entry_to_page(entry);
>> - if (is_writable_migration_entry(entry)) {
>> + if (!is_readable_migration_entry(entry)) {

> In hugetlb_change_protection(), is_writable_migration_entry() is changed to !is_readable_migration_entry(),
> but

>> pte_t newpte;
>> - entry = make_readable_migration_entry(
>> - swp_offset(entry));
>> + if (PageAnon(page))
>> + entry = make_readable_exclusive_migration_entry(
>> + swp_offset(entry));
>> + else
>> + entry = make_readable_migration_entry(
>> + swp_offset(entry));
>> newpte = swp_entry_to_pte(entry);
>> set_huge_swap_pte_at(mm, address, ptep,
>> newpte, huge_page_size(h));

> <snip>

>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index b69ce7a7b2b7..56060acdabd3 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -152,6 +152,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>> pages++;
>> } else if (is_swap_pte(oldpte)) {
>> swp_entry_t entry = pte_to_swp_entry(oldpte);
>> + struct page *page = pfn_swap_entry_to_page(entry);
>> pte_t newpte;
>> if (is_writable_migration_entry(entry)) {

> In change_pte_range(), is_writable_migration_entry() is not changed to !is_readable_migration_entry().

Yes, and also in change_huge_pmd(), is_writable_migration_entry() stays unchanged.

> Is this done intentionally? Could you tell me why there's such a difference? I'm confused. It would be
> very kind of you if you could answer my puzzle.

For change protection, the only relevant part is to convert writable -> readable or writable -> readable_exclusive.

If an entry is already readable or readable_exclusive, there is nothing to do. The only issues would be when turning a readable one into a readable_exclusive one or a readable_exclusive one into a readable one.
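
To illustrate with a sketch (just reusing the helpers visible in the hunks above, not the actual change_pte_range() body), the only transition that matters is:

	if (is_writable_migration_entry(entry)) {
		struct page *page = pfn_swap_entry_to_page(entry);

		/*
		 * Writable implies the page was exclusive, so an anonymous
		 * page has to stay flagged exclusive: demote it to
		 * readable_exclusive instead of plain readable.
		 */
		if (PageAnon(page))
			entry = make_readable_exclusive_migration_entry(
							swp_offset(entry));
		else
			entry = make_readable_migration_entry(
							swp_offset(entry));
		newpte = swp_entry_to_pte(entry);
	}
	/* readable and readable_exclusive entries are simply left alone. */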


In hugetlb_change_protection(), the "!is_readable_migration_entry()" check could in fact be turned into "is_writable_migration_entry()". Right now, it would convert writable -> readable or writable -> readable_exclusive AND readable -> readable AND readable_exclusive -> readable_exclusive, which isn't necessary but also shouldn't hurt either.
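
IOW, a tightened sketch for the hugetlb path would be:

	if (unlikely(is_hugetlb_entry_migration(pte))) {
		swp_entry_t entry = pte_to_swp_entry(pte);

		/* Only writable entries actually need a conversion. */
		if (is_writable_migration_entry(entry)) {
			...
		}
	}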


So yeah, it's not consistent but shouldn't be problematic. Do you see an issue with that?

It would be great to extend the "selftests/vm cow" test to also cover migration entries; however, that requires slightly more work and some "luck" to fork() while migration is happening. A rough idea is sketched below.
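
Completely untested sketch (hypothetical code, not part of the patch): one thread keeps bouncing a page between NUMA nodes via move_pages(2), so migration entries get installed transiently, while the main process fork()s in a loop and checks that a child write does not leak through COW:

#define _GNU_SOURCE
#include <linux/mempolicy.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

static char *mem;

/* Keep bouncing the page between node 0 and node 1 forever. */
static void *migrate_thread(void *arg)
{
	int node = 0, status;

	(void)arg;
	for (;;) {
		void *pages[1] = { mem };

		node = !node;
		/* Failures are fine; not every machine has two nodes. */
		syscall(SYS_move_pages, 0, 1UL, pages, &node, &status,
			MPOL_MF_MOVE);
	}
	return NULL;
}

int main(void)
{
	pthread_t thread;
	int i;

	mem = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (mem == MAP_FAILED)
		return 1;
	memset(mem, 0xaa, getpagesize());

	pthread_create(&thread, NULL, migrate_thread, NULL);

	/* fork() often, hoping to hit an in-flight migration entry. */
	for (i = 0; i < 100000; i++) {
		pid_t pid = fork();

		if (pid == 0) {
			/* The child's write must not leak into the parent. */
			mem[0] = 0x55;
			_exit(0);
		}
		waitpid(pid, NULL, 0);
		if (mem[0] != (char)0xaa) {
			fprintf(stderr, "FAIL: COW broken after fork()\n");
			return 1;
		}
	}
	printf("PASS\n");
	return 0;
}

Build with -lpthread; on a single-node machine move_pages() will just keep failing and the loop degrades into a plain COW test, so actually hitting the race needs a NUMA setup and some luck.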

Thanks!

--
Thanks,

David / dhildenb