Re: [PATCH] hugetlb: unshare some PMDs when splitting VMAs

From: Mike Kravetz
Date: Tue Jan 03 2023 - 17:28:39 EST


On 01/03/23 20:26, James Houghton wrote:
> > Thanks James. I am just trying to determine if we may have any issues/bugs/
> > undesired behavior based on this today. Consider the cases mentioned above:
> > mbind - I do not think this would cause any user visible issues. mbind is
> > only dealing with newly allocated pages. We do not unshare as the
> > result of a mbind call today.
> > madvise(MADV_DONTDUMP) - It looks like this results in a flag (VM_DONTDUMP)
> > being set on the vma. So, I do not believe sharing page tables
> > would cause any user visible issue.
> >
> > One somewhat strange things about two vmas after split sharing a PMD is
> > that operations on one VMA can impact the other. For example, suppose
> > A VMA split via mbind happens. Then later, mprotect is done on one of
> > the VMAs in the range that is shared. That would result in the area being
> > unshared in both VMAs. So, the 'other' vma could see minor faults after
> > the mprotect.
> >
> > Just curious if you (or anyone) knows of a user visible issue caused by this
> > today. Trying to determine if we need a Fixes: tag.
>
> I think I've come up with one... :) It only took many many hours of
> staring at code to come up with:
>
> 1. Fault in PUD_SIZE-aligned hugetlb mapping
> 2. fork() (to actually share the PMDs)
> Erm, I mean: mmap(), then fork(), then fault in both processes.
> 3. Split VMA with MADV_DONTDUMP
> 4. Register the lower piece of the newly split VMA with
> UFFDIO_REGISTER_MODE_WRITEPROTECT (this will call
> hugetlb_unshare_all_pmds, but it will not attempt to unshare in the
> unaligned bits now)
> 5. Now calling UFFDIO_WRITEPROTECT will drop into
> hugetlb_change_protection and succeed in unsharing. That will hit the
> WARN_ON_ONCE and *not write-protect anything*.
>
> I'll see if I can confirm that this is indeed possible and send a
> repro if it is.

I think your analysis above is correct. The key being the failure to unshare
in the non-PUD_SIZE vma after the split.

To me, the fact it was somewhat difficult to come up with this scenario is an
argument what we should just unshare at split time as you propose. Who
knows what other issues may exist.

> 60dfaad65a ("mm/hugetlb: allow uffd wr-protect none ptes") is the
> commit that introduced the WARN_ON_ONCE; perhaps it's a good choice
> for a Fixes: tag (if above is indeed true).

If the key issue in your above scenario is indeed the failure of
hugetlb_unshare_all_pmds in the non-PUD_SIZE vma, then perhaps we tag?

6dfeaff93be1 ("hugetlb/userfaultfd: unshare all pmds for hugetlbfs when
register wp")

--
Mike Kravetz