Re: [PATCH 1/2] mm/hugetlb: Restore the reservation if needed

From: Breno Leitao
Date: Mon Jan 29 2024 - 07:45:47 EST


On Wed, Jan 24, 2024 at 09:22:03AM +0000, Ryan Roberts wrote:
> On 17/01/2024 17:10, Breno Leitao wrote:
> > Currently there is a bug where a huge page can be stolen: when the
> > original owner later tries to fault it in, the fault fails and the
> > process receives a SIGBUS.
> >
> > You can achieve that by:
> > 1) Creating a single huge page
> > echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> >
> > 2) mmap() the page above with MAP_HUGETLB into (void *ptr1).
> > * This will mark the page as reserved
> > 3) touch the page, which causes a page fault and allocates the page
> > * This will move the page out of the free list.
> > * It will also unreserve the page, since there are no more
> > free pages
> > 4) madvise(MADV_DONTNEED) the page
> > * This will free the page, but not mark it as reserved.
> > 5) mmap(MAP_HUGETLB) a second page into (void *ptr2).
> > * This should fail, since there are no more available pages.
> > * But because the page freed above is not reserved, this
> > mmap() succeeds.
> > 6) Faulting at ptr1 will cause a SIGBUS
> > * The kernel tries to allocate a huge page, but none is
> > available
> >
> > A full reproducer is in selftest. See
> > https://lore.kernel.org/all/20240105155419.1939484-1-leitao@xxxxxxxxxx/
> >
> > Fix this by restoring the reservation if necessary: if the VMA being
> > unmapped has HPAGE_RESV_OWNER set and the page needs a reservation,
> > set the restore_reserve flag, which moves the page from the free list
> > back to the reserved list.
> >
> > Suggested-by: Rik van Riel <riel@xxxxxxxxxxx>
> > Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
> > ---
> > mm/hugetlb.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index ed1581b670d4..fa2c17767e44 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -5677,6 +5677,16 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
> > hugetlb_count_sub(pages_per_huge_page(h), mm);
> > hugetlb_remove_rmap(page_folio(page));
> >
> > + if (is_vma_resv_set(vma, HPAGE_RESV_OWNER) &&
> > + vma_needs_reservation(h, vma, start)) {
> > + /*
> > + * Restore the reservation if needed, otherwise the
> > + * backing page could be stolen by someone.
> > + */
> > + folio_set_hugetlb_restore_reserve(page_folio(page));
> > + vma_add_reservation(h, vma, address);
> > + }
> > +
> > spin_unlock(ptl);
> > tlb_remove_page_size(tlb, page, huge_page_size(h));
> > /*
>
> Hi Breno,
>
> I'm seeing a kernel bug fire when running the "map_hugetlb" mm selftest against latest mm-unstable. Bisect tells me this patch is the culprit. I'm running on arm64 with defconfig plus the following:

Hello Ryan,

Thanks for the heads-up. I was able to reproduce the problem, and I am
working on a solution.

Thanks,