Re: [PATCH 4/4] hugetlb: Do early cow when page pinned on src mm

From: Peter Xu
Date: Wed Feb 03 2021 - 17:32:09 EST


On Wed, Feb 03, 2021 at 02:04:30PM -0800, Mike Kravetz wrote:
> > @@ -3816,6 +3832,54 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> >  			}
> >  			set_huge_swap_pte_at(dst, addr, dst_pte, entry, sz);
> >  		} else {
> > +			entry = huge_ptep_get(src_pte);
> > +			ptepage = pte_page(entry);
> > +			get_page(ptepage);
> > +
> > +			if (unlikely(page_needs_cow_for_dma(vma, ptepage))) {
> > +				/* This is very possibly a pinned huge page */
> > +				if (!prealloc) {
> > +					/*
> > +					 * Preallocate the huge page without
> > +					 * tons of locks since we could sleep.
> > +					 * Note: we can't use any reservation
> > +					 * because the page will be exclusively
> > +					 * owned by the child later.
> > +					 */
> > +					put_page(ptepage);
> > +					spin_unlock(src_ptl);
> > +					spin_unlock(dst_ptl);
> > +					prealloc = alloc_huge_page(vma, addr, 0);
>
> One quick question:
>
> The comment says we can't use any reservation, and I agree. However, the
> alloc_huge_page call has 0 as the avoid_reserve argument. Shouldn't that
> be !0 to avoid reserves?

Good point. I obviously wanted to skip the reservation check, but got fooled by
the inverted name. :)

Though I did check the reservation handling, and it does not seem extremely
important, because when we fork and copy the vma we have already dropped the
vma resv map:

	if (is_vm_hugetlb_page(tmp))
		reset_vma_resv_huge_pages(tmp);

Then in alloc_huge_page() we check vma_resv_map() almost everywhere we check
avoid_reserve too (either in vma_needs_reservation(), or when calculating
deferred_reserve). So avoid_reserve seems to matter mostly when vma_resv_map()
exists.
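
To make that concrete, here is a toy userspace model of the argument. It is not
kernel code: struct toy_vma and toy_alloc_is_unreserved() are hypothetical names
made up for illustration, and the single condition is an assumption that only
mirrors the description above. The real alloc_huge_page() does more than this
(subpool and cgroup accounting, for instance).

```c
#include <stdbool.h>

/* Hypothetical stand-in for a vma: only tracks whether a resv map exists. */
struct toy_vma {
	bool has_resv_map;
};

/*
 * Toy version of the "is this allocation done without a reservation?"
 * decision.  Assumption for this sketch: the allocation is treated as
 * unreserved when the caller asks to avoid reserves, or when the vma
 * has no resv map at all.
 */
static bool toy_alloc_is_unreserved(const struct toy_vma *vma,
				    bool avoid_reserve)
{
	return avoid_reserve || !vma->has_resv_map;
}
```

In this model a vma with has_resv_map == false (the child vma after
reset_vma_resv_huge_pages()) gets the same answer for either value of
avoid_reserve, and only a vma that still has a resv map sees a difference.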

But I completely agree I should pass in "1" here in v2.

Thanks,

--
Peter Xu