Re: [PATCH v3 4/6] mm: hugetlb: add return -EAGAIN for dissolve_free_huge_page

From: Mike Kravetz
Date: Mon Jan 11 2021 - 20:22:03 EST


On 1/10/21 4:40 AM, Muchun Song wrote:
> There is a race between dissolve_free_huge_page() and put_page(),
> and the race window is quite small. Theoretically, we should return
> -EBUSY when we encounter this race. In fact, we have a chance to
> successfully dissolve the page if we do a retry. Because the race
> window is quite small. If we seize this opportunity, it is an
> optimization for increasing the success rate of dissolving page.
>
> If we free a HugeTLB page from a non-task context, it is deferred
> through a workqueue. In this case, we need to flush the work.
>
> The dissolve_free_huge_page() can be called from memory hotplug,
> the caller aims to free the HugeTLB page to the buddy allocator
> so that the caller can unplug the page successfully.
>
> Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
> ---
> mm/hugetlb.c | 26 +++++++++++++++++++++-----
> 1 file changed, 21 insertions(+), 5 deletions(-)

I am unsure about the need for this patch. The code is OK, there are no
issues with the code.

As mentioned in the commit message, this is an optimization and could
potentially cause a memory offline operation to succeed instead of fail.
However, we are very unlikely to ever exercise this code. Adding an
optimization that is unlikely to be exercised is certainly questionable.

Memory offline is the only code that could benefit from this optimization.
As someone with more memory offline user experience, what is your opinion
Michal?

--
Mike Kravetz

>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 4a9011e12175..a176ceed55f1 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1763,10 +1763,11 @@ static int free_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
> * nothing for in-use hugepages and non-hugepages.
> * This function returns values like below:
> *
> - * -EBUSY: failed to dissolved free hugepages or the hugepage is in-use
> - * (allocated or reserved.)
> - * 0: successfully dissolved free hugepages or the page is not a
> - * hugepage (considered as already dissolved)
> + * -EAGAIN: race with __free_huge_page() and can do a retry
> + * -EBUSY: failed to dissolved free hugepages or the hugepage is in-use
> + * (allocated or reserved.)
> + * 0: successfully dissolved free hugepages or the page is not a
> + * hugepage (considered as already dissolved)
> */
> int dissolve_free_huge_page(struct page *page)
> {
> @@ -1793,8 +1794,10 @@ int dissolve_free_huge_page(struct page *page)
> * We should make sure that the page is already on the free list
> * when it is dissolved.
> */
> - if (unlikely(!PageHugeFreed(head)))
> + if (unlikely(!PageHugeFreed(head))) {
> + rc = -EAGAIN;
> goto out;
> + }
>
> /*
> * Move PageHWPoison flag from head page to the raw error page,
> @@ -1813,6 +1816,14 @@ int dissolve_free_huge_page(struct page *page)
> }
> out:
> spin_unlock(&hugetlb_lock);
> +
> + /*
> + * If the freeing of the HugeTLB page is put on a work queue, we should
> + * flush the work before retrying.
> + */
> + if (unlikely(rc == -EAGAIN))
> + flush_work(&free_hpage_work);
> +
> return rc;
> }
>
> @@ -1835,7 +1846,12 @@ int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
>
> for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order) {
> page = pfn_to_page(pfn);
> +retry:
> rc = dissolve_free_huge_page(page);
> + if (rc == -EAGAIN) {
> + cpu_relax();
> + goto retry;
> + }
> if (rc)
> break;
> }
>