Re: [RFC PATCH v1 2/4] mm,hwpoison,hugetlb,memory_hotplug: hotremove memory section with hwpoisoned hugepage

From: Miaohe Lin
Date: Fri Apr 29 2022 - 04:49:25 EST


On 2022/4/27 12:28, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
>
> HWPoisoned page is not supposed to prevent memory hotremove, but
> currently this does not properly work for hwpoisoned hugepages and the
> kernel tries to migrate them, which could cause consuming corrupted
> data.
>
> Move dissolve_free_huge_pages() before scan_movable_pages(). This is
> because the result of the movable check depends on the result of the
> dissolve. Now delayed dissolve is available, so hwpoisoned hugepages
> can be turned into 4kB hwpoison page which memory hotplug can handle.
>
> And clear HPageMigratable pseudo flag for hwpoisoned hugepages. This is
> also important because dissolve_free_huge_page() can fail. So it's
> still necessary to prevent do_migrate_pages() from trying to migrate
> hwpoison hugepages.
>
> Reported-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> ---
> mm/hugetlb.c | 11 +++++++++++
> mm/memory-failure.c | 2 ++
> mm/memory_hotplug.c | 23 +++++++++++------------
> 3 files changed, 24 insertions(+), 12 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 6867ea8345d1..95b1db852ca9 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2159,6 +2159,17 @@ int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
>
> for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order) {
> page = pfn_to_page(pfn);
> +
> + if (PageHuge(page) && PageHWPoison(page)) {
> + /*
> + * Release the last refcount from hwpoison to turn into
> + * a free hugepage.
> + */
> + if (page_count(page) == 1)
> + put_page(page);
> + page = hugetlb_page_hwpoison(page);
> + }
> +

This patch looks good to me. Thanks!

One question: Can this hugepage be put into buddy system? In free_huge_page,
if h->surplus_huge_pages_node[nid] > 0, hugepage might be put into buddy via
update_and_free_page. So it's not PageHuge anymore and won't be dissolved. If
this happens, the "raw error page" is still missed and might be accessed later.
Or am I miss something?

Thanks!

> rc = dissolve_free_huge_page(page);
> if (rc)
> break;
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 73948a00ad4a..4a2e22bf0983 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1607,6 +1607,8 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
> return res == MF_RECOVERED ? 0 : -EBUSY;
> }
>
> + ClearHPageMigratable(head);
> +
> page_flags = head->flags;
>
> /*
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 416b38ca8def..4bc0590f4334 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1864,6 +1864,17 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>
> cond_resched();
>
> + /*
> + * Dissolve free hugepages in the memory block before doing
> + * offlining actually in order to make hugetlbfs's object
> + * counting consistent.
> + */
> + ret = dissolve_free_huge_pages(start_pfn, end_pfn);
> + if (ret) {
> + reason = "failure to dissolve huge pages";
> + goto failed_removal_isolated;
> + }
> +
> ret = scan_movable_pages(pfn, end_pfn, &pfn);
> if (!ret) {
> /*
> @@ -1879,19 +1890,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
> goto failed_removal_isolated;
> }
>
> - /*
> - * Dissolve free hugepages in the memory block before doing
> - * offlining actually in order to make hugetlbfs's object
> - * counting consistent.
> - */
> - ret = dissolve_free_huge_pages(start_pfn, end_pfn);
> - if (ret) {
> - reason = "failure to dissolve huge pages";
> - goto failed_removal_isolated;
> - }
> -
> ret = test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE);
> -
> } while (ret);
>
> /* Mark all sections offline and remove free pages from the buddy. */
>