Re: [PATCH] mm/memory-failure: fix hardware poison check in unpoison_memory()

From: Naoya Horiguchi
Date: Mon Jul 17 2023 - 20:14:23 EST


On Mon, Jul 17, 2023 at 11:18:12AM -0700, Sidhartha Kumar wrote:
> It was pointed out[1] that using folio_test_hwpoison() is wrong
> as we need to check the individual page that has poison.
> folio_test_hwpoison() only checks the head page so go back to using
> PageHWPoison().
>
> Reported-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> Fixes: a6fddef49eef ("mm/memory-failure: convert unpoison_memory() to folios")
> Cc: stable@xxxxxxxxxxxxxxx #v6.4
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@xxxxxxxxxx>
>
> [1]: https://lore.kernel.org/lkml/ZLIbZygG7LqSI9xe@xxxxxxxxxxxxxxxxxxxx/
> ---
> mm/memory-failure.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 02b1d8f104d51..a114c8c3039cd 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -2523,7 +2523,7 @@ int unpoison_memory(unsigned long pfn)
> goto unlock_mutex;
> }
>
> - if (!folio_test_hwpoison(folio)) {
> + if (!PageHWPoison(p)) {


I don't think this works for hwpoisoned hugetlb pages, which have PageHWPoison
set on the head page rather than on the raw subpage. For hwpoisoned thps it is
the other way around: PageHWPoison is set on the raw subpage, not on the head
page. (I believe the problem went undetected because unpoisoning a hwpoisoned
thp is a rare case that no one considers.) Perhaps is_page_hwpoison() would be
useful for this purpose?
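
To make the two cases concrete, here is a rough userspace model, not kernel
code. Every name in it (struct model_page, model_raw_check() and so on) is
made up for illustration, and model_combined_check() only reflects my
understanding of what a check covering both cases would need to do, so please
treat it as a sketch under those assumptions:

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these types or helpers exist in the kernel. */
enum model_kind { MODEL_BASE, MODEL_HUGETLB, MODEL_THP };

struct model_page {
	bool hwpoison;            /* stands in for the PG_hwpoison bit */
	enum model_kind kind;     /* kind of folio this page belongs to */
	struct model_page *head;  /* head page of the folio */
};

/* Models the check in the patch: looks only at the raw page's own flag. */
static bool model_raw_check(const struct model_page *p)
{
	return p->hwpoison;
}

/* Models the check being replaced: looks only at the head page's flag. */
static bool model_head_check(const struct model_page *p)
{
	return p->head->hwpoison;
}

/* Assumed combined check: raw page flag, or head page flag for hugetlb. */
static bool model_combined_check(const struct model_page *p)
{
	if (p->hwpoison)
		return true;
	return p->kind == MODEL_HUGETLB && p->head->hwpoison;
}

/* Builds a two-page folio of the given kind with pages[0] as the head. */
static void model_init_folio(struct model_page pages[2], enum model_kind kind)
{
	for (size_t i = 0; i < 2; i++) {
		pages[i].hwpoison = false;
		pages[i].kind = kind;
		pages[i].head = &pages[0];
	}
}

/* Records an error that hit pages[1], following the behaviour described above. */
static void model_poison_tail(struct model_page pages[2])
{
	if (pages[0].kind == MODEL_HUGETLB)
		pages[0].hwpoison = true;  /* hugetlb: flag lands on the head page */
	else
		pages[1].hwpoison = true;  /* thp/base: flag lands on the raw page */
}
```

In this model, with the error on the tail page, model_raw_check() misses the
hugetlb case and model_head_check() misses the thp case, while the combined
check reports both.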

Thanks,
Naoya Horiguchi

> unpoison_pr_info("Unpoison: Page was already unpoisoned %#lx\n",
> pfn, &unpoison_rs);
> goto unlock_mutex;
> --
> 2.41.0
>
>
>