Re: [PATCH] mm/memory-failure: fix hardware poison check in unpoison_memory()

From: Sidhartha Kumar
Date: Tue Jul 18 2023 - 10:33:18 EST


On 7/17/23 5:39 PM, Naoya Horiguchi wrote:
On Tue, Jul 18, 2023 at 09:14:09AM +0900, Naoya Horiguchi wrote:
On Mon, Jul 17, 2023 at 11:18:12AM -0700, Sidhartha Kumar wrote:
It was pointed out[1] that using folio_test_hwpoison() is wrong
as we need to check the indiviual page that has poison.
folio_test_hwpoison() only checks the head page so go back to using
PageHWPoison().

Reported-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
Fixes: a6fddef49eef ("mm/memory-failure: convert unpoison_memory() to folios")
Cc: stable@xxxxxxxxxxxxxxx #v6.4
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@xxxxxxxxxx>

[1]: https://lore.kernel.org/lkml/ZLIbZygG7LqSI9xe@xxxxxxxxxxxxxxxxxxxx/
---
mm/memory-failure.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 02b1d8f104d51..a114c8c3039cd 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2523,7 +2523,7 @@ int unpoison_memory(unsigned long pfn)
goto unlock_mutex;
}
- if (!folio_test_hwpoison(folio)) {
+ if (!PageHWPoison(p)) {


I don't think this works for hwpoisoned hugetlb pages that have PageHWPoison
set on the head page, rather than on the raw subpage. In the case of
hwpoisoned thps, PageHWPoison is set on the raw subpage, not on the head
pages. (I believe this is not detected because no one considers the
scenario of unpoisoning hwpoisoned thps, which is a rare case). Perhaps the
function is_page_hwpoison() would be useful for this purpose?

Sorry, I was wrong. Checking PageHWPoison() is fine because the users of
unpoison should know where the PageHWPoison is set via /proc/kpageflags.
So this patch is OK to me after comments from other reviewers are resolved.


Hi Naoya,

While taking a closer at the patch, later in unpoison_memory() there is also:

- ret = TestClearPageHWPoison(page) ? 0 : -EBUSY;
+ ret = folio_test_clear_hwpoison(folio) ? 0 : -EBUSY;

I thought this folio conversion would be safe because page is the result of a compound_head() call but I'm wondering if the same issue exists here and we should be calling TestClearPageHWPoison() on the specific subpage by doing TestClearPageHWPoison(p).

Thanks,
Sidhartha Kumar

Acked-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>

Thanks,
Naoya Horiguchi