Re: [PATCH v4 4/4] selftests/mm: add tests for HWPOISON hugetlbfs read

From: Sidhartha Kumar
Date: Thu Jan 11 2024 - 13:12:02 EST


On 1/11/24 10:03 AM, Matthew Wilcox wrote:
On Thu, Jan 11, 2024 at 09:51:47AM -0800, Sidhartha Kumar wrote:
On 1/11/24 9:34 AM, Jiaqi Yan wrote:
- if (!folio_test_has_hwpoisoned(folio))
+ if (!folio_test_hwpoison(folio))

Sidhartha, just curious why this change is needed? Does
PageHasHWPoisoned change after commit
"a08c7193e4f18dc8508f2d07d0de2c5b94cb39a3"?

No, it's not an issue with PageHasHWPoisoned(). The original code was
testing for the wrong flag: I realized that has_hwpoisoned and hwpoison are
two different flags. The memory-failure code calls folio_test_set_hwpoison()
to set the hwpoison flag and does not set the has_hwpoisoned flag. When
debugging, I realized this if statement was never true despite the code
hitting folio_test_set_hwpoison(). Now we are testing the correct flag.

From page-flags.h

#ifdef CONFIG_MEMORY_FAILURE
PG_hwpoison, /* hardware poisoned page. Don't touch */
#endif

folio_test_hwpoison() checks this flag ^^^

/* At least one page in this folio has the hwpoison flag set */
PG_has_hwpoisoned = PG_error,

while folio_test_has_hwpoisoned() checks this flag ^^^

So what you're saying is that hugetlb behaves differently from THP
with how memory-failure sets the flags?

I think so. In memory_failure(), THP goes through this path:

	hpage = compound_head(p);
	if (PageTransHuge(hpage)) {
		/*
		 * The flag must be set after the refcount is bumped
		 * otherwise it may race with THP split.
		 * And the flag can't be set in get_hwpoison_page() since
		 * it is called by soft offline too and it is just called
		 * for !MF_COUNT_INCREASED. So here seems to be the best
		 * place.
		 *
		 * Don't need care about the above error handling paths for
		 * get_hwpoison_page() since they handle either free page
		 * or unhandlable page. The refcount is bumped iff the
		 * page is a valid handlable page.
		 */
		SetPageHasHWPoisoned(hpage);

which sets the has_hwpoisoned flag, while hugetlb goes through
folio_set_hugetlb_hwpoison(), which calls folio_test_set_hwpoison().
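
To make the difference concrete, here is a minimal userspace sketch of the two paths. It is not kernel code: every model_* name and both MODEL_* bits are hypothetical stand-ins I made up for illustration, and the model only tracks the two flags on the head of a folio, ignoring any per-subpage state. It assumes the behavior described above, i.e. the THP path sets has_hwpoisoned and the hugetlb path does a test-and-set of hwpoison.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model, NOT kernel code. The two bits stand in
 * for PG_hwpoison and PG_has_hwpoisoned; the real bit positions differ.
 */
#define MODEL_HWPOISON		(1u << 0)
#define MODEL_HAS_HWPOISONED	(1u << 1)

struct model_folio {
	unsigned int flags;	/* head-page flags only */
};

/* Stand-in for folio_test_hwpoison(). */
static bool model_test_hwpoison(const struct model_folio *f)
{
	return (f->flags & MODEL_HWPOISON) != 0;
}

/* Stand-in for folio_test_has_hwpoisoned(). */
static bool model_test_has_hwpoisoned(const struct model_folio *f)
{
	return (f->flags & MODEL_HAS_HWPOISONED) != 0;
}

/* THP path as quoted above: only has_hwpoisoned is set on the head. */
static void model_mark_thp(struct model_folio *f)
{
	f->flags |= MODEL_HAS_HWPOISONED;
}

/*
 * hugetlb path as described above: test-and-set of hwpoison.
 * Returns the previous value of the flag.
 */
static bool model_mark_hugetlb(struct model_folio *f)
{
	bool was_set = model_test_hwpoison(f);

	f->flags |= MODEL_HWPOISON;
	return was_set;
}

/* Walk both paths and check which test sees the poison. */
static int model_demo(void)
{
	struct model_folio thp = { 0 };
	struct model_folio huge = { 0 };

	model_mark_thp(&thp);
	assert(!model_mark_hugetlb(&huge));	/* flag was clear before */

	/* The old check (has_hwpoisoned) fires for THP but not hugetlb. */
	assert(model_test_has_hwpoisoned(&thp));
	assert(!model_test_has_hwpoisoned(&huge));

	/* The new check (hwpoison) fires for the hugetlb folio. */
	assert(model_test_hwpoison(&huge));
	return 0;
}
```

Calling model_demo() from a main() should pass all of its assertions under these assumptions, which matches what I saw when debugging: the has_hwpoisoned test never fired for the hugetlb folio, while the hwpoison test does.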