[bug report] mm, hwpoison: memory_failure races with alloc_fresh_huge_page/free_huge_page

From: Miaohe Lin
Date: Mon Aug 01 2022 - 22:01:21 EST


Hi all:
While going through the mm/memory-failure.c code again, I found a possible race window
between memory_failure and alloc_fresh_huge_page/free_huge_page. Consider the scenario below:

CPU 1                                                   CPU 2
alloc_fresh_huge_page -- page refcnt > 0                memory_failure
 prep_new_huge_page                                      get_huge_page_for_hwpoison
                                                          !PageHeadHuge -- so 2 (not a hugepage) is returned
  hugetlb_vmemmap_optimize -- subpages are read-only
  set_compound_page_dtor -- PageHuge is true now, but too late!!!
                                                         TestSetPageHWPoison(p)
                                                          -- We might write to read-only subpages here!!!

Another similar scenario:

CPU 1                                                   CPU 2
free_huge_page -- page refcnt == 0 and not PageHuge     memory_failure
                                                         get_huge_page_for_hwpoison
                                                          !PageHeadHuge -- so 2 (not a hugepage) is returned
                                                         TestSetPageHWPoison(p)
                                                          -- We might write to read-only subpages here!!!
 hugetlb_vmemmap_restore -- subpages can be written to now, but too late!!!
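
To make the two interleavings easier to reason about, here is a small userspace toy model that replays them step by step. It is only a sketch and not kernel code: struct model_page and every model_*/replay_* name are hypothetical, and the model assumes that the only relevant state is "does the page look like a hugepage" and "are its subpages currently read-only".

```c
#include <stdbool.h>

/* Userspace toy model. Every name here is hypothetical, not a kernel API. */
struct model_page {
	bool page_huge;       /* stands in for the PageHeadHuge/PageHuge test */
	bool tail_read_only;  /* stands in for vmemmap-optimized subpages */
	bool hwpoison;        /* the poison flag was set on a writable subpage */
	bool wrote_read_only; /* the flag write hit a read-only subpage */
};

/* CPU 2, step 1: classify the page; 2 means "not a hugepage". */
int model_classify(const struct model_page *p)
{
	return p->page_huge ? 0 : 2;
}

/* CPU 2, step 2: set the poison flag on the subpage (non-hugetlb path). */
void model_set_hwpoison(struct model_page *p)
{
	if (p->tail_read_only)
		p->wrote_read_only = true;
	else
		p->hwpoison = true;
}

/* CPU 1, allocation path: optimize the vmemmap, then mark the page huge. */
void model_prep_new_huge_page(struct model_page *p)
{
	p->tail_read_only = true; /* models hugetlb_vmemmap_optimize */
	p->page_huge = true;      /* models set_compound_page_dtor */
}

/* CPU 1, free path: make the subpages writable again. */
void model_vmemmap_restore(struct model_page *p)
{
	p->tail_read_only = false; /* models hugetlb_vmemmap_restore */
}

/* First scenario: CPU 1 prepares the page between CPU 2's two steps. */
bool replay_alloc_race(void)
{
	struct model_page p = { 0 };   /* allocated, not yet prepared */
	int res = model_classify(&p);  /* CPU 2 sees "not a hugepage" */

	model_prep_new_huge_page(&p);  /* CPU 1 runs inside the window */
	if (res == 2)
		model_set_hwpoison(&p); /* CPU 2 acts on the stale result */
	return p.wrote_read_only;
}

/* Second scenario: PageHuge is already clear, vmemmap not yet restored. */
bool replay_free_race(void)
{
	struct model_page p = { .page_huge = false, .tail_read_only = true };
	int res = model_classify(&p);  /* CPU 2 sees "not a hugepage" */

	if (res == 2)
		model_set_hwpoison(&p); /* write lands before the restore */
	model_vmemmap_restore(&p);      /* CPU 1 restores, but too late */
	return p.wrote_read_only;
}
```

In this model both replay_alloc_race() and replay_free_race() return true, i.e. the flag write lands while the subpages are still read-only. It says nothing about how likely the window is on real hardware.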

I think the above scenarios are possible, but I can't find a reliable way to fix them. Any suggestions?
Or is this not worth fixing because it's too rare? Or am I missing something?

Any response would be appreciated!

Thanks!