Re: [PATCH v1 1/3] mm/hwpoison: find subpage in hugetlb HWPOISON list

From: Mike Kravetz
Date: Tue Jun 20 2023 - 18:40:00 EST


On 06/20/23 11:05, Mike Kravetz wrote:
> On 06/19/23 17:23, Naoya Horiguchi wrote:
> >
> > Considering this issue as one specific to memory error handling, checking
> > HPG_vmemmap_optimized in __get_huge_page_for_hwpoison() might be helpful to
> > detect the race. Then, an idea like the below diff (not tested) can make
> > try_memory_failure_hugetlb() retry (retaking hugetlb_lock) to wait for the
> > allocation of the vmemmap pages to complete.
> >
> > @@ -1938,8 +1938,11 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
> >  	int ret = 2;	/* fallback to normal page handling */
> >  	bool count_increased = false;
> >
> > -	if (!folio_test_hugetlb(folio))
> > +	if (!folio_test_hugetlb(folio)) {
> > +		if (folio_test_hugetlb_vmemmap_optimized(folio))
> > +			ret = -EBUSY;
> >  		goto out;
> > +	}
>
> The hugetlb-specific page flags (HPG_vmemmap_optimized here) reside in
> the folio->private field.
>
> When the folio is not a hugetlb folio, folio->private can hold any
> arbitrary value, so the test for vmemmap_optimized may return a false
> positive. We could end up retrying for an arbitrarily long time.
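>
> For reference, the flag test is just a test_bit() on folio->private;
> paraphrasing the TESTHPAGEFLAG() macro from include/linux/hugetlb.h:
>
>     #define TESTHPAGEFLAG(uname, flname)                        \
>     static __always_inline                                      \
>     bool folio_test_hugetlb_##flname(struct folio *folio)      \
>     {                                                           \
>         /* hugetlb page flags live in the folio private word */ \
>         void *private = &folio->private;                        \
>                                                                 \
>         return test_bit(HPG_##flname, private);                 \
>     }
>
> So for a non-hugetlb folio, whatever value happens to occupy that
> word decides the result of the test.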
>
> I am looking at how to restructure the code that removes and frees
> hugetlb pages so that folio_test_hugetlb() remains true until the
> vmemmap pages are allocated. The easiest way to do this would be to
> introduce another hugetlb lock/unlock cycle in the page freeing path,
> which would undo some of the speedups in this series:
> https://lore.kernel.org/all/20210409205254.242291-4-mike.kravetz@xxxxxxxxxx/T/#m34321fbcbdf8bb35dfe083b05d445e90ecc1efab
>

Perhaps something like this? Minimal testing.
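
The rough idea, as a sketch only (untested; __clear_hugetlb_destructor()
below is an illustrative helper name, not an existing function), is to
defer clearing the hugetlb destructor in __update_and_free_hugetlb_folio()
until after the vmemmap pages have been allocated:

static void __update_and_free_hugetlb_folio(struct hstate *h,
					    struct folio *folio)
{
	bool clear_dtor = folio_test_hugetlb_vmemmap_optimized(folio);
	...
	/*
	 * Do not clear the hugetlb destructor yet, so that
	 * folio_test_hugetlb() stays true until the vmemmap pages
	 * have been allocated.
	 */
	if (hugetlb_vmemmap_restore(h, &folio->page)) {
		/* allocation failed: put the page back on the free list */
		...
		return;
	}

	/*
	 * Vmemmap pages are allocated; the folio can now safely stop
	 * being a hugetlb folio. Clearing the destructor must happen
	 * under hugetlb_lock, hence the extra lock/unlock cycle.
	 */
	if (clear_dtor) {
		spin_lock_irq(&hugetlb_lock);
		__clear_hugetlb_destructor(h, folio);
		spin_unlock_irq(&hugetlb_lock);
	}
	...
}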