Re: [PATCH] fix hugetlbfs hwpoison handling

From: Miaohe Lin
Date: Thu Jan 11 2024 - 22:02:22 EST


On 2024/1/12 3:16, Sidhartha Kumar wrote:
> has_extra_refcount() makes the assumption that a ref count of 1 means
> the page is not referenced by other users. Commit a08c7193e4f1
> (mm/filemap: remove hugetlb special casing in filemap.c) modifies
> __filemap_add_folio() by calling folio_ref_add(folio, nr); for all cases
> (including hugetlb) where nr is the number of pages in the folio. We
> should check if the page is not referenced by other users by checking
> the page count against the number of pages rather than 1.

Thanks for your patch.

>
> In hugetlbfs_read_iter(), folio_test_has_hwpoisoned() is testing the wrong
> flag as, in the hugetlb case, memory-failure code calls
> folio_test_set_hwpoison() to indicate poison. folio_test_hwpoison() is the
> correct function to test for that flag.
>
> After these fixes, the hugetlb hwpoison read selftest passes all cases.
>
> Fixes: a08c7193e4f1 ("mm/filemap: remove hugetlb special casing in filemap.c")
> Closes: https://lore.kernel.org/linux-mm/20230713001833.3778937-1-jiaqiyan@xxxxxxxxxx/T/#m8e1469119e5b831bbd05d495f96b842e4a1c5519
> Cc: <stable@xxxxxxxxxxxxxxx> # 6.7+
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@xxxxxxxxxx>
> Reported-by: Muhammad Usama Anjum <usama.anjum@xxxxxxxxxxxxx>
> Tested-by: Muhammad Usama Anjum <usama.anjum@xxxxxxxxxxxxx>
> ---
> fs/hugetlbfs/inode.c | 2 +-
> mm/memory-failure.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 36132c9125f9..3a248e4f7e93 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -340,7 +340,7 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
> } else {
> folio_unlock(folio);
>
> - if (!folio_test_has_hwpoisoned(folio))
> + if (!folio_test_hwpoison(folio))
> want = nr;
> else {
> /*
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index d8c853b35dbb..87f6bf7d8bc1 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -973,7 +973,7 @@ struct page_state {
> static bool has_extra_refcount(struct page_state *ps, struct page *p,
> bool extra_pins)
> {
> - int count = page_count(p) - 1;
> + int count = page_count(p) - folio_nr_pages(page_folio(p));

IIRC, the refcnt of 1 subtracted here is the reference held by memory-failure itself. So I think
it shouldn't be changed to folio_nr_pages(page_folio(p)).

>
> if (extra_pins)
> count -= 1;

Indeed, @extra_pins indicates whether the hugetlb page is kept in the page cache. So the page cache
refcnt of 'folio_nr_pages(page_folio(p))' might be what should be subtracted here instead of 1.
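
To make the accounting above concrete, here is a rough userspace sketch of how the check might look
under that suggestion. It is not kernel code: struct folio_model and model_has_extra_refcount() are
hypothetical names, the two fields only stand in for page_count(p) and folio_nr_pages(page_folio(p)),
and the "count > 0 means extra users" return convention is an assumption of this model.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the state the check looks at. */
struct folio_model {
	int refcount;	/* models page_count(p) */
	int nr_pages;	/* models folio_nr_pages(page_folio(p)) */
};

/*
 * Sketch of the suggested accounting, assuming:
 *  - memory-failure itself always holds exactly one reference;
 *  - a folio kept in the page cache holds nr_pages references,
 *    since __filemap_add_folio() does folio_ref_add(folio, nr).
 * Returns true if references remain that neither of those explains.
 */
static bool model_has_extra_refcount(const struct folio_model *f,
				     bool extra_pins)
{
	int count = f->refcount - 1;	/* drop memory-failure's own ref */

	if (extra_pins)
		count -= f->nr_pages;	/* drop the page cache refs */

	return count > 0;
}
```

For example, with an illustrative 512-page hugetlb folio in the page cache, a refcount of 513 (512 from
the page cache plus 1 from memory-failure) leaves count at 0, so no extra users are reported. The
original "count -= 1" for @extra_pins would leave 511 and report extra users.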

Thanks.