Re: [PATCH 1/4] mm/swapfile: fix wrong swap entry type for hwpoisoned swapcache page

From: Matthew Wilcox
Date: Sun Jul 16 2023 - 22:54:23 EST


On Mon, Jul 17, 2023 at 10:33:14AM +0800, Miaohe Lin wrote:
> On 2023/7/15 11:50, Matthew Wilcox wrote:
> > On Sat, Jul 15, 2023 at 11:17:26AM +0800, Miaohe Lin wrote:
> >> A hwpoisoned dirty swap cache page is kept in the swap cache, and there's
> >> simple interception code in do_swap_page() to catch it. But on swapoff,
> >> unuse_pte() is unaware of this type of page and wrongly installs a generic
> >> "future accesses are invalid" swap entry for the hwpoisoned swap cache
> >> page. The user will then receive a SIGBUS signal without the expected
> >> BUS_MCEERR_AR payload.
> >
> > Have you observed this, or do you just think it's true?
> >
> >> +++ b/mm/swapfile.c
> >> @@ -1767,7 +1767,8 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
> >>  	swp_entry_t swp_entry;
> >>
> >>  	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
> >> -	if (hwposioned) {
> >> +	/* Hwpoisoned swapcache page is also !PageUptodate. */
> >> +	if (hwposioned || PageHWPoison(page)) {
> >
> > This line makes no sense to me. How do we get here with PageHWPoison()
> > being true and hwposioned being false?
>
> hwposioned will be true iff ksm_might_need_to_copy returns -EHWPOISON.
> And there's a PageUptodate check in ksm_might_need_to_copy before we can return -EHWPOISON:
>
> ksm_might_need_to_copy
> 	if (!PageUptodate(page))
> 		return page;	/* let do_swap_page report the error */
> 		^^^
> Will return here because a hwpoisoned swapcache page is !PageUptodate (cleared via me_swapcache_dirty()).
>
> Or am I missing something?

Ah! So we don't even get to calling copy_mc_to_kernel(). That seems
like a bug in ksm_might_need_to_copy(), don't you think? Maybe this
would be a better fix:

+	if (PageHWPoison(page))
+		return ERR_PTR(-EHWPOISON);
 	if (!PageUptodate(page))
 		return page;
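
To make the ordering argument concrete, here is a userspace sketch, not kernel code. Everything prefixed mock_ is a hypothetical stand-in for the kernel helper of similar name, the errno value is illustrative, and the copy path is reduced to a single flag check. It models only which check fires first:

```c
/* Userspace mock, NOT kernel code: models only the order of the checks. */
#include <stdbool.h>
#include <stdint.h>

#define MOCK_EHWPOISON	133	/* illustrative errno value (assumption) */
#define MOCK_MAX_ERRNO	4095

struct mock_page {
	bool hwpoison;	/* stands in for PageHWPoison(page) */
	bool uptodate;	/* stands in for PageUptodate(page) */
};

/* Minimal imitations of ERR_PTR(), IS_ERR() and PTR_ERR(). */
static inline void *mock_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline bool mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MOCK_MAX_ERRNO;
}

static inline long mock_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Current order: a !uptodate page is returned before poison is noticed. */
static inline struct mock_page *check_current(struct mock_page *page)
{
	if (!page->uptodate)
		return page;	/* let do_swap_page report the error */
	/* Simplification: the copy step is where -EHWPOISON surfaces today. */
	if (page->hwpoison)
		return mock_err_ptr(-MOCK_EHWPOISON);
	return page;
}

/* Proposed order: test for poison first, then for !uptodate. */
static inline struct mock_page *check_proposed(struct mock_page *page)
{
	if (page->hwpoison)
		return mock_err_ptr(-MOCK_EHWPOISON);
	if (!page->uptodate)
		return page;
	return page;
}
```

For a page with hwpoison set and uptodate clear (the me_swapcache_dirty() case), check_current() hands the page back unchanged, so the caller never sees an error, while check_proposed() returns the encoded -MOCK_EHWPOISON. A clean, uptodate page passes through both unchanged.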