Re: split_huge_page_to_list() races with page_mapcount() on migration entry in smaps code? [was: Re: [syzbot] kernel BUG in __page_mapcount]

From: Matthew Wilcox
Date: Mon Jun 07 2021 - 14:04:22 EST


On Mon, Jun 07, 2021 at 07:27:23PM +0200, Jann Horn wrote:
> === Short summary ===
> I believe the issue here is a race between /proc/*/smaps and
> split_huge_page_to_list():
>
> The codepath for /proc/*/smaps walks the pagetables and (e.g. in
> smaps_account()) calls page_mapcount() not just on pages from normal
> PTEs but also on migration entries (since commit b1d4d9e0cbd0a
> "proc/smaps: carefully handle migration entries", from Linux v3.5).
> page_mapcount() expects compound pages to be stable.
>
> The split_huge_page_to_list() path first protects the compound page by
> locking it and replacing all its PTEs with migration entries (since
> the THP rewrite in v4.5, I think?), then does the actual splitting
> using __split_huge_page().
>
> So there's a mismatch of expectations here:
> The smaps code expects that migration entries point to stable compound
> pages, while the THP code expects that it's okay to split a compound
> page while it has migration entries.

Will it be a colossal performance penalty if we always take a reference
on the page after looking it up? The elevated refcount would make
split_huge_page() fail to split the page if it hits this race.
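
To make the refcount interaction concrete, here is a rough userspace
model of the idea. It is not kernel code and not a proposed patch: every
toy_* name is hypothetical, and the helpers only imitate the general
behaviour of get_page_unless_zero() and page_ref_freeze() using C11
atomics. The assumption being illustrated is that the split path freezes
the refcount at an exact expected value, so a reference held by a
concurrent walker makes the freeze (and therefore the split) fail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct page; only the refcount is modelled. */
struct toy_page {
	atomic_int refcount;
};

/*
 * Models get_page_unless_zero(): take a reference unless the count is
 * already zero (i.e. the page is frozen or being freed).
 */
static bool toy_get_page_unless_zero(struct toy_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken by toy_get_page_unless_zero(). */
static void toy_put_page(struct toy_page *p)
{
	atomic_fetch_sub(&p->refcount, 1);
}

/*
 * Models page_ref_freeze(): succeed only if the refcount is exactly
 * 'expected', atomically replacing it with zero.
 */
static bool toy_ref_freeze(struct toy_page *p, int expected)
{
	int e = expected;

	return atomic_compare_exchange_strong(&p->refcount, &e, 0);
}

/* Toy split: fails (like -EBUSY) if anyone holds an unexpected ref. */
static bool toy_split(struct toy_page *p, int expected)
{
	if (!toy_ref_freeze(p, expected))
		return false;
	/* ... the actual splitting would happen here ... */
	atomic_store(&p->refcount, expected);	/* unfreeze */
	return true;
}
```

In this model, a walker that calls toy_get_page_unless_zero() before
inspecting the page and toy_put_page() afterwards either pins the page
(so toy_split() fails for the duration) or sees a frozen page and has to
skip it. Whether the extra atomic operations are affordable on the smaps
path is exactly the open question.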