Re: [syzbot] [fs?] WARNING in pagemap_scan_pmd_entry

From: Andrei Vagin
Date: Thu Nov 16 2023 - 10:38:19 EST


On Wed, Nov 15, 2023 at 4:53 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> Hi, Andrei, Muhammad,
>
> I had a look (as it triggered the guard I added before..), and I think I
> know what happened. So far I think it's a question to the new ioctl()
> interface, which I'd like to double check with you all. See below.
>
> On Wed, Nov 15, 2023 at 01:07:18PM -0800, Andrei Vagin wrote:
> > Cc: Peter and Muhammad
> >
> > On Wed, Nov 15, 2023 at 6:41 AM syzbot
> > <syzbot+e94c5aaf7890901ebf9b@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit: c42d9eeef8e5 Merge tag 'hardening-v6.7-rc2' of git://git.k..
> > > git tree: upstream
> > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=13626650e80000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=84217b7fc4acdc59
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=e94c5aaf7890901ebf9b
> > > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15d73be0e80000
> > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13670da8e80000
> > >
> > > Downloadable assets:
> > > disk image: https://storage.googleapis.com/syzbot-assets/a595d90eb9af/disk-c42d9eee.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/c1e726fedb94/vmlinux-c42d9eee.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/cb43ae262d09/bzImage-c42d9eee.xz
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+e94c5aaf7890901ebf9b@xxxxxxxxxxxxxxxxxxxxxxxxx
> > >
> > > ------------[ cut here ]------------
> > > WARNING: CPU: 1 PID: 5071 at arch/x86/include/asm/pgtable.h:403 pte_uffd_wp arch/x86/include/asm/pgtable.h:403 [inline]
>
> This is the guard I added to detect writable bit set even if uffd-wp bit is
> not yet cleared. It means something obviously wrong happened.
>
> Here afaict the wrong thing is ioctl(PAGEMAP_SCAN) allows applying uffd-wp
> bit to VMA that is not even registered with userfault. Then what happened
> is when the page is written, do_wp_page() will try to reuse the anonymous
> page with the uffd-wp bit set, set W bit on top of it.

Thank you for looking at this.

>
> Below change works for me:
>
> ===8<===
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index ef2eb12906da..8a2500fa4580 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -1987,6 +1987,12 @@ static int pagemap_scan_test_walk(unsigned long start, unsigned long end,
>  		vma_category |= PAGE_IS_WPALLOWED;
>  	else if (p->arg.flags & PM_SCAN_CHECK_WPASYNC)
>  		return -EPERM;
> +	else
> +		/*
> +		 * Neither has the VMA enabled WP tracking, nor does the
> +		 * user want to explicit fail the walk. Skip the vma.
> +		 */
> +		return 1;

In this case, I think we need to check the PM_SCAN_WP_MATCHING flag
and skip these VMAs only if it is set.

If PM_SCAN_WP_MATCHING isn't set, this ioctl returns page flags and
can be used without the intention of tracking memory changes.
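
To make the suggestion concrete, here is a rough userspace model of the
decision I have in mind. It is only a sketch, not a tested kernel patch:
model_test_walk() and the MODEL_* flag values are hypothetical stand-ins
for illustration, and wp_allowed stands for whatever check decides that
the VMA has async uffd-wp enabled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in flag bits for this model only (assumed, not taken from headers). */
#define MODEL_PM_SCAN_WP_MATCHING	(1u << 0)
#define MODEL_PM_SCAN_CHECK_WPASYNC	(1u << 1)

/* Return values follow the walk convention: 0 = walk the VMA, 1 = skip it. */
enum { MODEL_WALK_VMA = 0, MODEL_SKIP_VMA = 1 };

/*
 * Hypothetical model of the per-VMA decision:
 *  - VMA allows async write-protect: walk it.
 *  - Otherwise, the user asked for a strict check: fail with -EPERM.
 *  - Otherwise, the user asked to write-protect: skip the VMA, so the
 *    uffd-wp bit is never applied to an unregistered VMA.
 *  - Otherwise, the request only reads page flags: walk the VMA.
 */
static int model_test_walk(bool wp_allowed, unsigned int flags)
{
	if (wp_allowed)
		return MODEL_WALK_VMA;
	if (flags & MODEL_PM_SCAN_CHECK_WPASYNC)
		return -EPERM;
	if (flags & MODEL_PM_SCAN_WP_MATCHING)
		return MODEL_SKIP_VMA;
	return MODEL_WALK_VMA;
}

/* Small self-check covering the four cases for an unregistered VMA. */
static void model_self_check(void)
{
	assert(model_test_walk(false, MODEL_PM_SCAN_CHECK_WPASYNC) == -EPERM);
	assert(model_test_walk(false, MODEL_PM_SCAN_WP_MATCHING) == MODEL_SKIP_VMA);
	assert(model_test_walk(false, 0) == MODEL_WALK_VMA);
	assert(model_test_walk(true, MODEL_PM_SCAN_WP_MATCHING) == MODEL_WALK_VMA);
}
```

The only difference from the proposed hunk is the last branch: a scan
that does not request write-protection still walks the VMA and can
report its page flags.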

>
>  	if (vma->vm_flags & VM_PFNMAP)
>  		return 1;
> ===8<===
>
> This is based on my reading of the pagemap scan flags:
>
> - Write-protect the pages. The ``PM_SCAN_WP_MATCHING`` is used to write-protect
> the pages of interest. The ``PM_SCAN_CHECK_WPASYNC`` aborts the operation if
> non-Async Write Protected pages are found. The ``PM_SCAN_WP_MATCHING`` can be
> used with or without ``PM_SCAN_CHECK_WPASYNC``.
>
> If PM_SCAN_CHECK_WPASYNC is used to enforce the check, we need to skip the
> vma that is not registered properly. Does it look reasonable to you?

I think the idea here could be to report page flags but not
write-protect such pages.

Thanks,
Andrei