Re: [syzbot] [fs?] WARNING in pagemap_scan_pmd_entry

From: Peter Xu
Date: Wed Nov 15 2023 - 19:53:40 EST


Hi, Andrei, Muhammad,

I had a look (as it triggered the guard I added before), and I think I
know what happened. So far I think it's a question about the new ioctl()
interface, which I'd like to double check with you all. See below.

On Wed, Nov 15, 2023 at 01:07:18PM -0800, Andrei Vagin wrote:
> Cc: Peter and Muhammad
>
> On Wed, Nov 15, 2023 at 6:41 AM syzbot
> <syzbot+e94c5aaf7890901ebf9b@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: c42d9eeef8e5 Merge tag 'hardening-v6.7-rc2' of git://git.k..
> > git tree: upstream
> > console+strace: https://syzkaller.appspot.com/x/log.txt?x=13626650e80000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=84217b7fc4acdc59
> > dashboard link: https://syzkaller.appspot.com/bug?extid=e94c5aaf7890901ebf9b
> > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15d73be0e80000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13670da8e80000
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/a595d90eb9af/disk-c42d9eee.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/c1e726fedb94/vmlinux-c42d9eee.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/cb43ae262d09/bzImage-c42d9eee.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+e94c5aaf7890901ebf9b@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 1 PID: 5071 at arch/x86/include/asm/pgtable.h:403 pte_uffd_wp arch/x86/include/asm/pgtable.h:403 [inline]

This is the guard I added to detect the writable bit being set while the
uffd-wp bit has not yet been cleared. It means something obviously went wrong.

Here, afaict, the wrong thing is that ioctl(PAGEMAP_SCAN) allows applying the
uffd-wp bit to a VMA that is not even registered with userfaultfd. Then, when
the page is written, do_wp_page() will try to reuse the anonymous page with
the uffd-wp bit still set, and set the W bit on top of it.
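
To make the sequence concrete, here is a toy user-space model of it. This is
not kernel code: every name and bit value below is made up for illustration,
and the registered-VMA fault path is deliberately left unmodelled. It only
shows how marking a PTE on an unregistered VMA and then taking a write fault
ends up with both bits set, which is the state the guard warns about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy PTE flag bits; illustrative values, not the x86 layout. */
#define TOY_PTE_WRITE   (1u << 0)
#define TOY_PTE_UFFD_WP (1u << 1)

/* Hypothetical stand-in for the one VMA property that matters here. */
struct toy_vma {
	bool uffd_wp_registered;
};

/* The invariant the guard checks: never writable and uffd-wp at once. */
static bool toy_pte_invariant_ok(unsigned int pte)
{
	return !((pte & TOY_PTE_UFFD_WP) && (pte & TOY_PTE_WRITE));
}

/* Models the unpatched scan: mark uffd-wp without consulting the VMA. */
static unsigned int toy_scan_wp(unsigned int pte)
{
	return (pte | TOY_PTE_UFFD_WP) & ~TOY_PTE_WRITE;
}

/* Models a write fault on an anonymous page. */
static unsigned int toy_write_fault(const struct toy_vma *vma, unsigned int pte)
{
	assert(vma != NULL);

	/* Registered VMA: left to the userfaultfd path (not modelled). */
	if (vma->uffd_wp_registered && (pte & TOY_PTE_UFFD_WP))
		return pte;

	/* Unregistered VMA: the page is reused and W is set on top. */
	return pte | TOY_PTE_WRITE;
}
```

Feeding toy_scan_wp() output into toy_write_fault() with an unregistered VMA
breaks the invariant; with a registered VMA it holds.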

Below change works for me:

===8<===
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ef2eb12906da..8a2500fa4580 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1987,6 +1987,12 @@ static int pagemap_scan_test_walk(unsigned long start, unsigned long end,
 		vma_category |= PAGE_IS_WPALLOWED;
 	else if (p->arg.flags & PM_SCAN_CHECK_WPASYNC)
 		return -EPERM;
+	else
+		/*
+		 * Neither has the VMA enabled WP tracking, nor does the
+		 * user want to explicit fail the walk. Skip the vma.
+		 */
+		return 1;
 
 	if (vma->vm_flags & VM_PFNMAP)
 		return 1;
===8<===
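
For reference, the three outcomes after the change can be written out as a
small stand-alone sketch. It is a user-space approximation only: the function
name, the SKETCH_* constants and their values are placeholders rather than the
kernel's definitions, and the real function goes on to further checks (such as
the VM_PFNMAP one) that are omitted here. The return convention follows the
page walker's test_walk callback: 0 walks the VMA, 1 skips it, and a negative
value aborts the walk.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag values for illustration only. */
#define SKETCH_PM_SCAN_CHECK_WPASYNC (1u << 1)
#define SKETCH_PAGE_IS_WPALLOWED     (1u << 0)

/*
 * wp_allowed: whether the VMA has async WP tracking enabled.
 * flags:      the caller's scan flags.
 * Returns 0 to walk the VMA, 1 to skip it, -EPERM to abort.
 */
static int sketch_test_walk(bool wp_allowed, unsigned int flags,
			    unsigned int *vma_category)
{
	assert(vma_category != NULL);

	if (wp_allowed) {
		*vma_category |= SKETCH_PAGE_IS_WPALLOWED;
		return 0;
	}

	/* The user asked for a hard failure on untracked VMAs. */
	if (flags & SKETCH_PM_SCAN_CHECK_WPASYNC)
		return -EPERM;

	/* Not tracked and no hard failure requested: skip the VMA. */
	return 1;
}
```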

This is based on my reading of the pagemap scan flags:

- Write-protect the pages. The ``PM_SCAN_WP_MATCHING`` is used to write-protect
  the pages of interest. The ``PM_SCAN_CHECK_WPASYNC`` aborts the operation if
  non-Async Write Protected pages are found. The ``PM_SCAN_WP_MATCHING`` can be
  used with or without ``PM_SCAN_CHECK_WPASYNC``.

If PM_SCAN_CHECK_WPASYNC is what enforces the check, then without it we need
to skip any vma that is not registered properly. Does that look reasonable to
you?

Thanks,

--
Peter Xu