Re: [syzbot] [mm?] BUG: unable to handle kernel paging request in copy_from_kernel_nofault

From: Jann Horn
Date: Fri Dec 08 2023 - 09:12:22 EST


On Tue, Nov 21, 2023 at 6:13 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Sun, Nov 19 2023 at 09:53, syzbot wrote:
> > HEAD commit: 1fda5bb66ad8 bpf: Do not allocate percpu memory at init st..
> > git tree: bpf
> > console+strace: https://syzkaller.appspot.com/x/log.txt?x=12d99420e80000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=2ae0ccd6bfde5eb0
> > dashboard link: https://syzkaller.appspot.com/bug?extid=72aa0161922eba61b50e
> > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16dff22f680000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1027dc70e80000
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/3e24d257ce8d/disk-1fda5bb6.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/eaa9caffb0e4/vmlinux-1fda5bb6.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/16182bbed726/bzImage-1fda5bb6.xz
> >
> > The issue was bisected to:
> >
> > commit ca247283781d754216395a41c5e8be8ec79a5f1c
> > Author: Andy Lutomirski <luto@xxxxxxxxxx>
> > Date: Wed Feb 10 02:33:45 2021 +0000
> >
> > x86/fault: Don't run fixups for SMAP violations
>
> Reverting that makes the Oops go away, but wrongly so.
>
> > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=103d92db680000
> > final oops: https://syzkaller.appspot.com/x/report.txt?x=123d92db680000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=143d92db680000
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+72aa0161922eba61b50e@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Fixes: ca247283781d ("x86/fault: Don't run fixups for SMAP violations")
> >
> > BUG: unable to handle page fault for address: ffffffffff600000
>
> This is VSYSCALL_ADDR.
>
> So the real question is why the BPF program tries to copy from the
> VSYSCALL page, which is not mapped.

The linked syz repro is:

r0 = bpf$PROG_LOAD(0x5, &(0x7f00000000c0)={0x11, 0xb,
&(0x7f0000000180)=@framed={{}, [@printk={@integer, {}, {}, {}, {},
{0x7, 0x0, 0xb, 0x3, 0x0, 0x0, 0xff600000}, {0x85, 0x0, 0x0, 0x71}}]},
&(0x7f0000000200)='GPL\x00', 0x0, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
0x90)
bpf$BPF_RAW_TRACEPOINT_OPEN(0x11,
&(0x7f0000000540)={&(0x7f0000000000)='kfree\x00', r0}, 0x10)

So syzkaller generated a BPF tracing program. 0x85 is BPF_JMP |
BPF_CALL, the opcode used to invoke BPF helpers; 0x71 is 113, the
number of the probe_read_kernel helper, which takes an arbitrary value
as input, casts it to a kernel pointer, and then probe-reads from it.
The instruction before that is some kind of ALU op with 0xff600000 as
its immediate.
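
To make that decoding concrete, here is a minimal C sketch of the
arithmetic. It assumes the ALU op is a 64-bit one that sign-extends its
32-bit immediate (the repro does not spell out which op it is), and the
function and macro names are made up for illustration; only the numeric
values come from the repro and the oops.

```c
#include <assert.h>
#include <stdint.h>

/* eBPF instruction encoding: class and opcode bits for a helper call. */
#define BPF_JMP   0x05
#define BPF_CALL  0x80

/* Helper number as it appears in the repro (hypothetical macro name). */
#define HELPER_ID_FROM_REPRO 0x71

/* Fault address from the oops, identified in this thread as VSYSCALL_ADDR. */
#define FAULT_ADDR 0xffffffffff600000ULL

/*
 * Assumption: a 64-bit ALU op with an immediate operand sign-extends the
 * 32-bit immediate to 64 bits. Done with plain unsigned arithmetic so the
 * result does not depend on signed conversion behavior.
 */
uint64_t sign_extend_imm32(uint32_t imm)
{
	uint64_t high = (imm & 0x80000000u) ? 0xffffffff00000000ULL : 0;

	return high | (uint64_t)imm;
}

/* Checks the three numeric claims made above. */
void self_check_decode(void)
{
	assert((BPF_JMP | BPF_CALL) == 0x85);
	assert(HELPER_ID_FROM_REPRO == 113);
	assert(sign_extend_imm32(0xff600000u) == FAULT_ADDR);
}
```

If that assumption holds, the immediate in the repro turns into exactly
the faulting address from the report.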

So it looks like the answer to that question is "the BPF program tries
to copy from the VSYSCALL page because syzkaller decided to write BPF
code that does specifically that, and the BPF helper let it do that".

copy_from_kernel_nofault() does check
copy_from_kernel_nofault_allowed() to make sure the pointer really is
a kernel pointer, and the x86 version of that rejects anything in the
userspace part of the address space. But it does not know about the
vsyscall area.
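
As a rough userspace model of that gap (not the kernel code): the
constants below assume x86-64 with 4 KiB pages and a 47-bit user half,
and every name prefixed with sketch_/SKETCH_ is hypothetical. The first
function models "reject userspace addresses only"; the second shows what
an additional vsyscall-page check might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only, not copied from kernel headers. */
#define SKETCH_PAGE_SIZE     0x1000ULL
#define SKETCH_USER_END      0x0000800000000000ULL /* end of user half */
#define SKETCH_VSYSCALL_ADDR 0xffffffffff600000ULL /* address from the oops */

/* Models the check as described above: only userspace is rejected. */
bool sketch_allowed_as_described(uint64_t addr)
{
	return addr >= SKETCH_USER_END;
}

/* Hypothetical variant that also rejects anything in the vsyscall page. */
bool sketch_allowed_with_vsyscall_check(uint64_t addr)
{
	if ((addr & ~(SKETCH_PAGE_SIZE - 1)) == SKETCH_VSYSCALL_ADDR)
		return false;

	return sketch_allowed_as_described(addr);
}

void self_check_vsyscall(void)
{
	/* A userspace pointer like the ones in the repro is rejected. */
	assert(!sketch_allowed_as_described(0x00007f00000000c0ULL));
	/* The vsyscall address passes the userspace-only check... */
	assert(sketch_allowed_as_described(SKETCH_VSYSCALL_ADDR));
	/* ...but not the variant that knows about the vsyscall page. */
	assert(!sketch_allowed_with_vsyscall_check(SKETCH_VSYSCALL_ADDR));
	assert(!sketch_allowed_with_vsyscall_check(SKETCH_VSYSCALL_ADDR + 0x10));
}
```

In this model the vsyscall address looks like any other kernel address
to the first check, which matches the behavior in the report.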