Re: [syzbot] linux-next boot error: BUG: unable to handle kernel paging request in kernel_execve

From: Kees Cook
Date: Fri Aug 12 2022 - 14:44:29 EST


On Fri, Aug 12, 2022 at 11:29:44AM +0200, Dmitry Vyukov wrote:
> On Fri, 12 Aug 2022 at 02:11, Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
> >
> > On Thu, Aug 11, 2022 at 02:00:59PM -0700, Kees Cook wrote:
> > > On Thu, Aug 11, 2022 at 11:51:34AM -0700, Ira Weiny wrote:
> > > > On Thu, Aug 11, 2022 at 10:39:29AM -0700, Ira wrote:
> > > > > On Thu, Aug 11, 2022 at 08:33:16AM -0700, Kees Cook wrote:
> > > > > > Hi Fabio,
> > > > > >
> > > > > > It seems likely that the kmap change[1] might be causing this crash. Is
> > > > > > there a boot-time setup race between kmap being available and early umh
> > > > > > usage?
> > > > >
> > > > > I don't see how this is a setup problem with the config reported here.
> > > > >
> > > > > CONFIG_64BIT=y
> > > > >
> > > > > ...and HIGHMEM is not set.
> > > > > ...and PREEMPT_RT is not set.
> > > > >
> > > > > So the kmap_local_page() call in that stack should be a page_address() only.
> > > > >
> > > > > I think the issue must be some sort of race which was being prevented because
> > > > > of the preemption and/or pagefault disable built into kmap_atomic().
> > > > >
> > > > > > Is this reproducible?
> > > > >
> > > > > The hunk below will surely fix it but I think the pagefault_disable() is
> > > > > the only thing that is required. It would be nice to test it.
> > > >
> > > > Fabio and I discussed this. And he also mentioned that pagefault_disable() is
> > > > all that is required.
> > >
> > > Okay, sounds good.
> > >
> > > > Do we have a way to test this?
> > >
> > > It doesn't look like syzbot has a reproducer yet, so its patch testing
> > > system[1] will not work. But if you can send me a patch, I could land it
> > > in -next and we could see if the reproduction frequency drops to zero.
> > > (Looking at the dashboard, it's seen 2 crashes, most recently 8 hours
> > > ago.)
> >
> > Patch sent.
> >
> > https://lore.kernel.org/lkml/20220812000919.408614-1-ira.weiny@xxxxxxxxx/

Thank you!

> >
> > But I'm more confused after looking at this again.
>
> There is a splat of random crashes in linux-next that happened at the same time:
>
> https://groups.google.com/g/syzkaller-bugs/search?q=%22linux-next%20boot%20error%3A%22
>
> There are 10 different crashes in completely random places.
> I would assume they have the same root cause, some silent memory
> corruption or something similar.

Yeah, I noticed the crashes stopped "on their own", so I think I'll
wait a bit more, and if they start back up, we can try Ira's patch, though
I'd agree with the assessment that it looks like it shouldn't be needed.
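
For reference, here is a rough userspace sketch of the difference discussed
above for a 64-bit config with HIGHMEM and PREEMPT_RT unset. It is only a
model, not the kernel code: the model_* functions, the two depth counters and
the simplified struct page are hypothetical stand-ins, assuming that
kmap_atomic() amounts to preempt_disable() + pagefault_disable() +
page_address(), while kmap_local_page() is page_address() only.

```c
/* Userspace model only: these are NOT the kernel implementations. */
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: a "page" that simply owns its own memory. */
struct page {
	char data[4096];
};

/* Hypothetical counters modelling preempt/pagefault disable nesting. */
static int preempt_disabled_depth;
static int pagefault_disabled_depth;

/* Without HIGHMEM every page has a permanent kernel address. */
static void *page_address(struct page *p)
{
	return p->data;
}

/* Model of kmap_atomic(): address lookup plus two side effects. */
static void *model_kmap_atomic(struct page *p)
{
	preempt_disabled_depth++;	/* models preempt_disable() */
	pagefault_disabled_depth++;	/* models pagefault_disable() */
	return page_address(p);
}

/* Model of kunmap_atomic(): undo the side effects in reverse order. */
static void model_kunmap_atomic(void *addr)
{
	(void)addr;
	assert(pagefault_disabled_depth > 0 && preempt_disabled_depth > 0);
	pagefault_disabled_depth--;	/* models pagefault_enable() */
	preempt_disabled_depth--;	/* models preempt_enable() */
}

/* Model of kmap_local_page(): address lookup only, no side effects. */
static void *model_kmap_local_page(struct page *p)
{
	return page_address(p);
}

/* Model of kunmap_local(): nothing to undo in this configuration. */
static void model_kunmap_local(void *addr)
{
	(void)addr;
}
```

In this model both mappings return the same address, so the only thing a
conversion from kmap_atomic() to kmap_local_page() could change is the
preempt/pagefault state of the code running in between.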

-Kees

--
Kees Cook