Re: perf: perf_fuzzer quickly locks up on 4.15-rc7

From: Peter Zijlstra
Date: Tue Jan 09 2018 - 11:06:05 EST


On Tue, Jan 09, 2018 at 10:56:52AM -0500, Vince Weaver wrote:
> On Tue, 9 Jan 2018, Peter Zijlstra wrote:
>
> > So CONFIG_PAGE_TABLE_ISOLATION=y and booting with "pti=off" makes it
> > 'work', right?
>
> yes. Previously I was changing CONFIG_PAGE_TABLE_ISOLATION and
> recompiling, but just now I booted with it set to yes and pti=off and the
> fuzzer has been running fine for a half hour (usually it crashes in under
> 5 minutes).
>
> I did see these in the logs which I don't think I've seen before.
>
> WARNING: stack recursion on stack type 2
> WARNING: can't dereference iret registers at 000000000783fea8 for ip paranoid_entry+0x2e/0x90
> WARNING: can't dereference registers at 00000000f0698d17 for ip paranoid_entry+0x4c/0x90
> WARNING: stack going in the wrong direction? ip=native_sched_clock+0x9/0x90

I've seen that last one, but not the ones before. Josh, this isn't
healthy, right? :-)

> > The below is always my first try to get something out of the machine,
> > after that it's printk() stuffing code to see how far we get..
> >
> > In particular I'd start instrumenting the NMI entry_64.S code, because
> > that's really the biggest difference between PTI and !PTI :/ all rather
> > bothersome I'm afraid.
>
> I'll try that next. Also getting a few other machines up and into a
> state that I can start fuzzing on them.
>
> (extra challenge, the lab my machines are in possibly has a leak in the
> roof, and they're calling for an inch of rain on top of 3 feet of
> existing snow so I might have to shut everything down and relocate on
> short notice).

Ouch, good luck with that!