Re: [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

From: Ingo Molnar
Date: Sat Mar 12 2016 - 11:03:01 EST



* Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:

> On Thu, Oct 1, 2015 at 12:15 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> > * Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> >
> >> > These could still be open coded in an inlined fashion, like the scheduler usage.
> >>
> >> We could have a raw_rdmsr for those.
> >>
> >> OTOH, I'm still not 100% convinced that this warn-but-don't-die behavior is
> >> worth the effort. This isn't a frequent source of bugs to my knowledge, and we
> >> don't try to recover from incorrect cr writes, out-of-bounds MMIO, etc, so do we
> >> really gain much by rigging a recovery mechanism for rdmsr and wrmsr failures
> >> for code that doesn't use the _safe variants?
> >
> > It's just the general principle really: don't crash the kernel on bootup. There
> > are few things more user-hostile than that.
> >
> > Also, this would maintain the status quo: since we now (accidentally) don't crash
> > the kernel on distro kernels (but silently and unsafely ignore the faulting
> > instruction), we should not regress that behavior (by adding the chance to crash
> > again), but improve upon it.
>
> Just a heads up: the extable improvements in tip:ras/core make it
> straightforward to get the best of all worlds: explicit failure
> handling (written in C!), no fast path overhead whatsoever, and no new
> garbage in the exception handlers.

I _knew_ I should have merged them into tip:x86/mm, not tip:ras/core ;-)

I had a quick look at your new MSR series and I'm very happy with that direction!

Thanks,

Ingo