Re: [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from userspace

From: Andy Lutomirski
Date: Tue Nov 11 2014 - 17:40:37 EST


On Tue, Nov 11, 2014 at 2:33 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Tue, Nov 11, 2014 at 02:12:18PM -0800, Andy Lutomirski wrote:
>> I don't see why it would be any more likely for the normal kernel
>> stack to be corrupted due to a hardware issue that interrupted ring 3
>> code than that the IST stack is corrupted.
>
> The IST stack is, well, used solely for the vectors it is
> assigned for. Maybe the probability of it getting bad is a bit
> lower..., who knows.
>
>> I don't know what, if anything, masks and unmasks #MC, but certainly
>> switching to process context like this patch does will not unmask it.
>
> Manuals say to clear MCG_STATUS[MCIP] before you return but you also
> have to IRET. Because not having cleared MCIP and returning would shut
> down the machine on another #MC.

I wonder what the IRET is for. There had better not be another magic
IRET unmask thing. I'm guessing that the actual semantics are that
nothing whatsoever can mask #MC, but that a second #MC when MCIP is
still set is a shutdown condition.
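
To make that guess concrete, here is a tiny user-space model of the
semantics I have in mind. It is only a sketch of my assumption, not
documented hardware behavior, and every name in it (cpu_model,
deliver_mc, clear_mcip) is made up for illustration; none of them are
kernel or hardware interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model only: these are not kernel or hardware APIs. */
enum mc_outcome { MC_HANDLED, MC_SHUTDOWN };

struct cpu_model {
	bool mcip;	/* stands in for MCG_STATUS[MCIP] */
};

/*
 * Deliver a #MC. In this model nothing can mask it: the only thing
 * that matters is whether MCIP is still set from an earlier #MC.
 */
static enum mc_outcome deliver_mc(struct cpu_model *cpu)
{
	assert(cpu);

	if (cpu->mcip)
		return MC_SHUTDOWN;	/* second #MC while MCIP is set */

	cpu->mcip = true;		/* set when the #MC is delivered */
	return MC_HANDLED;
}

/* The handler clears MCIP once it is done with the MC registers. */
static void clear_mcip(struct cpu_model *cpu)
{
	assert(cpu);
	cpu->mcip = false;
}
```

If that model is right, IRET plays no role at all: the only state
that decides between "handled" and "shutdown" is MCIP.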

>
> But then what does it bring me to run on the kernel stack if I'm still
> in atomic context and I can't take locks? That doesn't help me with the
> memory_failure() thing.

Define "atomic".

You're still running with irqs off and MCIP set. At some point,
you're presumably done with all of the machine check registers, and
you can clear MCIP. Now, if current == victim, you can enable irqs
and do whatever you want.

In my mind, the benefit is that you don't need to think about how to
save your information and arrange to get called back the next time
that the victim task is in a non-atomic context, since you *are* the
victim task and you're running in normal irqs-disabled kernel mode.

In contrast, with the current entry code, if you enable IRQs or do
anything that could sleep, you're on the wrong stack, so you'll crash.
That means that taking mutexes, even after clearing MCIP, is
impossible.
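
The difference between the two cases can be sketched as a small
user-space model. This is an illustration under my own assumptions,
not the patch itself: cpu_state, on_task_stack, handle_mce_from_user
and the mce_action values are all hypothetical names, and the real
handler would read the machine check banks where the comment says so.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model only: none of these names exist in the kernel. */
struct task { int id; };

struct cpu_state {
	bool irqs_enabled;
	bool mcip;			/* stands in for MCG_STATUS[MCIP] */
	bool on_task_stack;		/* true: switched to the kernel stack;
					 * false: still on the IST stack */
	const struct task *current;
};

enum mce_action { MCE_HANDLED_INLINE, MCE_DEFERRED };

static enum mce_action handle_mce_from_user(struct cpu_state *cpu,
					    const struct task *victim)
{
	/* Entry state: irqs off, MCIP set. */
	assert(!cpu->irqs_enabled && cpu->mcip);

	/* ... read the machine check registers here ... */

	cpu->mcip = false;		/* done with the MC registers */

	if (cpu->on_task_stack && cpu->current == victim) {
		/* Ordinary process context: sleeping is now fine. */
		cpu->irqs_enabled = true;
		return MCE_HANDLED_INLINE;
	}

	/*
	 * On the IST stack (or with a different victim) we must not
	 * enable irqs or sleep, so the work has to be saved and run
	 * later from the victim's context.
	 */
	return MCE_DEFERRED;
}
```

With on_task_stack set and current == victim the model returns
MCE_HANDLED_INLINE with irqs enabled; in every other case it leaves
irqs off and defers, which is the bookkeeping the stack switch is
meant to avoid.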

--Andy