Re: [RFC] x86_64: A real proposal for iret-less return to kernel

From: Andy Lutomirski
Date: Wed May 21 2014 - 19:07:43 EST


On Wed, May 21, 2014 at 4:05 PM, Luck, Tony <tony.luck@xxxxxxxxx> wrote:
> On Wed, May 21, 2014 at 03:39:11PM -0700, Andy Lutomirski wrote:
>> But if we get a new MCE in here, it will be an MCE from kernel context
>> and it's fatal. So, yes, we'll clobber the stack, but we'll never
>> return (unless tolerant is set to something insane), so who cares?
>
> Remember that machine checks are broadcast. So some other cpu
> can hit a recoverable machine check in user mode ... but that int#18
> goes everywhere. Other cpus are innocent bystanders ... they will
> see MCG_STATUS.RIPV=1, MCG_STATUS.EIPV=0 and nothing important
> in any of their machine check banks.
>
> But if we are still finishing off processing the previous machine check,
> this will be a nested one - and BOOM, we are dead.

Oh. Well, crap.

FWIW, this means that there really is a problem if one of these #MC
errors hits an innocent bystander who just happens to be handling an
NMI, at least if we delete the nested NMI code. But I think my
simplified proposal gets this right.
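
To make the bystander case above concrete, here is a rough sketch of how
the MCG_STATUS bits Tony mentions could be decoded. This is an
illustration only, not kernel code: the bit positions follow the SDM's
IA32_MCG_STATUS layout, but the enum, the function names and the
banks_have_error parameter are hypothetical and made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCG_STATUS bit positions as documented in the Intel SDM. */
#define MCG_STATUS_RIPV   (1ULL << 0)  /* restart IP valid */
#define MCG_STATUS_EIPV   (1ULL << 1)  /* error IP valid */
#define MCG_STATUS_MCIP   (1ULL << 2)  /* machine check in progress */
#define MCG_STATUS_LMCE_S (1ULL << 3)  /* local (non-broadcast) #MC signaled */

/* Hypothetical classification, named here for illustration only. */
enum mc_role { MC_BYSTANDER, MC_AFFECTED };

/*
 * A CPU that merely received the broadcast sees RIPV=1, EIPV=0 and
 * nothing important in its own machine check banks. The caller is
 * assumed to have scanned the banks already (banks_have_error).
 */
static enum mc_role mc_classify(uint64_t mcg_status, bool banks_have_error)
{
    bool ripv = (mcg_status & MCG_STATUS_RIPV) != 0;
    bool eipv = (mcg_status & MCG_STATUS_EIPV) != 0;

    if (ripv && !eipv && !banks_have_error)
        return MC_BYSTANDER;
    return MC_AFFECTED;
}

/*
 * The nested case: if a new #MC is delivered while MCIP is still set
 * from the previous one, the CPU shuts down instead of taking it.
 */
static bool mc_nested_is_fatal(uint64_t mcg_status)
{
    return (mcg_status & MCG_STATUS_MCIP) != 0;
}
```

With these definitions, mc_classify(MCG_STATUS_RIPV, false) reports a
bystander, while mc_nested_is_fatal() is true for any status word that
still has MCIP set, whichever CPU originally hit the error.
MCG_STATUS_LMCE_S is defined only to show where the non-broadcast bit
from Tony's note sits; the sketch does not use it.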

>
> -Tony
>
> [If you peer closely at the latest edition of the SDM - you'll see the
> bits are defined for a non-broadcast model ... e.g. LMCE_S bit in
> MCG_STATUS .... but currently shipping silicon doesn't use that]



--
Andy Lutomirski
AMA Capital Management, LLC