Re: [RFC] x86_64: A real proposal for iret-less return to kernel

From: Andy Lutomirski
Date: Tue May 20 2014 - 22:39:55 EST


On Tue, May 20, 2014 at 7:27 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> On Tue, 2014-05-20 at 17:53 -0700, Andy Lutomirski wrote:
>>
>> If there's an NMI on the stack, we must use `RET` until we're ready
>> to re-enable NMIs.
>
> I'm a little confused by NMI on the stack. Do you mean NMI on the target
> stack? If so, please state that.

I mean that we're in an NMI handler or in anything nested inside it.


>> * We can add a per-cpu variable `nmi_mce_nest_count` that is nonzero
>> whenever an NMI or MCE is on the stack. We'll increment it at the
>> very beginning of the NMI handler and clear it at the very end.
>> We will also increment it in `do_machine_check` before doing
>> anything that can cause an interrupt. The result is that the only
>> interrupt that can happen with `nmi_mce_nest_count == 0` in NMI
>> context is an MCE at the beginning or end of the NMI handler.
>
> Just note that this will probably be done in the C code, as NMI has
> issues with gs being safe.
>
> Also, should we call it "nmi" specifically? Perhaps
> "ist_stack_nest_count", stating that the stack is ist to match
> do_machine_check as well? Maybe that's not a good name either. Someone
> else can come up with something that's a little more generic than NMI?

So the issue here is that we can have an NMI followed immediately by
an MCE. The MCE code can call force_sig, which could plausibly result
in a kprobe or something similar happening. The return from that
needs to use RET, because an IRET there would re-enable NMIs while
the NMI handler is still on the stack.

Since I don't see a clean way to reliably detect that we're inside an
NMI, I propose instead detecting when we're in *either* NMI or MCE,
hence the name. As long as we mark do_machine_check and whatever asm
code calls it __kprobes, I think we'll be okay.
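
For illustration only, here is a rough user-space model of that
counter. It is a sketch under assumptions, not kernel code: a plain
global stands in for the per-cpu variable, the function names are
made up, and decrementing on MCE exit is my assumption (the proposal
above only says the NMI handler clears the count at its very end).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-cpu nmi_mce_nest_count; one
 * global models a single CPU. */
static unsigned int nmi_mce_nest_count;

/* Modelled "very beginning of the NMI handler". */
static void nmi_enter_model(void)
{
	nmi_mce_nest_count++;
}

/* Modelled "very end of the NMI handler": the count is cleared. */
static void nmi_exit_model(void)
{
	nmi_mce_nest_count = 0;
}

/* Modelled do_machine_check entry, before anything that can cause
 * an interrupt. */
static void mce_enter_model(void)
{
	nmi_mce_nest_count++;
}

/* Assumption: MCE exit undoes its own increment. */
static void mce_exit_model(void)
{
	nmi_mce_nest_count--;
}

/* A return to kernel must avoid IRET (and use RET) whenever an NMI
 * or MCE is on the stack, i.e. whenever the count is nonzero. */
static bool iret_forbidden(void)
{
	return nmi_mce_nest_count != 0;
}
```

In the NMI-then-MCE case, a kprobe hit inside the MCE path sees a
nonzero count and so would return with RET. Only after the NMI
handler clears the count does IRET become permissible again.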

--Andy