Re: [PATCH v2 5/8] x86/sev-es: Leave NMI-mode before sending signals

From: Peter Zijlstra
Date: Wed May 19 2021 - 13:57:02 EST


On Wed, May 19, 2021 at 03:52:48PM +0200, Joerg Roedel wrote:
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -1343,9 +1343,10 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
> return;
> }
>
> + instrumentation_begin();
> +
> irq_state = irqentry_nmi_enter(regs);
> lockdep_assert_irqs_disabled();
> - instrumentation_begin();
>
> /*
> * This is invoked through an interrupt gate, so IRQs are disabled. The

That's just plain wrong. No instrumentation is allowed before you enter
the exception context.

> @@ -1395,13 +1396,19 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
> BUG();
> }
>
> -out:
> - instrumentation_end();
> irqentry_nmi_exit(regs, irq_state);
> + instrumentation_end();

And this can't be right either, same issue, no instrumentation is
allowed after you leave the exception context.

>
> return;
>
> fail:
> + /*
> + * Leave NMI mode - the GHCB is not busy anymore and depending on where
> + * the #VC came from this code is about to either kill the task (when in
> + * task context) or kill the machine.
> + */
> + irqentry_nmi_exit(regs, irq_state);
> +

And this is wrong too; because at this point the handler doesn't run in
_any_ context anymore, certainly not one you can call regular C code
from.

> if (user_mode(regs)) {
> /*
> * Do not kill the machine if user-space triggered the
> @@ -1423,7 +1430,9 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
> panic("Returned from Terminate-Request to Hypervisor\n");
> }
>
> - goto out;
> + instrumentation_end();
> +
> + return;
> }


You either get to do what MCE does, or what MCE does. That is, either
use task_work or MCE_USER and have the _user() handler use
irqentry_enter_from_user_mode().

The above is an absolute no-go.