Re: [PATCHv2 04/29] x86/traps: Add #VE support for TDX guest

From: Sean Christopherson
Date: Tue Feb 01 2022 - 16:27:00 EST


On Tue, Feb 01, 2022, Thomas Gleixner wrote:
> On Mon, Jan 24 2022 at 18:01, Kirill A. Shutemov wrote:
> > diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
> > index df0fa695bb09..1da074123c16 100644
> > --- a/arch/x86/kernel/idt.c
> > +++ b/arch/x86/kernel/idt.c
> > @@ -68,6 +68,9 @@ static const __initconst struct idt_data early_idts[] = {
> >  	 */
> >  	INTG(X86_TRAP_PF, asm_exc_page_fault),
> >  #endif
> > +#ifdef CONFIG_INTEL_TDX_GUEST
> > +	INTG(X86_TRAP_VE, asm_exc_virtualization_exception),
> > +#endif
> >
> > +bool tdx_get_ve_info(struct ve_info *ve)
> > +{
> > +	struct tdx_module_output out;
> > +
> > +	/*
> > +	 * NMIs and machine checks are suppressed. Before this point any
> > +	 * #VE is fatal. After this point (TDGETVEINFO call), NMIs and
> > +	 * additional #VEs are permitted (but it is expected not to
> > +	 * happen unless kernel panics).
>
> I really do not understand that comment. #NMI and #MC are suppressed
> according to the above. How long are they suppressed and what's the
> mechanism? Are they unblocked on return from __tdx_module_call() ?

TDX_GET_VEINFO is a call into the TDX module to get the data from the #VE info
struct pointed at by the VMCS. Doing TDX_GET_VEINFO also clears the "valid" flag
in the struct. It's basically a CMPXCHG on the #VE info struct, except that it
routes through the TDX module.

The TDX module treats virtual NMIs as blocked if the #VE valid flag is set, i.e.
it refuses to inject an NMI until the guest does TDX_GET_VEINFO to retrieve the
info for the last #VE.

I don't understand the blurb about #MC. Unless things have changed, the TDX module
doesn't support injecting #MC into the guest.

> What prevents a nested #VE? If it happens what makes it fatal? Is it
> converted to a #DF or detected by software?

A #VE that would occur is morphed to a #DF by the TDX module if the #VE info valid
flag is already set. But nested #VE should work, so long as the nested #VE happens
after TDX_GET_VEINFO.
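
To make the ordering concrete, here is a toy userspace model of the valid flag
as described above. It is a sketch only, not TDX module or kernel code: every
name in it (struct ve_area, model_raise_ve(), model_get_veinfo(),
model_inject_nmi(), the enum values) is hypothetical, and it assumes the module
simply holds an NMI pending rather than dropping it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names are hypothetical, none come from the TDX module. */
enum delivery { DELIVER_VE, DELIVER_DF, NMI_INJECTED, NMI_PENDING };

struct ve_area {
	bool valid;			/* models the "#VE info valid" flag */
	unsigned long exit_reason;	/* models the stored #VE info */
};

/* A #VE condition arises: morph to #DF if the previous info is unconsumed. */
static enum delivery model_raise_ve(struct ve_area *a, unsigned long reason)
{
	if (a->valid)
		return DELIVER_DF;

	a->exit_reason = reason;
	a->valid = true;
	return DELIVER_VE;
}

/* Models TDX_GET_VEINFO: copy the info out and clear the valid flag. */
static bool model_get_veinfo(struct ve_area *a, unsigned long *reason)
{
	if (!a->valid)
		return false;

	*reason = a->exit_reason;
	a->valid = false;
	return true;
}

/* An NMI is held back (assumed: kept pending) while the valid flag is set. */
static enum delivery model_inject_nmi(const struct ve_area *a)
{
	return a->valid ? NMI_PENDING : NMI_INJECTED;
}

/* Walk through the sequence discussed in this thread; returns 0 on success. */
static int model_selftest(void)
{
	struct ve_area a = { 0 };
	unsigned long reason = 0;

	assert(model_raise_ve(&a, 48) == DELIVER_VE);	/* first #VE */
	assert(model_inject_nmi(&a) == NMI_PENDING);	/* NMI blocked */
	assert(model_raise_ve(&a, 48) == DELIVER_DF);	/* nested too early */
	assert(model_get_veinfo(&a, &reason));		/* clears valid */
	assert(model_inject_nmi(&a) == NMI_INJECTED);	/* NMI now allowed */
	assert(model_raise_ve(&a, 10) == DELIVER_VE);	/* nested #VE is fine */
	return 0;
}
```

Calling model_selftest() from a main() exercises the whole sequence. The point
of the model is only that the window between #VE delivery and TDX_GET_VEINFO is
where a second #VE becomes a #DF and NMIs are held off.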

> Also I do not understand that the last sentence tries to tell me. If the
> suppression of #NMI and #MC is lifted on return from tdcall then both
> can be delivered immediately afterwards, right?

Yep, NMI can be injected on the instruction following the TDCALL.

Something like this?

	/*
	 * Retrieve the #VE info from the TDX module, which also clears the "#VE
	 * valid" flag.  This must be done before anything else as any #VE that
	 * occurs while the valid flag is set, i.e. before the previous #VE info
	 * was consumed, is morphed to a #DF by the TDX module.  Note, the TDX
	 * module also treats virtual NMIs as inhibited if the #VE valid flag is
	 * set, e.g. so that NMI=>#VE will not result in a #DF.
	 */

> I assume the additional #VE is triggered by software or a bug in the
> kernel.

I'm curious whether that will even hold true; there's sooo much stuff that can
happen from NMI context. I don't see much value in speculating about what
will/won't happen after retrieving the #VE info.