Re: [PATCH v11 05/20] x86/virt/tdx: Add SEAMCALL infrastructure

From: Huang, Kai
Date: Tue Jun 20 2023 - 06:37:27 EST


On Mon, 2023-06-19 at 14:52 +0200, David Hildenbrand wrote:
> On 04.06.23 16:27, Kai Huang wrote:
> > TDX introduces a new CPU mode: Secure Arbitration Mode (SEAM). This
> > mode runs only the TDX module itself or other code to load the TDX
> > module.
> >
> > The host kernel communicates with SEAM software via a new SEAMCALL
> > instruction. This is conceptually similar to a guest->host hypercall,
> > except it is made from the host to SEAM software instead. The TDX
> > module establishes a new SEAMCALL ABI which allows the host to
> > initialize the module and to manage VMs.
> >
> > Add infrastructure to make SEAMCALLs. The SEAMCALL ABI is very similar
> > to the TDCALL ABI and leverages much TDCALL infrastructure.
> >
> > SEAMCALL instruction causes #GP when TDX isn't BIOS enabled, and #UD
> > when CPU is not in VMX operation. Currently, only KVM code mocks with
> > VMX enabling, and KVM is the only user of TDX. This implementation
> > chooses to make KVM itself responsible for enabling VMX before using
> > TDX and let the rest of the kernel stay blissfully unaware of VMX.
> >
> > The current TDX_MODULE_CALL macro handles neither #GP nor #UD. The
> > kernel would hit Oops if SEAMCALL were mistakenly made w/o enabling VMX
> > first. Architecturally, there is no CPU flag to check whether the CPU
> > is in VMX operation. Also, if a BIOS were buggy, it could still report
> > valid TDX private KeyIDs when TDX actually couldn't be enabled.
> >
> > Extend the TDX_MODULE_CALL macro to handle #UD and #GP to return error
> > codes. Introduce two new TDX error codes for them respectively so the
> > caller can distinguish.
> >
> > Also add a wrapper function of SEAMCALL to convert SEAMCALL error code
> > to the kernel error code, and print out SEAMCALL error code to help the
> > user to understand what went wrong.
> >
> > Signed-off-by: Kai Huang <kai.huang@xxxxxxxxx>
> > ---
>
> I agree with Dave that a buggy bios is not a good motivation for this
> patch. The real strength of this infrastructure IMHO is central error
> handling and expressive error messages. Maybe it makes some corner cases
> (reboot -f) easier to handle. That would make a better justification
> than buggy bios -- and should be spelled out in the patch description.

Agreed. Will do. Thanks!

>
> [...]
>
>
> > +/*
> > + * Wrapper of __seamcall() to convert SEAMCALL leaf function error code
> > + * to kernel error code. @seamcall_ret and @out contain the SEAMCALL
> > + * leaf function return code and the additional output respectively if
> > + * not NULL.
> > + */
> > +static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> > + u64 *seamcall_ret,
> > + struct tdx_module_output *out)
> > +{
> > + int cpu, ret = 0;
> > + u64 sret;
> > +
> > + /* Need a stable CPU id for printing error message */
> > + cpu = get_cpu();
> > +
> > + sret = __seamcall(fn, rcx, rdx, r8, r9, out);
> > +
>
>
> Why not
>
> cpu = get_cpu();
> sret = __seamcall(fn, rcx, rdx, r8, r9, out);
> put_cpu();

Hmm.. I think this is also OK. The worst case is the error message will be
printed on remote cpu but the message still have the correct "cpu id" printed.

I'll change to above.

>
>
> > + /* Save SEAMCALL return code if the caller wants it */
> > + if (seamcall_ret)
> > + *seamcall_ret = sret;
> > +
> > + /* SEAMCALL was successful */
> > + if (!sret)
> > + goto out;
>
> Why not move that into the switch statement below to avoid th goto?
> If you do the put_cpu() early, you can avoid "ret" as well.

Yeah can do.

>
> switch (sret) {
> case 0:
> /* SEAMCALL was successful */
> return 0;
> case TDX_SEAMCALL_GP:
> pr_err_once("[firmware bug]: TDX is not enabled by BIOS.\n");
> return -ENODEV;
> ...
> }
>

[...]


> > + /*
> > + * SEAMCALL caused #GP or #UD. By reaching here %eax contains
> > + * the trap number. Convert the trap number to the TDX error
> > + * code by setting TDX_SW_ERROR to the high 32-bits of %rax.
> > + *
> > + * Note cannot OR TDX_SW_ERROR directly to %rax as OR instruction
> > + * only accepts 32-bit immediate at most.
>
> Not sure if that comment is really helpful here. It's a common pattern
> for large immediates, no?

I am not sure. I guess I am not expert of x86 assembly but only casual writer.

Hi Dave, Kirill,

Are you OK to remove it?

>
> > + */
> > + mov $TDX_SW_ERROR, %r12
> > + orq %r12, %rax
> >
> > + _ASM_EXTABLE_FAULT(1b, 2b)
> > +.Lseamcall_out:
> > .else
> > tdcall
> > .endif
>