Re: [PATCH v7 06/20] x86/virt/tdx: Shut down TDX module in case of error

From: Sean Christopherson
Date: Tue Nov 22 2022 - 14:32:51 EST


On Tue, Nov 22, 2022, Peter Zijlstra wrote:
> On Tue, Nov 22, 2022 at 04:06:25PM +0100, Thomas Gleixner wrote:
> > On Tue, Nov 22 2022 at 10:20, Peter Zijlstra wrote:
> >
> > > On Mon, Nov 21, 2022 at 01:26:28PM +1300, Kai Huang wrote:
> > >
> > >> Shutting down the TDX module requires calling TDH.SYS.LP.SHUTDOWN on all
> > >> BIOS-enabled CPUs, and the SEMACALL can run concurrently on different
> > >> CPUs. Implement a mechanism to run SEAMCALL concurrently on all online
> > >> CPUs and use it to shut down the module. Later logical-cpu scope module
> > >> initialization will use it too.
> > >
> > > Uhh, those requirements ^ are not met by this:
> >
> > Can run concurrently != Must run concurrently
> >
> > The documentation clearly says "can run concurrently" as quoted above.
>
> The next sentense says: "Implement a mechanism to run SEAMCALL
> concurrently" -- it does not.
>
> Anyway, since we're all in agreement there is no such requirement at
> all, a schedule_on_each_cpu() might be more appropriate, there is no
> reason to use IPIs and spin-waiting for any of this.

Backing up a bit, what's the reason for _any_ of this? The changelog says

It's pointless to leave the TDX module in some middle state.

but IMO it's just as pointless to do a shutdown unless the kernel benefits in
some meaningful way. And IIUC, TDH.SYS.LP.SHUTDOWN does nothing more than change
the SEAM VMCS.HOST_RIP to point to an error trampoline. E.g. it's not like doing
a shutdown lets the kernel reclaim memory that was gifted to the TDX module.

In other words, this is just a really expensive way of changing a function pointer,
and the only way it would ever benefit the kernel is if there is a kernel bug that
leads to trying to use TDX after a fatal error. And even then, the only difference
seems to be that subsequent bogus SEAMCALLs would get a more unique error message.