Re: [PATCH v7 04/20] x86/virt/tdx: Add skeleton to initialize TDX on demand

From: Thomas Gleixner
Date: Wed Nov 23 2022 - 06:06:25 EST


Kai!

On Wed, Nov 23 2022 at 00:30, Kai Huang wrote:
> On Tue, 2022-11-22 at 21:03 +0100, Thomas Gleixner wrote:
>> Clearly that's nowhere spelled out in the documentation, but I don't
>> buy the 'architecturaly required' argument not at all. It's an
>> implementation detail of the TDX module.
>
> I agree on hardware level there shouldn't be such requirement (not 100% sure
> though), but I guess from kernel's perspective, "the implementation detail of
> the TDX module" is sort of "architectural requirement"

Sure, but then it needs to be clearly documented so.

> -- at least Intel arch guys think so I guess.

Intel "arch" guys? You might look up the meanings of "arch" in a
dictionary. LKML is not twatter.

>> Technically there is IMO ZERO requirement to do so.
>>
>>  1) The TDX module is global
>>
>>  2) Seam-root and Seam-non-root operation are strictly a LP property.
>>
>>     The only architectural prerequisite for using Seam on a LP is that
>>     obviously the encryption/decryption mechanics have been initialized
>>     on the package to which the LP belongs.
>>
>> I can see why it might be complicated to add/remove an LP after
>> initialization fact, but technically it should be possible.
>
> "kernel soft offline" actually isn't an issue. We can bring down a logical cpu
> after it gets initialized and then bring it up again.

That's the whole point where this discussion started: _AFTER_ it gets
initialized.

Which means that, e.g. adding "nosmt" to the kernel command line will
make TDX fail hard because at the point where TDX is initialized the
hyperthreads are not 'soft' online and cannot respond to anything, but
the BIOS already accounted them.

This is just wrong as we all know that "nosmt" is sadly one of the
obvious counter measures for the never ending flood of speculation
issues.

> Only add/removal of physical cpu will cause problem: 

You wish.

> TDX MCHECK verifies all boot-time present cpus to make sure they are TDX-
> compatible before it enables TDX in hardware. MCHECK cannot run on hot-added
> CPU, so TDX cannot support physical CPU hotplug.

TDX can rightfully impose the limitation that it only executes on CPUs,
which are known at boot/initialization time, and only utilizes "known"
memory. That's it, but that does not enforce to prevent physical hotplug
in general.

> We tried to get it clarified in the specification, and below is what TDX/module
> arch guys agreed to put to the TDX module spec (just checked it's not in latest
> public spec yet, but they said it will be in next release):
>
> "
> 4.1.3.2. CPU Configuration
>
> During platform boot, MCHECK verifies all logical CPUs to ensure they
> meet TDX’s

That MCHECK falls into the category of security voodoo.

It needs to verify _ALL_ logical CPUs to ensure that Intel did not put
different models and steppings into a package or what?

> security and certain functionality requirements, and MCHECK passes the following
> CPU configuration information to the NP-SEAMLDR, P-SEAMLDR and the TDX Module:
>
> · Total number of logical processors in the platform.

You surely need MCHECK for this

> · Total number of installed packages in the platform.

and for this...

> · A table of per-package CPU family, model and stepping etc.
> identification, as enumerated by CPUID(1).EAX.
> The above information is static and does not change after platform boot and
> MCHECK run.
>
> Note: TDX doesn’t support adding or removing CPUs from TDX security
> perimeter, as checked my MCHECK.

More security voodoo. The TDX security perimeter has nothing to do with
adding or removing CPUs on a system. If that'd be true then TDX is a
complete fail.

> BIOS should prevent CPUs from being hot-added or hot-removed after
> platform boots.

If the BIOS does not prevent it, then TDX and the Seam module will not
even notice unless the OS tries to invoke seamcall() on a newly plugged
CPU.

A newly added CPU and newly added memory should not have any impact on
the TDX integrity of the already running system. If they have then
again, TDX is broken by design.

> The TDX module performs additional checks of the CPU’s configuration and
> supported features, by reading MSRs and CPUID information as described in the
> following sections.

to ensure that the MCHECK generated table is still valid at the point
where TDX is initialized?

That said, this documentation is at least better than the existing void,
but that does not make it technically correct.

Thanks,

tglx