Re: [PATCH v5 12/22] x86/virt/tdx: Convert all memory regions in memblock to TDX memory

From: Kai Huang
Date: Tue Aug 02 2022 - 21:30:21 EST


On Fri, 2022-07-08 at 11:34 +1200, Kai Huang wrote:
> > Why not just entirely remove the lower 1MB from the memblock structure
> > on TDX systems?  Do something equivalent to adding this on the kernel
> > command line:
> >
> >   memmap=1M$0x0
>
> I will explore this option.  Thanks!

Hi Dave,

After investigating and testing, we cannot simply remove first 1MB from e820
table which is similar to what 'memmap=1M$0x0' does, as the kernel needs low
memory as trampoline to bring up all APs.

Currently I am doing below:

--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -65,6 +65,17 @@ void __init reserve_real_mode(void)
* setup_arch().
*/
memblock_reserve(0, SZ_1M);
+
+ /*
+ * As one step of initializing the TDX module (on-demand), the
+ * kernel will later verify all memory regions in memblock are
+ * truly TDX-capable and convert all of them to TDX memory.
+ * The first 1MB may not be enumerated as TDX-capable memory.
+ * To avoid failure to verify, explicitly remove the first 1MB
+ * from memblock for a TDX (BIOS) enabled system.
+ */
+ if (platform_tdx_enabled())
+ memblock_remove(0, SZ_1M);

I tested an it worked (I didn't observe any problem), but am I missing
something?

Also, regarding to whether we can remove platform_tdx_enabled() at all, I looked
into the spec again and there's no MSR or CPUID from which we can check TDX is
enabled by BIOS -- except checking the SEAMRR_MASK MSR, which is basically
platform_tdx_enabled() also did.

Checking MSR_MTRRcap.SEAMRR bit isn't enough as it will be true as long as the
hardware supports SEAMRR, but it doesn't tell whether SEAMRR(TDX) is enabled by
BIOS.

So if above code is reasonable, I think we can still detect TDX during boot and
keep platform_tdx_enabled().  

It also detects TDX KeyIDs, which isn't necessary for removing the first 1MB
here (nor for kexec() support), but detecting TDX KeyIDs must be done anyway
either during kernel boot or during initializing TDX module.

Detecting TDX KeyID at boot time also has an advantage that in the future we can
expose KeyIDs via /sysfs and userspace can know how many TDs the machine can
support w/o having to initializing the TDX module first (we received such
requirement from customer but yes it is arguable).

Any comments?

--
Thanks,
-Kai