Re: [PATCH v12 19/22] x86/kexec(): Reset TDX private memory on platforms with TDX erratum

From: Nikolay Borisov
Date: Wed Jun 28 2023 - 05:29:08 EST




On 26.06.23 17:12, Kai Huang wrote:

<snip>



diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 85b24b2e9417..1107f4227568 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -51,6 +51,8 @@ static LIST_HEAD(tdx_memlist);
static struct tdmr_info_list tdx_tdmr_list;
+static atomic_t tdx_may_has_private_mem;
+
/*
* Wrapper of __seamcall() to convert SEAMCALL leaf function error code
* to kernel error code. @seamcall_ret and @out contain the SEAMCALL
@@ -1113,6 +1115,17 @@ static int init_tdx_module(void)
*/
wbinvd_on_all_cpus();
+ /*
+ * Starting from this point the system may have TDX private
+ * memory. Make it globally visible so tdx_reset_memory() only
+ * reads TDMRs/PAMTs when they are stable.
+ *
+ * Note using atomic_inc_return() to provide the explicit memory
+ * ordering isn't mandatory here as the WBINVD above already
+ * does that. Compiler barrier isn't needed here either.
+ */

If it's not needed, then why use it? Simply use atomic_inc() and rephrase the comment to state what the ordering guarantees are and how they are achieved (i.e., by the WBINVD above).
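To illustrate the distinction, here is a minimal userspace sketch in C11, not kernel code. It assumes that a relaxed fetch-add roughly models atomic_inc() (no return value, no implied ordering) and that a seq_cst fetch-add roughly models atomic_inc_return() (returns the new value, fully ordered). The names may_have_private_mem, flag_inc_relaxed(), flag_inc_return_ordered() and flag_read() are hypothetical stand-ins, not anything from the patch.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for tdx_may_has_private_mem. */
static atomic_int may_have_private_mem;

/*
 * Rough analogue of atomic_inc(): the counter is bumped atomically,
 * but the operation itself implies no memory ordering. Any ordering
 * has to come from elsewhere (in the patch, the preceding WBINVD).
 */
static void flag_inc_relaxed(void)
{
	atomic_fetch_add_explicit(&may_have_private_mem, 1,
				  memory_order_relaxed);
}

/*
 * Rough analogue of atomic_inc_return(): a value-returning RMW that
 * is also ordered. fetch_add returns the old value, hence the "+ 1".
 */
static int flag_inc_return_ordered(void)
{
	return atomic_fetch_add_explicit(&may_have_private_mem, 1,
					 memory_order_seq_cst) + 1;
}

/* Plain relaxed read of the counter, used only to inspect it. */
static int flag_read(void)
{
	return atomic_load_explicit(&may_have_private_mem,
				    memory_order_relaxed);
}
```

Both variants change the counter in the same way; they differ only in the ordering they promise. If the ordering already comes from somewhere else, the relaxed form plus a comment naming that source says the same thing more directly.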

+ atomic_inc_return(&tdx_may_has_private_mem);
+
/* Config the key of global KeyID on all packages */
ret = config_global_keyid();
if (ret)
@@ -1154,6 +1167,15 @@ static int init_tdx_module(void)
* as suggested by the TDX spec.
*/
tdmrs_reset_pamt_all(&tdx_tdmr_list);
+ /*
+ * No more TDX private pages now, and PAMTs/TDMRs are
+ * going to be freed. Make this globally visible so
+ * tdx_reset_memory() can read stable TDMRs/PAMTs.
+ *
+ * Note atomic_dec_return(), which is an atomic RMW with
+ * return value, always enforces the memory barrier.
+ */
+ atomic_dec_return(&tdx_may_has_private_mem);

Make the comment here refer to the comment at the increment site.

out_free_pamts:
tdmrs_free_pamt_all(&tdx_tdmr_list);
out_free_tdmrs:
@@ -1229,6 +1251,63 @@ int tdx_enable(void)
}
EXPORT_SYMBOL_GPL(tdx_enable);
+/*
+ * Convert TDX private pages back to normal on platforms with
+ * "partial write machine check" erratum.
+ *
+ * Called from machine_kexec() before booting to the new kernel.
+ */
+void tdx_reset_memory(void)
+{
+ if (!platform_tdx_enabled())
+ return;
+
+ /*
+ * Kernel read/write to TDX private memory doesn't
+ * cause machine check on hardware w/o this erratum.
+ */
+ if (!boot_cpu_has_bug(X86_BUG_TDX_PW_MCE))
+ return;
+
+ /* Called from kexec() when only rebooting cpu is alive */
+ WARN_ON_ONCE(num_online_cpus() != 1);
+
+ if (!atomic_read(&tdx_may_has_private_mem))
+ return;

I think a comment is warranted here explicitly calling out the ordering requirements/guarantees. Actually, this is a non-RMW operation, so it has no bearing on the ordering/implicit memory barriers achieved at the "increment" site.
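As a rough illustration of why the read side matters, here is a minimal userspace C11 sketch, not kernel code. It assumes a writer that publishes some data and then raises a flag, and a reader that must see the data once it sees the flag. The names may_have_private_mem, tdmr_count, publish() and snapshot() are hypothetical. The acquire load stands in for something like atomic_read_acquire(); a plain atomic_read() corresponds to a relaxed load and would give no such pairing.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the flag and the data it guards. */
static atomic_int may_have_private_mem;
static int tdmr_count;

/*
 * Writer: fill in the data first, then raise the flag. The release
 * ordering keeps the data store from being reordered after the flag
 * update.
 */
static void publish(int n)
{
	tdmr_count = n;
	atomic_fetch_add_explicit(&may_have_private_mem, 1,
				  memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release above, so a
 * non-zero flag guarantees the data store is visible. With a relaxed
 * load here, the ordering would have to be provided (and documented)
 * some other way.
 */
static bool snapshot(int *out)
{
	if (!atomic_load_explicit(&may_have_private_mem,
				  memory_order_acquire))
		return false;

	*out = tdmr_count;
	return true;
}
```

Whether the kexec path needs such a pairing at all depends on what else orders the accesses (for example, only the rebooting CPU being online), which is the kind of reasoning the requested comment would spell out.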

<snip>