Re: [PATCH v11 04/20] x86/cpu: Detect TDX partial write machine check erratum

From: kirill . shutemov
Date: Tue Jun 06 2023 - 08:38:38 EST


On Mon, Jun 05, 2023 at 02:27:17AM +1200, Kai Huang wrote:
> TDX memory has integrity and confidentiality protections. Violations of
> this integrity protection are supposed to only affect TDX operations and
> are never supposed to affect the host kernel itself. In other words,
> the host kernel should never, itself, see machine checks induced by the
> TDX integrity hardware.
>
> Alas, the first few generations of TDX hardware have an erratum. A
> "partial" write to a TDX private memory cacheline will silently "poison"
> the line. Subsequent reads will consume the poison and generate a
> machine check. According to the TDX hardware spec, neither of these
> things should have happened.
>
> Virtually all kernel memory access operations happen in full
> cachelines. In practice, writing a "byte" of memory usually reads a 64
> byte cacheline of memory, modifies it, then writes the whole line back.
> Those operations do not trigger this problem.
>
> This problem is triggered by "partial" writes where a write transaction
> of less than a cacheline lands at the memory controller. The CPU does
> these via non-temporal write instructions (like MOVNTI), or through
> UC/WC memory mappings. The issue can also be triggered away from the
> CPU by devices doing partial writes via DMA.
>
> With this erratum, additional things need to be done around the
> machine check handler, kexec(), etc. Similar to other CPU bugs, use
> a CPU bug bit to indicate this erratum, and detect this erratum during
> early boot. Note that this bug reflects the hardware, so it is detected
> regardless of whether the kernel is built with TDX support or not.
>
> Signed-off-by: Kai Huang <kai.huang@xxxxxxxxx>
> ---
>
> v10 -> v11:
> - New patch
>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/kernel/cpu/intel.c | 21 +++++++++++++++++++++
> 2 files changed, 22 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index cb8ca46213be..dc8701f8d88b 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -483,5 +483,6 @@
> #define X86_BUG_RETBLEED X86_BUG(27) /* CPU is affected by RETBleed */
> #define X86_BUG_EIBRS_PBRSB X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
> #define X86_BUG_SMT_RSB X86_BUG(29) /* CPU is vulnerable to Cross-Thread Return Address Predictions */
> +#define X86_BUG_TDX_PW_MCE X86_BUG(30) /* CPU may incur #MC if non-TD software does partial write to TDX private memory */
>
> #endif /* _ASM_X86_CPUFEATURES_H */
> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> index 1c4639588ff9..251b333e53d2 100644
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -1552,3 +1552,24 @@ u8 get_this_hybrid_cpu_type(void)
>
> return cpuid_eax(0x0000001a) >> X86_HYBRID_CPU_TYPE_ID_SHIFT;
> }
> +
> +/*
> + * These CPUs have an erratum. A partial write from non-TD
> + * software (e.g. via MOVNTI variants or UC/WC mapping) to TDX
> + * private memory poisons that memory, and a subsequent read of
> + * that memory triggers #MC.
> + */
> +static const struct x86_cpu_id tdx_pw_mce_cpu_ids[] __initconst = {
> +	X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, NULL),
> +	X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, NULL),
> +	{ }
> +};
> +
> +static int __init tdx_erratum_detect(void)
> +{
> +	if (x86_match_cpu(tdx_pw_mce_cpu_ids))
> +		setup_force_cpu_bug(X86_BUG_TDX_PW_MCE);
> +
> +	return 0;
> +}
> +early_initcall(tdx_erratum_detect);

Initcall? Don't we already have a codepath to call it directly?
Maybe cpu_set_bug_bits()?
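
To illustrate the direct-call idea, here is a minimal userspace sketch, not
kernel code. It only models the shape: the model check is a plain helper that
the existing bug-bit setup path calls, so no separate initcall is needed. All
names (struct cpu_info, has_tdx_pw_mce(), set_bug_bits(), BUG_TDX_PW_MCE) and
the model numbers are hypothetical stand-ins, assumed for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's X86_BUG(30) bit index. */
#define BUG_TDX_PW_MCE 30u

/* Hypothetical, simplified model of per-CPU identification data. */
struct cpu_info {
	unsigned int family;
	unsigned int model;
	uint64_t bug_mask;	/* one bit per known CPU bug */
};

/* Illustrative model numbers standing in for the affected server parts. */
static const unsigned int tdx_pw_mce_models[] = { 0x8f, 0xcf };

/* Return true if this (modelled) CPU is on the affected-model list. */
bool has_tdx_pw_mce(const struct cpu_info *c)
{
	size_t i;

	if (c->family != 6)
		return false;

	for (i = 0; i < sizeof(tdx_pw_mce_models) / sizeof(tdx_pw_mce_models[0]); i++) {
		if (tdx_pw_mce_models[i] == c->model)
			return true;
	}

	return false;
}

/*
 * Models an existing bug-bit setup path. The erratum check is called
 * directly from here instead of being registered as its own initcall.
 */
void set_bug_bits(struct cpu_info *c)
{
	/* ... checks for other bugs would sit here ... */

	if (has_tdx_pw_mce(c))
		c->bug_mask |= UINT64_C(1) << BUG_TDX_PW_MCE;
}
```

The point of the sketch is ordering: the bit is set at the same time as every
other bug bit, rather than at whatever point the initcall level happens to run.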

--
Kiryl Shutsemau / Kirill A. Shutemov