[PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

From: Paolo Bonzini
Date: Wed Jan 31 2024 - 18:26:44 EST


Supersedes: <20240130180400.1698136-1-pbonzini@xxxxxxxxxx>

MKTME repurposes the high bits of the physical address as the key ID of
the encryption key. Even though MAXPHYADDR in CPUID[0x80000008] remains
the same, the valid bits in the MTRR mask register are based on the
reduced number of physical address bits. This breaks boot on machines
that have TME enabled and go through MTRR cleanup, unless
"disable_mtrr_cleanup" is passed on the command line. The fix is to move
the detection of the TME keyid bits to early CPU initialization, which
runs before Linux sets up MTRRs.
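
As a rough illustration of the arithmetic described above (not the kernel
code), the sketch below computes which bits of a variable-range MTRR mask
are usable for a given address width. The helper names and the example
widths are hypothetical; it assumes the mask field starts at bit 12 and
that the width is between 13 and 63 bits.

```c
#include <stdint.h>

/*
 * Hypothetical helper: address width that remains once the keyid bits
 * are taken out of the MAXPHYADDR value reported by CPUID.
 */
static unsigned int effective_phys_bits(unsigned int maxphyaddr,
                                        unsigned int keyid_bits)
{
	return maxphyaddr - keyid_bits;
}

/*
 * Hypothetical helper: bits [phys_bits-1:12] set, everything else clear.
 * Assumes 12 < phys_bits < 64 so that the shift is well defined.
 */
static uint64_t mtrr_mask_valid_bits(unsigned int phys_bits)
{
	return ((UINT64_C(1) << phys_bits) - 1) & ~UINT64_C(0xfff);
}
```

With, say, a reported width of 52 bits and 6 keyid bits (illustrative
numbers only), a mask built from the unreduced width contains bits that a
mask built from the reduced width does not, which is the mismatch
described above.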

However, as noticed by Kirill, the patch I sent as v1 actually works only
up to Linux 6.6. In Linux 6.7, commit fbf6449f84bf ("x86/sev-es: Set
x86_virt_bits to the correct value straight away, instead of a two-phase
approach") reorganized the initialization of c->x86_phys_bits in a way
that broke the patch. But even in 6.7 AMD processors, which did try to
reduce it in this_cpu->c_early_init(c), had their x86_phys_bits value
overwritten by get_cpu_address_sizes(), so that early_identify_cpu()
left the wrong value in x86_phys_bits. This probably went unnoticed
because on AMD processors you need not apply the reduced MAXPHYADDR to
MTRR masks.
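
The ordering problem can be pictured with a toy model. This is only a
sketch: the struct and the toy_* functions are hypothetical stand-ins for
c->x86_phys_bits, get_cpu_address_sizes() and a vendor c_early_init()
hook, and they do not reproduce the kernel's actual logic.

```c
/* Toy stand-in for the x86_phys_bits field of struct cpuinfo_x86. */
struct toy_cpu {
	unsigned int phys_bits;
};

/* Stand-in for get_cpu_address_sizes(): stores the CPUID width unconditionally. */
static void toy_get_address_sizes(struct toy_cpu *c, unsigned int cpuid_bits)
{
	c->phys_bits = cpuid_bits;
}

/* Stand-in for a vendor early-init hook that reduces the width. */
static void toy_early_init(struct toy_cpu *c, unsigned int reduction)
{
	c->phys_bits -= reduction;
}

/* Reduce first, then read CPUID: the reduction is overwritten. */
static unsigned int toy_reduce_then_read(unsigned int cpuid_bits,
                                         unsigned int reduction)
{
	struct toy_cpu c = { .phys_bits = cpuid_bits };

	toy_early_init(&c, reduction);
	toy_get_address_sizes(&c, cpuid_bits);
	return c.phys_bits;
}

/* Read CPUID first, then reduce: the reduction survives. */
static unsigned int toy_read_then_reduce(unsigned int cpuid_bits,
                                         unsigned int reduction)
{
	struct toy_cpu c = { .phys_bits = 0 };

	toy_get_address_sizes(&c, cpuid_bits);
	toy_early_init(&c, reduction);
	return c.phys_bits;
}
```

In this model, the first ordering always ends with the unreduced CPUID
width, while the second keeps whatever reduction the early-init hook
applied.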

Therefore, this v2 prepends a fix for the issue introduced by commit fbf6449f84bf.
Apologies for the oversight.

Tested on an AMD Epyc machine (where I resorted to dumping mtrr_state) and
on the problematic Intel Emerald Rapids machine.

Thanks,

Paolo

Paolo Bonzini (2):
  x86/cpu: allow reducing x86_phys_bits during early_identify_cpu()
  x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers

arch/x86/kernel/cpu/common.c | 4 +-
arch/x86/kernel/cpu/intel.c | 178 ++++++++++++++++++-----------------
2 files changed, 93 insertions(+), 89 deletions(-)

--
2.43.0