[patch 00/30] x86/microcode: Cleanup and late loading enhancements

From: Thomas Gleixner
Date: Thu Aug 10 2023 - 14:37:32 EST


Hi!

Late microcode loading is desired by enterprise users. Late loading is
problematic as it requires detailed knowledge about the change and an
analysis of whether this change modifies something which is already in use
by the kernel. Large enterprise customers have engineering teams and access
to deep technical vendor support. The regular admin does not have such
resources, so the kernel has always been tainted after a late load.

Intel recently repurposed a previously reserved field in the microcode
header. It now contains the minimal microcode revision which must be
running on the CPU to make the load safe. This field is 0 in all older
microcode revisions, which the kernel assumes to be unsafe. Minimal
revision checking can be enforced via Kconfig or the kernel command line.
When enforced, the kernel refuses to load an unsafe revision. By default,
unsafe revisions are loaded as before and the kernel is tainted. If a safe
revision is loaded, the kernel is not tainted.
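
To make the policy above concrete, here is a minimal sketch of the decision
in plain C. It is not the code from this series: struct ucode_hdr,
late_load_verdict() and the enforce_minrev flag are hypothetical names
chosen for illustration, and the sketch assumes that "safe" means the
header carries a non-zero minimal revision and the running revision is at
least that value. Other checks a real loader needs are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the relevant header fields. */
struct ucode_hdr {
	uint32_t rev;		/* revision of the image to be loaded */
	uint32_t min_rev;	/* minimal running revision, 0 in older images */
};

enum load_verdict {
	LOAD_SAFE,		/* load, kernel is not tainted */
	LOAD_UNSAFE_TAINT,	/* load as before, kernel is tainted */
	LOAD_REFUSED,		/* enforcement enabled, image is not loaded */
};

/* Assumption: a zero min_rev always counts as unsafe. */
static bool load_is_safe(const struct ucode_hdr *hdr, uint32_t running_rev)
{
	return hdr->min_rev != 0 && running_rev >= hdr->min_rev;
}

/* enforce_minrev stands in for the Kconfig / command line switch. */
enum load_verdict late_load_verdict(const struct ucode_hdr *hdr,
				    uint32_t running_rev,
				    bool enforce_minrev)
{
	if (load_is_safe(hdr, running_rev))
		return LOAD_SAFE;

	return enforce_minrev ? LOAD_REFUSED : LOAD_UNSAFE_TAINT;
}

/* Usage example: an old image without the field, default policy. */
void late_load_example(void)
{
	struct ucode_hdr old = { .rev = 0x30, .min_rev = 0 };

	assert(late_load_verdict(&old, 0x2f, false) == LOAD_UNSAFE_TAINT);
}
```

With enforcement off the sketch never refuses a load, which matches the
default behaviour described above. Only the taint decision changes.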

But that does not solve the other known problems with late loading:

- Late loading on current Intel CPUs is unsafe vs. NMI when
  hyperthreading is enabled. If an NMI hits the secondary sibling while
  the primary loads the microcode, the machine can crash.

- Soft offline SMT siblings which are playing dead with MWAIT can cause
  damage too when the microcode update modifies MWAIT. That's a
  realistic scenario in the context of 'nosmt' mitigations. :(

Neither the core code nor the Intel specific code handles any of this at all.

While trying to implement handling for these cases, I stumbled over
dysfunctional, horribly complex and redundant code, which I decided to
clean up first so the new functionality can be added on a clean slate.

So the series has several sections:

1) Cleanup of the core code, header files and Kconfig

2) Cleanup of the Intel specific code

3) Implementation of proper core control logic to handle the NMI safe
   requirements

4) Support for the minimal revision check in the core and the Intel
   specific parts

Thanks to Borislav for discussing this with me and helping out with
testing. Thanks also to Ashok, who contributed a few patches and helped
with testing on the Intel side, especially with the new minimal revision
mechanism.

The series applies on:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/microcode

and is also available from git:

git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git ucode-v1

Thanks,

tglx
---
arch/x86/include/asm/microcode_amd.h | 56 -
arch/x86/include/asm/microcode_intel.h | 88 --
b/Documentation/admin-guide/kernel-parameters.txt | 5
b/arch/x86/Kconfig | 63 -
b/arch/x86/include/asm/apic.h | 5
b/arch/x86/include/asm/microcode.h | 162 +---
b/arch/x86/kernel/apic/apic_flat_64.c | 2
b/arch/x86/kernel/apic/ipi.c | 9
b/arch/x86/kernel/apic/x2apic_cluster.c | 1
b/arch/x86/kernel/apic/x2apic_phys.c | 1
b/arch/x86/kernel/cpu/common.c | 1
b/arch/x86/kernel/cpu/intel.c | 176 ----
b/arch/x86/kernel/cpu/microcode/Makefile | 4
b/arch/x86/kernel/cpu/microcode/amd.c | 25
b/arch/x86/kernel/cpu/microcode/core.c | 518 +++++++++++---
b/arch/x86/kernel/cpu/microcode/intel.c | 807 ++++++++++------------
b/arch/x86/kernel/cpu/microcode/internal.h | 190 +++++
b/arch/x86/kernel/nmi.c | 9
b/arch/x86/mm/init.c | 1
b/drivers/platform/x86/intel/ifs/load.c | 4
20 files changed, 1109 insertions(+), 1018 deletions(-)