Re: unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore])

From: Thomas Gleixner
Date: Tue Mar 05 2024 - 05:14:16 EST


On Mon, Mar 04 2024 at 21:12, Borislav Petkov wrote:
> On Mon, Mar 04, 2024 at 02:22:50PM -0500, Liang, Kan wrote:
>> The number of available CBOXs on a SNBEP machine is determined at boot
>> time. It should not be larger than the maximum number of cores.
>> The recent commit 89b0f15f408f ("x86/cpu/topology: Get rid of
>> cpuinfo::x86_max_cores") change the boot_cpu_data.x86_max_cores to
>> topology_num_cores_per_package().
>> I guess the new function probably returns a different maximum number of
>> cores on the machine. But I don't have a SNBEP on my hands. Could you
>> please help to check whether a different maximum number of cores is
>> returned?

It's a MADT problem. MADT advertises 4 present cores and 12 hotpluggable
cores on the same package. At least that's what the topology evaluation
figures out based on MADT and presumably CPUID leaf 0xb.

CPU topo: Allowing 8 present CPUs plus 24 hotplug CPUs

/sys/kernel/debug/x86/topo/domains which is based on leaf 0xb has:

domain: Thread shift: 1 dom_size: 2 max_threads: 2
domain: Core shift: 5 dom_size: 16 max_threads: 32
domain: Module shift: 5 dom_size: 1 max_threads: 32
domain: Tile shift: 5 dom_size: 1 max_threads: 32
domain: Die shift: 5 dom_size: 1 max_threads: 32
domain: DieGrp shift: 5 dom_size: 1 max_threads: 32
domain: Package shift: 5 dom_size: 1 max_threads: 32

So the algorithm is correct and works as designed :)

How to handle that becomes interesting.

The above situation decribed by leaf 0xb and MADT can happen on
virtualization, i.e. the only actual user of "physical" CPU hotplug,
because virtualization people do not necessarily care about package
topology much. It works so it must be correct :)

It seems that none of the consumers of topology_num_cores_per_package()
can actually be used on virt, so a reasonable restriction is to reject
non-present CPUs on bare metal. Something like the below.

Thanks

tglx
---
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -24,6 +24,7 @@
#include <linux/pgtable.h>

#include <asm/e820/api.h>
+#include <asm/hypervisor.h>
#include <asm/irqdomain.h>
#include <asm/pci_x86.h>
#include <asm/io_apic.h>
@@ -169,11 +170,19 @@ static bool __init acpi_is_processor_usa
if (lapic_flags & ACPI_MADT_ENABLED)
return true;

- if (!acpi_support_online_capable ||
- (lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
+ if (lapic_flags & ACPI_MADT_ONLINE_CAPABLE)
return true;

- return false;
+ if (acpi_support_online_capable)
+ return false;
+
+ /* Physical hotplug on bare metal is not supported */
+ if (hypervisor_is_type(X86_HYPER_NATIVE)) {
+ pr_info_once("Ignoring non-present APIC ID on bare metal\n");
+ return false;
+ }
+
+ return true;
}

static int __init