Re: [patch 37/53] x86/cpu: Detect real BSP on crash kernels

From: Zhang, Rui
Date: Mon Jan 08 2024 - 09:11:41 EST


> +static __init void check_for_kdump_kernel(void)
> +{
> +       u32 bsp_apicid;
> +
> +       /*
> +        * There is no real good way to detect whether this a kdump()
> +        * kernel, but except on the Voyager SMP monstrosity which is not
> +        * longer supported, the real BSP has always the lowest numbered
> +        * APIC ID. If a crash happened on an AP, which then ends up as
> +        * boot CPU in the kdump() kernel, then sending INIT to the real
> +        * BSP would reset the whole system.
> +        */


Hi, Thomas,

Unfortunately, this causes a regression on the Intel Meteor Lake platform,
where the BSP's APIC ID is NOT the lowest-numbered APIC ID (instead,
CPU12, the first E-core CPU, has APIC ID 0).

As a result, the system fails to bring up CPU12. (I have not bisected
this; I suspect this patch from reading the code.)

log with 6.7-rc vanilla kernel,
[ 0.335133] smp: Bringing up secondary CPUs ...
[ 0.335133] smpboot: x86: Booting SMP configuration:
[ 0.335133] .... node #0, CPUs: #1 #3 #6 #8 #10 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21
[ 0.010435] core: cpu_atom PMU driver: PEBS-via-PT
[ 0.010435] ... version: 5
[ 0.010435] ... bit width: 48
[ 0.010435] ... generic registers: 8
[ 0.010435] ... value mask: 0000ffffffffffff
[ 0.010435] ... max period: 00007fffffffffff
[ 0.010435] ... fixed-purpose events: 3
[ 0.010435] ... event mask: 00000007000000ff
[ 0.339203] #2 #4 #5 #7 #9 #11
[ 0.343208] smp: Brought up 1 node, 22 CPUs

log with 6.5-rc4 kernel + your patch series,
[ 2.208960] smpboot: x86: Booting SMP configuration:
[ 2.209869] .... node #0, CPUs: #1 #3 #6 #8 #10 #13 #14 #15 #16 #17 #18 #19 #20 #21
[ 1.796167] core: cpu_atom PMU driver: PEBS-via-PT
[ 1.796167] ... version: 5
[ 1.796167] ... bit width: 48
[ 1.796167] ... generic registers: 8
[ 1.796167] ... value mask: 0000ffffffffffff
[ 1.796167] ... max period: 00007fffffffffff
[ 1.796167] ... fixed-purpose events: 3
[ 1.796167] ... event mask: 00000007000000ff
[ 2.260958] #2 #4 #5 #7 #9 #11
[ 2.263906] smp: Brought up 1 node, 21 CPUs


thanks,
rui

# cpuid -l 0x1f -s 0 | grep x2APIC
x2APIC ID of logical processor = 0x20 (32)
x2APIC ID of logical processor = 0x10 (16)
x2APIC ID of logical processor = 0x11 (17)
x2APIC ID of logical processor = 0x18 (24)
x2APIC ID of logical processor = 0x19 (25)
x2APIC ID of logical processor = 0x21 (33)
x2APIC ID of logical processor = 0x28 (40)
x2APIC ID of logical processor = 0x29 (41)
x2APIC ID of logical processor = 0x30 (48)
x2APIC ID of logical processor = 0x31 (49)
x2APIC ID of logical processor = 0x38 (56)
x2APIC ID of logical processor = 0x39 (57)
x2APIC ID of logical processor = 0x0 (0)
x2APIC ID of logical processor = 0x2 (2)
x2APIC ID of logical processor = 0x4 (4)
x2APIC ID of logical processor = 0x6 (6)
x2APIC ID of logical processor = 0x8 (8)
x2APIC ID of logical processor = 0xa (10)
x2APIC ID of logical processor = 0xc (12)
x2APIC ID of logical processor = 0xe (14)
x2APIC ID of logical processor = 0x40 (64)
x2APIC ID of logical processor = 0x42 (66)
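
To make the mismatch concrete, here is a small userspace sketch (not kernel code, and not the logic of the patch itself). It assumes the cpuid output above is listed in logical CPU order, so the first entry is the boot CPU. The table name and both helper functions are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * x2APIC IDs copied from the cpuid output above.
 * Assumption: entries are in logical CPU order, so index 0 is the boot CPU.
 */
static const uint32_t mtl_apicids[] = {
	0x20, 0x10, 0x11, 0x18, 0x19, 0x21, 0x28, 0x29, 0x30, 0x31, 0x38, 0x39,
	0x00, 0x02, 0x04, 0x06, 0x08, 0x0a, 0x0c, 0x0e, 0x40, 0x42,
};
#define MTL_NR_CPUS (sizeof(mtl_apicids) / sizeof(mtl_apicids[0]))

/* Hypothetical helper: index of the CPU that holds the lowest APIC ID. */
static size_t lowest_apicid_cpu(const uint32_t *ids, size_t n)
{
	size_t i, min = 0;

	for (i = 1; i < n; i++) {
		if (ids[i] < ids[min])
			min = i;
	}
	return min;
}

/*
 * Hypothetical helper: checks the assumption quoted in the patch comment,
 * i.e. that the boot CPU (index 0 here) has the lowest APIC ID.
 */
static int boot_cpu_has_lowest_apicid(const uint32_t *ids, size_t n)
{
	return lowest_apicid_cpu(ids, n) == 0;
}
```

With this table, lowest_apicid_cpu() returns 12 and boot_cpu_has_lowest_apicid() returns 0: the boot CPU has APIC ID 0x20, while APIC ID 0 belongs to CPU12, the CPU that is missing from the second log.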