Re: [patch 37/53] x86/cpu: Detect real BSP on crash kernels

From: Zhang, Rui
Date: Sat Jan 13 2024 - 02:35:56 EST


On Fri, 2024-01-12 at 16:39 +0100, Thomas Gleixner wrote:
> On Fri, Jan 12 2024 at 09:14, Zhang, Rui wrote:
> > On Wed, 2024-01-10 at 16:14 +0100, Thomas Gleixner wrote:
> > > On Wed, Jan 10 2024 at 15:19, Thomas Gleixner wrote:
> > > > > This is the order in MADT,
> > > > > $ cat apic.dsl  | grep x2Apic
> > > > > [030h 0048   4]          Processor x2Apic ID : 00000010
> > > > > [040h 0064   4]          Processor x2Apic ID : 00000011
> > > ...
> > > > > and this is the order in Linux (from CPU0 to CPUN)
> > > > >       x2APIC ID of logical processor = 0x20 (32)
> > > > >       x2APIC ID of logical processor = 0x10 (16)
> > > >
> > > > What a mess...
> > >
> > > And clearly not according to the spec
> > >
> > >   "The second is that platform firmware should list the boot processor
> > >    as the first processor entry in the MADT."
> > >
> > > Oh well. There are reasons why this is written the way it is.
> >
> > This is indeed a violation of the ACPI spec and we should modify the
> > order in MADT. But this doesn't bring any actual effect as Linux
> > already handles this, right?
>
> It brings the effect that we can detect when we are not booting (kexec
> case) on the actual boot CPU because then the first enumerated APIC ID
> is not the same as the boot CPU APIC ID. No?

Right.
I was thinking along the lines of what this patch series does, which
just compares the boot CPU APIC ID with the lowest-numbered APIC ID.
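
To make the difference between the two checks concrete, here is a rough
userspace sketch. It is not the code from the series: the helper names
are made up for illustration, and the APIC ID lists used with it below
are abbreviated examples, not the full MADT of this machine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the boot CPU's APIC ID matches the first
 * APIC ID enumerated in the MADT (the ordering the ACPI spec asks for).
 */
static bool boots_on_first_enumerated(const uint32_t *madt_ids, size_t n,
				      uint32_t boot_id)
{
	return n > 0 && madt_ids[0] == boot_id;
}

/*
 * Hypothetical helper: true if the boot CPU's APIC ID is the numerically
 * lowest APIC ID found in the MADT, regardless of enumeration order.
 */
static bool boots_on_lowest_id(const uint32_t *madt_ids, size_t n,
			       uint32_t boot_id)
{
	uint32_t lowest;
	size_t i;

	if (n == 0)
		return false;

	lowest = madt_ids[0];
	for (i = 1; i < n; i++) {
		if (madt_ids[i] < lowest)
			lowest = madt_ids[i];
	}
	return lowest == boot_id;
}
```

With an abbreviated list in the firmware's order {0x10, 0x11, 0x20} and
boot APIC ID 0x20, both helpers return false, so a normal boot would be
misread as "not on the BSP" either way. With a spec-conforming order
{0x20, 0x10, 0x11}, the first-enumerated check returns true while the
lowest-ID check still returns false.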

>
> > For the BSP APIC ID 0x20, I didn't find out a specific reason why we
> > have to do it in that way, but it is still legal.
>
> Linux does not really care in which order the APICs are enumerated.
>
> > We may need to figure out another way to distinguish the kdump kernel.
>
> Having the first enumerated APIC in the MADT as the actual boot CPU is a
> sensible and functional way. Everything else including the silly kexec
> boot parameter is error prone.
>
> I agree that MADT is error prone too given the fact that not even Intel
> can get it right....

For this MTL, I can raise an internal ticket to get it right.

Are there many platforms where the BSP is not listed as the first entry
in the MADT?
If so, do we still have to live with the kexec boot parameter? :)

thanks,
rui