Re: [patch 00/29] x86/cpu: Rework the topology evaluation

From: Thomas Gleixner
Date: Wed Jul 26 2023 - 18:38:30 EST


On Wed, Jul 26 2023 at 15:15, Sohil Mehta wrote:
> On 7/24/2023 10:43 AM, Thomas Gleixner wrote:
>> The series is based on the APIC cleanup series:
>>
>> https://lore.kernel.org/lkml/20230724131206.500814398@xxxxxxxxxxxxx
>>
>> and also available on top of that from git:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v1
>>
>
> The series boots fine on an old Sandy Bridge 2S system. There is a print
> from topology_update_die_map() which is missing from dmesg but it seems
> mostly harmless.
>
>> [ 0.085686] smpboot: x86: Booting SMP configuration:
>> [ 0.085690] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9
>> [ 0.089253] .... node #1, CPUs: #10 #11 #12 #13 #14 #15 #16 #17 #18 #19
>> [ 0.000000] smpboot: CPU 10 Converting physical 0 to logical die 1
>
> ^^ The "Converting physical..." line doesn't show up with the patches
> applied.

That message comes from the complete nonsense in the current upstream
code that cpuinfo::die_id is made relative to the package. That's just
wrong. My rework uses the physical die id which is unique by definition
and therefore does not need this conversion. The logical ID is the same
as the physical id in that case.
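
A toy model of the difference, as a sketch only. This is not the kernel
code; every name and size below is hypothetical. With package-relative
die IDs a lookup is needed to get a system-wide logical die number. With
a die ID that is already unique, the mapping is the identity.

```c
/* Toy model only: names and sizes are hypothetical, not kernel code. */
#define TOY_MAX_PKGS		4
#define TOY_MAX_DIES_PER_PKG	4

/*
 * Scheme A: die IDs are relative to the package, so each (pkg, die)
 * pair has to be translated into a system-wide logical die number.
 * Entries hold logical_id + 1; 0 means "not assigned yet".
 */
static unsigned int toy_die_map[TOY_MAX_PKGS][TOY_MAX_DIES_PER_PKG];
static unsigned int toy_next_logical_die;

/* Returns the logical die number, or -1 if the input is out of range. */
int toy_relative_to_logical_die(unsigned int pkg, unsigned int rel_die)
{
	if (pkg >= TOY_MAX_PKGS || rel_die >= TOY_MAX_DIES_PER_PKG)
		return -1;
	if (!toy_die_map[pkg][rel_die])
		toy_die_map[pkg][rel_die] = ++toy_next_logical_die;
	return (int)toy_die_map[pkg][rel_die] - 1;
}

/* Scheme B: the die ID is already unique, so no table is needed. */
int toy_unique_to_logical_die(unsigned int unique_die)
{
	return (int)unique_die;
}
```

In this model, two single-die packages give logical die 1 for package 1,
relative die 0, which is the shape of the "Converting physical 0 to
logical die 1" message quoted above.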

>> [ 0.134035] .... node #0, CPUs: #20 #21 #22 #23 #24 #25 #26 #27 #28 #29
>> [ 0.140239] .... node #1, CPUs: #30 #31 #32 #33 #34 #35 #36 #37 #38 #39
>
> Please let me know if you need more information.
>
> Tested-by: Sohil Mehta <sohil.mehta@xxxxxxxxx>

There is a real and unrelated snafu vs. that logical package and logical
die management which I discovered today. I missed the fact that this
cruft abuses cpuinfo as permanent storage, which breaks CPU
offline/online as the online operation reinitializes the topology
information.
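
The failing code path is not shown in this thread, so the following is
only one reading of the problem, as a toy model with hypothetical names.
A value that is assigned once at boot and kept only in the per-CPU info
record does not survive a step that rebuilds that record from scratch.

```c
#include <string.h>

/* Toy model only: hypothetical names, not kernel code. */
struct toy_cpuinfo {
	unsigned int	phys_pkg_id;
	int		logical_pkg_id;	/* -1 means "not set" */
};

#define TOY_NR_CPUS	2
static struct toy_cpuinfo toy_cpu[TOY_NR_CPUS];

/*
 * Stands in for the topology re-evaluation on CPU online: the record
 * is rebuilt from scratch, so anything cached in it is lost.
 */
void toy_reinit_cpuinfo(unsigned int cpu, unsigned int phys_pkg)
{
	memset(&toy_cpu[cpu], 0, sizeof(toy_cpu[cpu]));
	toy_cpu[cpu].phys_pkg_id = phys_pkg;
	toy_cpu[cpu].logical_pkg_id = -1;
}

/* Stands in for the one-time logical ID assignment done at boot. */
void toy_assign_logical_pkg(unsigned int cpu, int logical_pkg)
{
	toy_cpu[cpu].logical_pkg_id = logical_pkg;
}

int toy_logical_pkg(unsigned int cpu)
{
	return toy_cpu[cpu].logical_pkg_id;
}
```

Calling toy_reinit_cpuinfo() a second time for the same CPU, as an
offline/online cycle would in this model, leaves logical_pkg_id at -1
unless the assignment is repeated or kept somewhere outside the record.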

I pushed out a fixed version to

git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/topology

earlier. It doesn't have a tag yet.

Thanks,

tglx