Re: [patch 00/53] x86/topology: The final installment

From: Thomas Gleixner
Date: Tue Aug 08 2023 - 17:00:10 EST


On Tue, Aug 08 2023 at 13:30, Sohil Mehta wrote:
> On 8/8/2023 12:10 PM, Thomas Gleixner wrote:

> domain: Thread shift: 1 dom_size: 2 max_threads: 2
> domain: Core shift: 5 dom_size: 16 max_threads: 32
> domain: Module shift: 5 dom_size: 1 max_threads: 32
> domain: Tile shift: 5 dom_size: 1 max_threads: 32
> domain: Die shift: 5 dom_size: 1 max_threads: 32
> domain: Package shift: 5 dom_size: 1 max_threads: 32
>
> CPU 0:
> 0x0000000b 0x00: eax=0x00000001 ebx=0x00000002 ecx=0x00000100 edx=0x00000000
> 0x0000000b 0x01: eax=0x00000005 ebx=0x00000014 ecx=0x00000201 edx=0x00000000

Ok. So this is consistent.
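
For reference, a minimal sketch of how the two leaf 0xb subleaves quoted above decode. It assumes the field layout documented in the Intel SDM (EAX[4:0] = shift, EBX[15:0] = logical processors at this level, ECX[7:0] = level number, ECX[15:8] = level type). The helper name decode_leaf_0xb is made up for illustration and is not the kernel's parser:

```python
# Hypothetical helper, not kernel code: decode one CPUID leaf 0xb subleaf.
LEVEL_TYPES = {1: "Thread", 2: "Core"}  # level type 1 = SMT, 2 = Core

def decode_leaf_0xb(eax, ebx, ecx):
    return {
        "shift": eax & 0x1f,          # bits to shift the APIC ID past this level
        "num_logical": ebx & 0xffff,  # logical processors at this level
        "level": ecx & 0xff,          # level number (echoes the subleaf)
        "type": LEVEL_TYPES.get((ecx >> 8) & 0xff, "Unknown"),
    }

# Register values copied from the CPU 0 dump above
smt = decode_leaf_0xb(0x00000001, 0x00000002, 0x00000100)
core = decode_leaf_0xb(0x00000005, 0x00000014, 0x00000201)

# Sizes implied by the shifts, matching the "domain:" lines above
threads_per_core = 1 << smt["shift"]                    # Thread dom_size
cores_per_pkg = 1 << (core["shift"] - smt["shift"])     # Core dom_size
max_threads_per_pkg = 1 << core["shift"]                # Core max_threads
```

With these inputs the shifts come out as 1 and 5, which gives the dom_size values 2 and 16 and the 32 threads per package shown in the domain lines. The 0x14 in EBX is the 20 logical processors per package, which matches the 10 primary plus 10 secondary registrations per package further down.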

> Also, I see a warning message that only seems to show up with the final
> installment series applied. I attached the complete dmesg as well (just
> in case):
>
> unchecked MSR access error: WRMSR to 0xe44 (tried to write
> 0x0000000000000003) at rIP: 0xffffffff8d2a6698 (native_write_msr+0x8/0x30)
> uncore_box_ref.part.0+0xa6/0xe0
> uncore_event_cpu_online+0x6e/0x1c0
> ? __pfx_uncore_event_cpu_online+0x10/0x10
> cpuhp_invoke_callback+0x165/0x4b0

That's probably a consequence of the inconsistency.

> [ 0.187210] CPU topo: Register 000 1
> [ 0.187211] CPU topo: Register 002 1
> [ 0.187212] CPU topo: Register 004 1
> [ 0.187213] CPU topo: Register 006 1
> [ 0.187214] CPU topo: Register 008 1
> [ 0.187215] CPU topo: Register 010 1
> [ 0.187216] CPU topo: Register 012 1
> [ 0.187217] CPU topo: Register 014 1
> [ 0.187218] CPU topo: Register 016 1
> [ 0.187219] CPU topo: Register 018 1

The first package (primary threads)

> [ 0.187219] CPU topo: Register 020 1
> [ 0.187220] CPU topo: Register 022 1
> [ 0.187221] CPU topo: Register 024 1
> [ 0.187222] CPU topo: Register 026 1
> [ 0.187223] CPU topo: Register 028 1
> [ 0.187223] CPU topo: Register 030 1
> [ 0.187224] CPU topo: Register 032 1
> [ 0.187225] CPU topo: Register 034 1
> [ 0.187226] CPU topo: Register 036 1
> [ 0.187227] CPU topo: Register 038 1

The second package (primary threads)

> [ 0.187228] CPU topo: Register 001 1
> [ 0.187228] CPU topo: Register 003 1
> [ 0.187229] CPU topo: Register 005 1
> [ 0.187230] CPU topo: Register 007 1
> [ 0.187230] CPU topo: Register 009 1
> [ 0.187231] CPU topo: Register 011 1
> [ 0.187232] CPU topo: Register 013 1
> [ 0.187233] CPU topo: Register 015 1
> [ 0.187233] CPU topo: Register 017 1
> [ 0.187234] CPU topo: Register 019 1

The first package (secondary threads)

> [ 0.187235] CPU topo: Register 021 1
> [ 0.187235] CPU topo: Register 023 1
> [ 0.187236] CPU topo: Register 025 1
> [ 0.187237] CPU topo: Register 027 1
> [ 0.187238] CPU topo: Register 029 1
> [ 0.187238] CPU topo: Register 031 1
> [ 0.187239] CPU topo: Register 033 1
> [ 0.187240] CPU topo: Register 035 1
> [ 0.187241] CPU topo: Register 037 1
> [ 0.187241] CPU topo: Register 039 1

The second package (secondary threads)

> [ 0.187244] CPU topo: Register 000 0
> [ 0.187244] CPU topo: Register 001 0

... PKG 0

> [ 0.187266] CPU topo: Register 01e 0
> [ 0.187267] CPU topo: Register 01f 0

Ah. That's indeed the issue which the ACPI patch addresses. So that
table claims that the packages are truly filled up to capacity, i.e. 32
threads. The old code did not notice because they are all marked
non-present, but with the new approach these are rightfully accounted as
pluggable and show up in the bitmaps accordingly. Sigh...

> [ 0.187268] CPU topo: Register 020 0
... PKG 1
> [ 0.187291] CPU topo: Register 03f 0

> [ 0.187292] CPU topo: Register 040 0
... PKG 2
> [ 0.187304] CPU topo: Register 05f 0

> [ 0.187305] CPU topo: Register 060 0
... PKG 3
> [ 0.187335] CPU topo: Register 077 0

This one is funny as it stops at 0x77, i.e. 8 CPUs short of the full
range.

So this:

> [ 0.187412] CPU topo: Max. logical packages: 4

_IS_ correct according to the above.
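
The arithmetic behind that can be sketched as follows. This is only an illustration, not the kernel's accounting code: it uses just the APIC IDs that appear in the quoted register lines, takes the package shift of 5 from the domain output, and assumes the second column of each "Register" line is the present flag. The name package_of is hypothetical:

```python
PKG_SHIFT = 5  # "Package shift: 5" from the domain output quoted earlier

def package_of(apic_id):
    # Everything above the package shift is the package ID
    return apic_id >> PKG_SHIFT

# (APIC ID, present) pairs taken from the quoted register lines only;
# reading the second column as a present flag is an assumption.
quoted = [
    (0x000, 1), (0x018, 1), (0x020, 1), (0x038, 1),
    (0x001, 1), (0x019, 1), (0x021, 1), (0x039, 1),
    (0x000, 0), (0x01f, 0), (0x020, 0), (0x03f, 0),
    (0x040, 0), (0x05f, 0), (0x060, 0), (0x077, 0),
]

# Packages seen when only present CPUs count vs. when all registrations count
present_pkgs = {package_of(a) for a, present in quoted if present}
all_pkgs = {package_of(a) for a, _ in quoted}

# Last APIC ID package 3 could hold, and how far 0x77 falls short of it
pkg3_last = ((package_of(0x077) + 1) << PKG_SHIFT) - 1
short_by = pkg3_last - 0x077

print(len(present_pkgs), len(all_pkgs), hex(pkg3_last), short_by)
# prints: 2 4 0x7f 8
```

Counting only present CPUs yields two packages, while counting every registration yields four, which is where the "Max. logical packages: 4" comes from.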

I bet that the ACPI patch cures it.

Thanks,

tglx