Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

From: Gabriel C
Date: Fri Jul 03 2020 - 13:08:09 EST


On Fri, Jul 3, 2020 at 17:58, Uladzislau Rezki <urezki@xxxxxxxxx> wrote:
>
> Hello, folk.

Hello,

>
> I have a system based on AMD 3970x CPUs. It has 32 physical cores
> and 64 threads. It seems that "nr_cpu_ids" variable is not correctly
> set on latest 5.8-rc3 kernel. Please have a look below on dmesg output:
>
> <snip>
> urezki@pc638:~$ sudo dmesg | grep 128
> [ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
> [ 0.000000] smpboot: Allowing 128 CPUs, 64 hotplug CPUs
> [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1
> ...
> [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=128, Nodes=1
> [ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=128.
> [ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=128
> urezki@pc638:~$
> <snip>
>
> For example SLUB thinks that it deals with 128 CPUs in the system what is
> wrong if i do not miss something. Since nr_cpu_ids is broken(?), thus the
> "cpu_possible_mask" does not correspond to reality as well.
>
> Any thoughts?

This is not a 5.8-rc3 problem. Almost all AMD CPUs and APUs look
like this.
The only machine I own that gets this right is a dual EPYC box;
everything else reports the wrong core/thread and socket counts.
The EPYC box probably gets it right because it uses the NUMA code
to obtain the info.

I reported that a while back and no-one ever cared.

There is even a comment in the hotplug code saying setting the wrong CPU count
is a waste of resources.

I have a 2200G that is reporting 48 cores.

An AMD Ryzen 7 3750H is reporting twice the cores and twice the sockets:

...

[ 0.040578] smpboot: Allowing 16 CPUs, 8 hotplug CPUs
...
[ 0.382122] smpboot: Max logical packages: 2
...
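
To make the arithmetic in that smpboot line explicit, here is a rough
sketch that pulls the two numbers out of a dmesg line and derives how
many CPUs were actually found. The helper name present_cpus() is made
up for illustration, and it assumes the "hotplug" figure is simply the
possible count minus the CPUs enumerated at boot.

```python
import re

# Matches the summary line smpboot prints at boot, e.g.
# "smpboot: Allowing 16 CPUs, 8 hotplug CPUs"
SMPBOOT_RE = re.compile(r"smpboot: Allowing (\d+) CPUs, (\d+) hotplug CPUs")


def present_cpus(dmesg_line):
    """Return (possible, hotplug, present) parsed from an smpboot line.

    Hypothetical helper. Assumes present = possible - hotplug.
    Returns None if the line does not match.
    """
    m = SMPBOOT_RE.search(dmesg_line)
    if m is None:
        return None
    possible = int(m.group(1))
    hotplug = int(m.group(2))
    return possible, hotplug, possible - hotplug


line = "[ 0.040578] smpboot: Allowing 16 CPUs, 8 hotplug CPUs"
print(present_cpus(line))  # (16, 8, 8)
```

So on the 3750H only 8 of the 16 "allowed" CPUs exist; the other 8 are
slots reserved for hotplug that will never be filled.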

I boot all the boxes restricting the cores to the correct count on the
command line.
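
A minimal sketch of how the right value could be worked out on a
running box: count the CPUs listed in /sys/devices/system/cpu/present
and turn that into a boot parameter. Which parameter to use is an
assumption here; possible_cpus= is shown, nr_cpus= is another
documented option. The function names are made up for illustration.

```python
def count_cpu_list(text):
    """Count CPUs in a kernel cpu-list string such as '0-7' or '0-3,8-11'."""
    total = 0
    for part in text.strip().split(","):
        if not part:
            continue  # empty list or stray comma
        lo, _, hi = part.partition("-")
        # A single number like "5" has no upper bound, so it counts as one CPU.
        total += (int(hi) if hi else int(lo)) - int(lo) + 1
    return total


def suggest_cmdline(present_path="/sys/devices/system/cpu/present"):
    """Build a possible_cpus= parameter from the sysfs 'present' list.

    Assumption: limiting the possible CPUs to the present ones is the
    restriction wanted; adjust if real CPU hotplug is expected.
    """
    with open(present_path) as f:
        return "possible_cpus=%d" % count_cpu_list(f.read())
```

Calling suggest_cmdline() on an affected machine returns a string that
can be appended to the kernel command line in the bootloader config.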

Wasted resource or not, this is still a bug IMO.

Best Regards,

Gabriel C