RE: [PATCH] cpu-topology: Skip the exist but not possible cpu nodes

From: Zengtao (B)
Date: Tue Jan 07 2020 - 21:02:04 EST


> -----Original Message-----
> From: Dietmar Eggemann [mailto:dietmar.eggemann@xxxxxxx]
> Sent: Tuesday, January 07, 2020 9:12 PM
> To: Zengtao (B); sudeep.holla@xxxxxxx
> Cc: Linuxarm; Greg Kroah-Hartman; Rafael J. Wysocki;
> linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] cpu-topology: Skip the exist but not possible cpu
> nodes
>
> On 07/01/2020 02:35, Zengtao (B) wrote:
> >> -----Original Message-----
> >> From: Dietmar Eggemann [mailto:dietmar.eggemann@xxxxxxx]
> >> Sent: Tuesday, January 07, 2020 2:42 AM
> >> To: Zengtao (B); sudeep.holla@xxxxxxx
> >> Cc: Linuxarm; Greg Kroah-Hartman; Rafael J. Wysocki;
> >> linux-kernel@xxxxxxxxxxxxxxx
> >> Subject: Re: [PATCH] cpu-topology: Skip the exist but not possible cpu
> >> nodes
> >>
> >> On 02/01/2020 04:24, Zeng Tao wrote:
> >>> When CONFIG_NR_CPUS is smaller than the number of cpu nodes defined in
> >>> the device tree, cpu node parsing fails. This is not reasonable for a
> >>> legal device tree config.
> >>> In this patch, skip such cpu nodes rather than return an error.
> >>
> >> Is this extra code really necessary?
> >>
> >> Currently you get warnings indicating that CONFIG_NR_CPUS is too small
> >> so you could correct the setup issue easily.
> >>
> >
> > It's not only about the warning messages. The problem is:
> > What are we expected to do if CONFIG_NR_CPUS is too small? I think there
> > are two choices:
> > 1. Keep the dts parsing result, but skip the CPU nodes whose id exceeds
> > CONFIG_NR_CPUS. This is what this patch does.
> > 2. Abort parsing of all the CPU nodes and use MPIDR to guess the
> > topology. This is what the current code does.
>
> Ah, you're referring to:
>
> 530 void __init init_cpu_topology(void)
> 531 {
> ...
> 540 else if (of_have_populated_dt() && parse_dt_topology())
> 541 --> reset_cpu_topology();
>
> With my Juno example (6 CPUs in DT but CONFIG_NR_CPUS=4):
>
> root@juno:~# dmesg | grep "\*\*\|mpidr"
> [ 0.084760] ** get_cpu_for_node() cpu=1
> [ 0.088706] ** get_cpu_for_node() cpu=2
> [ 0.092592] ** get_cpu_for_node() cpu=0
> [ 0.096550] ** get_cpu_for_node() cpu=3
> [ 0.105578] ** get_cpu_for_node() cpu=-19
> [ 0.116070] ** store_cpu_topology(): cpuid=0
> [ 0.120355] CPU0: cluster 1 core 0 thread -1 mpidr 0x00000080000100
> [ 0.242465] ** store_cpu_topology(): cpuid=1
> [ 0.242471] CPU1: cluster 0 core 0 thread -1 mpidr 0x00000080000000
> [ 0.286505] ** store_cpu_topology(): cpuid=2
> [ 0.286510] CPU2: cluster 0 core 1 thread -1 mpidr 0x00000080000001
> [ 0.330631] ** store_cpu_topology(): cpuid=3
> [ 0.330637] CPU3: cluster 1 core 1 thread -1 mpidr 0x00000080000101
>
> and with your patch:
>
> root@juno:~# dmesg | grep "\*\*\|mpidr"
> [ 0.084778] ** get_cpu_for_node() cpu=1
> [ 0.088742] ** get_cpu_for_node() cpu=2
> [ 0.092662] ** get_cpu_for_node() cpu=0
> [ 0.096627] ** get_cpu_for_node() cpu=3
> [ 0.107942] ** get_cpu_for_node() cpu=-19
> [ 0.119429] ** get_cpu_for_node() cpu=-19
> [ 0.123461] ** store_cpu_topology(): cpuid=0
> [ 0.243571] ** store_cpu_topology(): cpuid=1
> [ 0.287610] ** store_cpu_topology(): cpuid=2
> [ 0.331737] ** store_cpu_topology(): cpuid=3
>
> so we bail out of store_cpu_topology() since
> 'cpuid_topo->package_id != -1'.
>

Good, you got my point. I found this issue while testing a NUMA issue.
Thanks.
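
To illustrate the bail-out mentioned above, here is a minimal user-space sketch of the check, not the kernel code itself. The names model_store_topology(), model_reset_topology(), mpidr_aff() and struct cpu_topo are hypothetical stand-ins, NR_CPUS stands in for CONFIG_NR_CPUS, and the MPIDR decoding is simplified to the single-threaded case using only affinity levels 0 and 1:

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4 /* assumption: stands in for CONFIG_NR_CPUS */

/* Hypothetical, reduced version of a per-CPU topology entry. */
struct cpu_topo {
	int thread_id;
	int core_id;
	int package_id;
};

static struct cpu_topo topo[NR_CPUS];

/* Extract one 8-bit affinity field (levels 0..2) from an MPIDR value. */
static int mpidr_aff(uint64_t mpidr, int level)
{
	return (int)((mpidr >> (8 * level)) & 0xff);
}

/* Mark every entry as "not populated", as a topology reset would. */
static void model_reset_topology(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		topo[cpu] = (struct cpu_topo){ -1, -1, -1 };
}

/*
 * Returns 1 if the entry was filled from MPIDR, 0 if it was left alone
 * because package_id was already set (for example by DT parsing).
 */
static int model_store_topology(unsigned int cpu, uint64_t mpidr)
{
	struct cpu_topo *t;

	assert(cpu < NR_CPUS);
	t = &topo[cpu];

	if (t->package_id != -1)
		return 0; /* already populated: bail out */

	/* Simplification: single-threaded format, Aff0/Aff1 only. */
	t->thread_id = -1;
	t->core_id = mpidr_aff(mpidr, 0);
	t->package_id = mpidr_aff(mpidr, 1);
	return 1;
}
```

With all entries reset, calling model_store_topology(0, 0x80000100) yields package 1, core 0, thread -1, which matches the "CPU0: cluster 1 core 0 thread -1" line in the first log. If package_id was set beforehand, the call leaves the entry untouched, which corresponds to the second log where no MPIDR lines are printed.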

> > And I think choice 1 is better because:
> > 1. It's a legal dts, so we should get the same result whether
> > CONFIG_NR_CPUS is too small or not.
> > 2. The function of_parse_and_init_cpus already handles this the same way
> > as choice 1.
> >
> > But I am open on this issue; any suggestions are welcome.
>
> [...]
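
The two choices discussed in this thread can be sketched in plain C as follows. This is only a user-space model under stated assumptions, not the patch: model_get_cpu_for_node(), model_parse_abort() and model_parse_skip() are hypothetical names, NR_CPUS stands in for CONFIG_NR_CPUS, and a DT cpu node is reduced to an integer that maps to a logical CPU only if it is below NR_CPUS.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_CPUS 4 /* assumption: stands in for CONFIG_NR_CPUS */

/*
 * Hypothetical stand-in for the node-to-cpu lookup: nodes that do not
 * fit within NR_CPUS are reported as -ENODEV (-19 on Linux, as in the
 * logs quoted in this thread).
 */
static int model_get_cpu_for_node(int dt_cpu)
{
	return (dt_cpu >= 0 && dt_cpu < NR_CPUS) ? dt_cpu : -ENODEV;
}

/* Choice 2: stop at the first node that has no possible CPU. */
static int model_parse_abort(const int *nodes, size_t n, int *parsed)
{
	assert(nodes && parsed);
	*parsed = 0;
	for (size_t i = 0; i < n; i++) {
		int cpu = model_get_cpu_for_node(nodes[i]);

		if (cpu < 0)
			return cpu; /* caller would discard the DT topology */
		(*parsed)++;
	}
	return 0;
}

/* Choice 1: skip such nodes and keep what was parsed so far. */
static int model_parse_skip(const int *nodes, size_t n, int *parsed)
{
	assert(nodes && parsed);
	*parsed = 0;
	for (size_t i = 0; i < n; i++) {
		int cpu = model_get_cpu_for_node(nodes[i]);

		if (cpu < 0)
			continue; /* exists in DT but is not a possible CPU */
		(*parsed)++;
	}
	return 0;
}
```

For a node order like { 1, 2, 0, 3, 4, 5 } with NR_CPUS set to 4, both variants record four CPUs, but only model_parse_abort() returns -ENODEV, which is the case where the caller would throw away the DT result and fall back to MPIDR.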