Re: [PATCH] init: refactor the generic cpu_to_node for NUMA

From: Greg KH
Date: Thu Jan 18 2024 - 04:27:58 EST


On Thu, Jan 18, 2024 at 11:14:12AM +0800, Huang Shijie wrote:
> (0) We list the ARCHs which support the NUMA:
> arm64, loongarch, powerpc, riscv,
> sparc, mips, s390, x86,

I do not understand this format, what are you saying here?

Have you read the kernel documentation for how to write changelog texts?
It doesn't say "list a bunch of things", it's a bit more descriptive.

>
> (1) Some ARCHs in (0) override the generic cpu_to_node(), such as:
> sparc, mips, s390, x86.
>
> Since these ARCHs have their own cpu_to_node(), we do not care
> about them.
>
> (2) The ARCHs enable NUMA and use the generic cpu_to_node.
> From (0) and (1), we can know that four ARCHs support NUMA and
> use the generic cpu_to_node:
> arm64, loongarch, powerpc, riscv,
>
> The generic cpu_to_node depends on percpu "numa_node".
>
> (2.1) The loongarch sets "numa_node" in:
> start_kernel --> smp_prepare_boot_cpu()
>
> (2.2) The arm64, powerpc, riscv set "numa_node" in:
> start_kernel --> arch_call_rest_init() --> rest_init()
> --> kernel_init() --> kernel_init_freeable()
> --> smp_prepare_cpus()
>
> (2.3) The first place calling the cpu_to_node() is early_trace_init():
> start_kernel --> early_trace_init()--> __ring_buffer_alloc()
> --> rb_allocate_cpu_buffer()
>
> (2.4) So it safe for loongarch. But for arm64, powerpc and riscv,
> there are at least four places in the common code where
> the cpu_to_node() is called before it is initialized:
> a.) early_trace_init() in kernel/trace/trace.c
> b.) sched_init() in kernel/sched/core.c
> c.) init_sched_fair_class() in kernel/sched/fair.c
> d.) workqueue_init_early() in kernel/workqueue.c
>
> (3) In order to fix the issue, the patch refactors the generic cpu_to_node:
> (3.1) change cpu_to_node to function pointer,
> and export it for kernel modules.
>
> (3.2) introduce _cpu_to_node() which is the original cpu_to_node().
>
> (3.3) introduce smp_prepare_boot_cpu_start() to wrap the original
> smp_prepare_boot_cpu(), and set cpu_to_node with
> early_cpu_to_node which works fine for arm64, powerpc,
> riscv and loongarch.
>
> (3.4) introduce smp_prepare_cpus_done() to wrap the original
> smp_prepare_cpus().
> The "numa_node" is ready after smp_prepare_cpus(),
> then set cpu_to_node with _cpu_to_node().

When you start listing different things in a changelog, that's a hint to
the reviewer to say "please break this up" as patches need to do only
one thing at a time. As I can't follow the above text at all, that's
all the review comments I'm able to give here, sorry.

But as-is, this isn't acceptable :(

thanks,

greg k-h