Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id

From: Shijie Huang
Date: Mon Jan 22 2024 - 02:33:01 EST

Next message: Petr Vorel: "Re: [PATCH] keys, dns: Fix size check of V1 server-list header"
Previous message: Vinod Koul: "Re: [PATCH V2 01/13] ASoC/soundwire: implement generic api for scanning amd soundwire controller"
In reply to: Yury Norov: "Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id"
Next in thread: Mike Rapoport: "Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 2024/1/20 2:02, Yury Norov wrote:


On Fri, Jan 19, 2024 at 04:50:53PM +0800, Shijie Huang wrote:

On 2024/1/19 16:42, Mike Rapoport wrote:

On Fri, Jan 19, 2024 at 02:46:16PM +0800, Shijie Huang wrote:

On 2024/1/19 12:42, Yury Norov wrote:

This adds another level of indirection, I think. Currently cpu_to_node
is a simple inliner. After the patch it would be a real function with
all the associated overhead. Can you share a bloat-o-meter output here?

# ./scripts/bloat-o-meter vmlinux vmlinux.new
add/remove: 6/1 grow/shrink: 61/51 up/down: 1168/-588 (580)
Function                                     old     new   delta
numa_update_cpu                              148     244     +96
... (too many entries to list; the rest are skipped)
Total: Before=32990130, After=32990710, chg +0.00%

It's not only about text size, the indirect call also hurts performance.

cpu_to_node() is called at very low frequency, and mostly during kernel
boot.

That doesn't matter. This function is a simple inliner that dereferences
a pointer, and I believe all of us want to keep it simple.

Yes. I agree.

I want to keep it simple too.

Regardless, I don't think that the approach is correct. As per your
description, some initialization functions erroneously call
cpu_to_node() instead of early_cpu_to_node() which exists specifically
for that case.

If the above is correct, it's clearly a caller problem, and the fix is to
simply switch all those callers to use early version.
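The distinction under discussion can be illustrated with a small user-space model. This is only a sketch, not kernel code: every name prefixed with model_ is hypothetical, and it assumes that the early lookup reads a table filled while the topology is parsed, whereas the regular lookup reads per-CPU storage that is populated at a later boot step and is zero until then.

```c
/* User-space model of the two lookups; all model_* names are hypothetical. */
#include <assert.h>

#define MODEL_NR_CPUS 4

/* Table filled while the NUMA topology is parsed (available early). */
static int model_early_map[MODEL_NR_CPUS];

/* Stand-in for the per-CPU numa_node variable; zero until set up later. */
static int model_percpu_node[MODEL_NR_CPUS];

/* Early variant: reads the topology table directly. */
static int model_early_cpu_to_node(int cpu)
{
	return model_early_map[cpu];
}

/* Regular variant: reads the per-CPU copy, valid only after setup. */
static int model_cpu_to_node(int cpu)
{
	return model_percpu_node[cpu];
}

/* Models topology parsing: CPUs 0-1 on node 0, CPUs 2-3 on node 1. */
static void model_parse_topology(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_early_map[cpu] = cpu / 2;
}

/* Models the later step that copies the table into per-CPU storage. */
static void model_percpu_setup(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_percpu_node[cpu] = model_early_map[cpu];
}
```

In this model, a caller that runs between model_parse_topology() and model_percpu_setup() gets node 0 from model_cpu_to_node() for every CPU, while model_early_cpu_to_node() already returns the right node.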

It is easy to change to early_cpu_to_node() for sched_init(),
init_sched_fair_class() and workqueue_init_early(). These three places
call cpu_to_node() directly in an __init function.

But it is a little hard to change early_trace_init(), since it calls
cpu_to_node() deep in the call stack:

early_trace_init() --> ring_buffer_alloc() --> rb_allocate_cpu_buffer()

For early_trace_init(), we need to change more code.

Anyway, if we think it is not a good idea to change the common code, I am
okay with that too.

Is there a fundamental reason to have early_cpu_to_node() at all?

early_cpu_to_node() does not work on some architectures that support NUMA,
such as SPARC, MIPS and S390.

So, your approach wouldn't work either, right? I think you've got a
testing bot report on it already...

IMHO, my patch works fine for them.

They have their own cpu_to_node().

The x86 build reported a compile error, because x86 does not compile
drivers/base/arch_numa.c.

I have fixed it by moving cpu_to_node() from drivers/base/arch_numa.c to
drivers/base/node.c, which is built in for all the NUMA architectures.

You can make it like this:

#ifdef CONFIG_ARCH_NO_EARLY_CPU_TO_NODE
#define early_cpu_to_node cpu_to_node
#endif

Thanks. Adding this makes it more complicated.

It seems that all the mappings are known by the end of setup_arch() and the
initialization of numa_node can be moved earlier.

I would also initialize the numa_node with NUMA_NO_NODE at declaration,
so that if someone calls cpu_to_node() before the variable is properly
initialized at runtime, he'll get NO_NODE, which is obviously an error.

Even if we set numa_node to NUMA_NO_NODE, it does not always produce an
error.
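One possible reading of this remark is that many consumers of a node id, such as node-aware allocators, treat NUMA_NO_NODE as "no preference" rather than as a failure, so the bad value is accepted silently. The following user-space sketch models that behaviour; it is not kernel code, and the model_* names and the fallback-to-local-node policy are assumptions made for illustration.

```c
/* User-space model; all model_* names are hypothetical. */
#include <assert.h>

#define MODEL_NO_NODE (-1)	/* mirrors the meaning of NUMA_NO_NODE */

/* Node of the CPU that is doing the allocation in this model. */
static int model_local_node = 0;

/*
 * Picks the node an allocation would land on. A request for
 * MODEL_NO_NODE is not rejected: it falls back to the local node.
 */
static int model_pick_node(int requested)
{
	if (requested == MODEL_NO_NODE)
		return model_local_node;
	return requested;
}
```

Here model_pick_node(MODEL_NO_NODE) succeeds and returns the local node, so an uninitialized mapping would show up as misplaced memory rather than as a visible error.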

You can print this error yourself:

#ifndef cpu_to_node
static inline int cpu_to_node(int cpu)
{
	int node = per_cpu(numa_node, cpu);

#ifdef CONFIG_DEBUG_PER_CPU_MAPS
	if (node == NUMA_NO_NODE)
		pr_err(...);
#endif

	return node;
}
#endif

Thanks. I had a similar private patch to detect it.

After my patch, there is no need to detect the error again.

Thanks

Huang Shijie
