RE: How can I optimize a process on a NUMA architecture(x86-64 specifically)?

From: David Schwartz
Date: Sat May 22 2004 - 21:52:10 EST



> Let's say I have a 2 way opteron and want to run 4 long-lived processes.
> I fork and exec to create 1 of the processes, it chooses to run on
> processor 0 since processor 1 is overloaded at that time, so its
> homenode is processor 0. I fork and exec another, it chooses processor
> 0 since processors 1 is overloaded at that time. .. Let's say an uneven
> distribution is chosen for all 4 processes, with all processes mapped to
> processor 0. So they allocate on node 0 yet the scheduler will map these
> to both processors since CPU should be balanced. In this case, you will
> have a situation where the second processor will have to fetch memory
> from the other processor's memory.
>
> So a better solution would be to use numactl to set the homenodes
> explicitly, choosing processor 0 for 2 processes, processor 1 for the 2
> other processes.
>
> Is this incorrect?

Generally, yes, it is. Surprisingly so. If you assume everything is
perfect, then this seems true. But in the real world, it almost never works
that way.

Consider, for example, if process 1 is responsible for most of the memory
load at any particular time. If it's on CPU #1 and all its memory is on CPU
#1, then the memory controller on CPU #2 is underused and memory bandwidth
suffers. This is why most BIOSes interlave the memory pages across the CPUs.
This gives you the best chance of being able to use both memory controllers
under load.

I don't think we've reached the point yet where treating x86-64 systems as
NUMA machines makes very much sense.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/