Re: [PATCH 0/2] faking and fixing the NUMA SLIT

From: Joachim Deguara
Date: Wed Jul 18 2007 - 07:06:43 EST


On Wednesday 18 July 2007 11:42:20 Andi Kleen wrote:
> On Wednesday 18 July 2007 11:30:01 Joachim Deguara wrote:
> > The problem with NUMA distances in the SLIT is that they are often wrong,
> > oh wait they aren't there at all because the BIOS didn't create a SLIT
> > since Windows does not use it. If Linux does not find a slit it just
> > says the distance to local=10 and remote=20 according to ACPI spec. The
> > problem is when we have a 4P system (or larger), there is generally one
> > node where we have two hops and its distance should be >20.
> >
> > Following are patches to first fake the SLIT in the ACPI code and then
> > add ability to write the distances from sysfs.
>
> The main use for the SLIT information are the zone fallback lists in
> the VM. These are created at boot. If you change the SLIT later these
> won't be regenerated.
>
> The scheduler also uses it for load balancing, but it is much less
> important there than in the VM.

I looked at how node_distance() was called from page_alloc.c and sched.c but I
overlooked that those results are really only used at init time. So the
backing mechanism of patching the SLIT is right, but sysfs is way too late for
those uses, and creating a boot option is, as you say, ugly. It would be great
if the BIOS just did this for us, and did it correctly, but can we really
expect that ;)
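
To make the timing problem concrete, here is a rough userspace sketch, not
kernel code. The hop table for a square 4P layout, the "10 + 10 per hop"
scale, and the helper names (default_slit, fake_slit, build_fallback) are all
assumptions made up for illustration. The only values taken from this thread
are local=10 and remote=20. The real fallback-list code in page_alloc.c is
more involved than a plain sort by distance.

```c
#include <assert.h>

#define MAX_NODES       4
#define LOCAL_DISTANCE  10  /* distance of a node to itself */
#define REMOTE_DISTANCE 20  /* used for every remote node when no SLIT exists */

/* Hypothetical hop counts for a 4P square: links 0-1, 1-3, 3-2, 2-0. */
const int hops[MAX_NODES][MAX_NODES] = {
    {0, 1, 1, 2},
    {1, 0, 2, 1},
    {1, 2, 0, 1},
    {2, 1, 1, 0},
};

/* Stand-in for the distance table that node_distance() would read. */
int slit[MAX_NODES][MAX_NODES];

/* What we get when the BIOS provides no SLIT: 10 local, 20 for all remotes. */
void default_slit(int n)
{
    assert(n > 0 && n <= MAX_NODES);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            slit[i][j] = (i == j) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

/* Faked SLIT: assumed scale of 10 + 10 per hop, so two hops become 30. */
void fake_slit(int n)
{
    assert(n > 0 && n <= MAX_NODES);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            slit[i][j] = LOCAL_DISTANCE + 10 * hops[i][j];
}

/*
 * Simplified fallback order for one node: all nodes sorted by ascending
 * distance, ties kept in node-id order (stable insertion sort). The result
 * is a snapshot; it does not follow later changes to slit[][].
 */
void build_fallback(int node, int n, int order[])
{
    assert(n > 0 && n <= MAX_NODES && node >= 0 && node < n);
    for (int i = 0; i < n; i++) {
        int j = i;
        /* Shift nodes that are strictly farther than node i to the right. */
        while (j > 0 && slit[node][order[j - 1]] > slit[node][i]) {
            order[j] = order[j - 1];
            j--;
        }
        order[j] = i;
    }
}
```

If the order for node 1 is built while the default table is in place, it
comes out as 1, 0, 2, 3. Calling fake_slit() afterwards changes the table but
not the order that was already built; only a rebuild gives 1, 0, 3, 2, which
puts the two-hop node last. That is the same reason a sysfs write arrives too
late for lists that are created at boot.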

I would be happy to add it as a boot option if there is any popular support
for the idea.

-Joachim




