Re: [BUG][PATCH] fix mempolicy's check on a system with memory-less-node take2

From: Andi Kleen
Date: Thu Feb 08 2007 - 03:19:38 EST


On Thursday 08 February 2007 09:08, Andrew Morton wrote:
> On Thu, 8 Feb 2007 09:03:46 +0100 Andi Kleen <ak@xxxxxxx> wrote:
>
> > On Thursday 08 February 2007 09:00, Andrew Morton wrote:
> > > On Thu, 8 Feb 2007 08:49:41 +0100 Andi Kleen <ak@xxxxxxx> wrote:
> > >
> > > >
> > > > > This panic (hang) was found by a numa test-set on a system with 3 nodes, where
> > > > > node 2 was a memory-less node.
> > > >
> > > > I still think it's the wrong fix -- just get rid of the memory less node.
> > >
> > > "Let's break it even more"?
> >
> > I still don't get what you believe what would be broken then.
>
> A node with no memory is physical reality. The kernel should do its best
> to handle and report it accurately.

What would be the advantage of that?

The reason we present nodes to user space is so that we can tell the user
where the memory is. You seem to be trying to promote nodes to some abstract
entity beyond that, but that doesn't seem particularly fruitful to me. I think
I prefer "down to earth" memory nodes.

> Pretending that the CPUs on that node are
> local to a different node's memory (as I understand your proposal) goes
> against that.

Well, those CPUs are then most local to that other node's memory anyway.

>
> > > > I expect you'll likely run into more problems with that setup anyways.
> > >
> > > What happens if he doesn't run into more problems?
> >
> > Then he's lucky. I ran into problems at least when I still had the empty
> > nodes some time ago on x86-64. Christoph said SN2 is doing the same.
> >
> > iirc slab blew up at least, but that might be fixed by now. But it's a little risky
> > because there is more code now that is node aware.
> >
>
> Well... I'd suggest that we try to struggle on, get it working. Is there
> a downside to doing that?

The main problem I used to have was regressions: memory-less nodes in
different combinations would break in new releases. (On x86-64 this depends
on how the DIMMs are distributed, so sometimes breakage wasn't noticed for
a long time; we ended up trying all combinations on a simulator in the end.)
Assigning the CPUs to nearby nodes avoided that special case.

Ok, there is still one other case -- nodes that have memory, but not enough
to do anything useful. This can happen too, but is still somewhat different.
Basically all kmalloc_node() calls need to fail gracefully.

-Andi