Re: [bug] mm/slab.c boot crash in -git, "kernel BUG at mm/slab.c:2103!"

From: Pekka Enberg
Date: Fri Apr 11 2008 - 04:50:44 EST


Hi Ingo,
> On Fri, Apr 11, 2008 at 10:41 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
> > our x86.git randconfig auto-qa found a mm/slab.c early-bootup crash in
> > mainline that got introduced since v2.6.24.
> >
> > http://redhat.com/~mingo/misc/log-Thu_Apr_10_10_41_16_CEST_2008.bad
> > http://redhat.com/~mingo/misc/config-Thu_Apr_10_10_41_16_CEST_2008.bad
> >
> > Note, the very same bzImage does not crash on other testboxes - only on
> > this 8-way box with 4GB of RAM.
> >
> > i tried a "use v2.6.24's slab.c" revert (with a few API fixes needed for
> > it to build on .25) but that didn't solve the problem either.

On Fri, Apr 11, 2008 at 11:21 AM, Pekka Enberg <penberg@xxxxxxxxxxxxxx> wrote:
> As mentioned privately, I suspect it's the page allocator changes that
> went into 2.6.24. Mel, Christoph, any ideas?

So I'm thinking it's probably related to this patch:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=523b945855a1427000ffc707c610abe5947ae607

Since kmalloc_node() in setup_cpu_cache() returns NULL, the likely
cause is the use of GFP_THISNODE in cache_alloc_refill() when it calls
cache_grow(), together with a change in the semantics of that flag. I
have no idea why the page allocator would think your UMA "local node"
has no memory, though.
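
To make the suspected failure mode concrete, here is a toy userspace
model of a node-restricted allocation. It is only a sketch and not the
kernel code: toy_alloc_page(), struct toy_node and TOY_THISNODE are
made-up names, and the fallback order is an assumption. The only point
is that a "this node only" request fails outright when the chosen node
looks empty, even if memory is available elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag, loosely modeled on the idea behind GFP_THISNODE. */
#define TOY_THISNODE 0x1u

/* Hypothetical per-node state: just a count of free pages. */
struct toy_node {
	size_t free_pages;
};

/*
 * Take one page, preferring node 'nid'.
 *
 * With TOY_THISNODE set, only node 'nid' is considered and the request
 * fails if that node has no free pages. Without it, the remaining nodes
 * are scanned in index order as a fallback (an assumed policy).
 *
 * Returns the id of the node that satisfied the request, or -1.
 */
static int toy_alloc_page(struct toy_node *nodes, int nr_nodes, int nid,
			  unsigned int flags)
{
	int i;

	assert(nid >= 0 && nid < nr_nodes);

	if (nodes[nid].free_pages > 0) {
		nodes[nid].free_pages--;
		return nid;
	}

	/* Node-restricted request: no fallback, so report failure. */
	if (flags & TOY_THISNODE)
		return -1;

	for (i = 0; i < nr_nodes; i++) {
		if (nodes[i].free_pages > 0) {
			nodes[i].free_pages--;
			return i;
		}
	}

	return -1;
}
```

With two nodes where node 0 reports no free pages and node 1 has some,
a TOY_THISNODE request for node 0 fails, while the same request without
the flag is satisfied from node 1. On a UMA box there is only one node,
so the open question is why that node would look empty at all.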

Pekka