Re: WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413

From: Christoph Lameter
Date: Mon Nov 21 2011 - 22:18:59 EST


On Mon, 21 Nov 2011, Christian Kujau wrote:

> On Tue, 22 Nov 2011 at 07:27, Benjamin Herrenschmidt wrote:
> > Note that I hit a similar looking crash (sorry, I couldn't capture a
> > backtrace back then) on a PowerMac G5 (ppc64) while doing a large rsync
> > transfer yesterday with -rc2-something (cfcfc9ec) and
> > Christian Kujau (CC) seems to be able to reproduce something similar on
> > some other ppc platform (Christian, what is your setup ?)
>
> I seem to hit it when heavy disk & CPU I/O is in progress on this PowerBook
> G4. Full dmesg & .config: http://nerdbynature.de/bits/3.2.0-rc1/oops/
>
> I've enabled some debug options and now it really points to slub.c:2166

Hmmm... That means that c->page points to a page that is not frozen. Per cpu
partial pages are frozen until they are reused or until the partial list
is flushed.

Does this ever happen on x86 or only on other platforms? In put_cpu_partial()
the cmpxchg on the per cpu partial list really needs to be irq safe, but
this_cpu_cmpxchg is only preempt safe.

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c 2011-11-21 21:15:41.575673204 -0600
+++ linux-2.6/mm/slub.c 2011-11-21 21:16:33.442336849 -0600
@@ -1969,7 +1969,7 @@
 		page->pobjects = pobjects;
 		page->next = oldpage;

-	} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
+	} while (irqsafe_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
 	stat(s, CPU_PARTIAL_FREE);
 	return pobjects;
 }
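
To illustrate why preempt safety alone may not be enough here, below is a
rough userspace sketch, not kernel code. It assumes the cmpxchg falls back
to a plain read/compare/write sequence on platforms without a suitable
single instruction, which is my understanding of the generic fallback.
struct fake_page, simulated_irq() and the two *_cmpxchg helpers are
hypothetical stand-ins invented for this example. The "interrupt" is fired
by hand inside the window between the compare and the store.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the list link matters. */
struct fake_page {
	struct fake_page *next;
};

static struct fake_page *cpu_partial;	/* stands in for s->cpu_slab->partial */
static struct fake_page *irq_page;	/* page the simulated interrupt will push */
static bool irqs_disabled;

/* Simulated interrupt handler: pushes irq_page onto the list exactly once. */
static void simulated_irq(void)
{
	if (irqs_disabled || !irq_page)
		return;
	irq_page->next = cpu_partial;
	cpu_partial = irq_page;
	irq_page = NULL;
}

/* Assumed fallback: nothing stops an interrupt between compare and store. */
static struct fake_page *preempt_safe_cmpxchg(struct fake_page **p,
		struct fake_page *old, struct fake_page *new)
{
	struct fake_page *cur = *p;

	if (cur == old) {
		simulated_irq();	/* interrupt lands inside the window */
		*p = new;		/* overwrites whatever the handler stored */
	}
	return cur;
}

/* Same sequence, but the interrupt is held off until the store is done. */
static struct fake_page *irq_safe_cmpxchg(struct fake_page **p,
		struct fake_page *old, struct fake_page *new)
{
	struct fake_page *cur;

	irqs_disabled = true;
	cur = *p;
	if (cur == old) {
		simulated_irq();	/* no effect: interrupts are "disabled" */
		*p = new;
	}
	irqs_disabled = false;
	simulated_irq();		/* pending interrupt is delivered now */
	return cur;
}

typedef struct fake_page *(*cmpxchg_fn)(struct fake_page **,
		struct fake_page *, struct fake_page *);

/* Same retry loop shape as the one touched by the patch. */
static void push_partial(struct fake_page *page, cmpxchg_fn cmpxchg)
{
	struct fake_page *oldpage;

	do {
		oldpage = cpu_partial;
		page->next = oldpage;
	} while (cmpxchg(&cpu_partial, oldpage, page) != oldpage);
}

/* One push racing with one interrupt; returns the resulting list length. */
static int run_scenario(cmpxchg_fn cmpxchg)
{
	static struct fake_page a, b;
	struct fake_page *p;
	int n = 0;

	a.next = b.next = NULL;
	cpu_partial = NULL;
	irq_page = &b;
	irqs_disabled = false;

	push_partial(&a, cmpxchg);
	for (p = cpu_partial; p; p = p->next)
		n++;
	return n;
}
```

With preempt_safe_cmpxchg the page pushed by the handler is silently
dropped from the list and the loop does not notice, since the comparison
already passed. With irq_safe_cmpxchg both pages end up on the list.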
