Re: 2.6.15-mm4 failure on power5

From: Ingo Molnar
Date: Wed Jan 18 2006 - 01:36:18 EST



* Dave C Boutcher <sleddog@xxxxxxxxxx> wrote:

> On Wed, Jan 18, 2006 at 11:19:36AM +1100, Michael Ellerman wrote:
> > It booted fine _with_ the patch applied, with DEBUG_MUTEXES=y and n.
> >
> > Boutcher, to be clear, you can't boot with kernel-kernel-cpuc-to-mutexes.patch
> > applied and DEBUG_MUTEXES=y ?
> >
> > But if you revert kernel-kernel-cpuc-to-mutexes.patch it boots ok?
> >
> > This is looking quite similar to another hang we're seeing on Power4 iSeries
> > on mainline git:
> > http://ozlabs.org/pipermail/linuxppc64-dev/2006-January/007679.html
>
> Correct...I die in exactly the same place every time with
> DEBUG_MUTEXES=Y. I posted a backtrace that points into the _lock_cpu
> code, but I haven't really dug into the issue yet. I believe this is
> very timing related (Serge was dying slightly differently).

so my question still is: _without_ the workaround patch, i.e. with
vanilla -mm4, and DEBUG_MUTEXES=n, do you get a hang?

the reason for my question is that DEBUG_MUTEXES=y will e.g. enable
interrupts - so buggy early bootup code which relies on interrupts being
off might be surprised by it. The fact that you observed that it's
somehow related to the timer interrupt seems to strengthen this
suspicion. DEBUG_MUTEXES=n on the other hand should have no such
interrupt-enabling effects.

[ if this indeed is the case then i'll add irqs_off() checks to
DEBUG_MUTEXES=y, to ensure that the mutex APIs are never called with
interrupts disabled. ]

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/