Re: [PATCH 0/2] printk vs rq->lock and xtime lock

From: Peter Zijlstra
Date: Mon Mar 24 2008 - 14:16:32 EST


On Mon, 2008-03-24 at 10:58 -0700, Linus Torvalds wrote:
>
> On Mon, 24 Mar 2008, Peter Zijlstra wrote:
> >
> > As to the regression reported by Marcin; what happens is that we invoke
> > printk() while holding the xtime lock for writing. printk() will call
> > wake_up_klogd() which tries to enqueue klogd on some rq.
> >
> > The known deadlock here is calling printk() while holding rq->lock, which
> > would then try to recursively lock the rq again when trying to wake klogd.
>
> Ok.
>
> Right now, however, I think that for 2.6.25 I'll just remove the printk.
>
> And for the long haul, I really don't think the solution is
> "printk_nowakup()", because this is going to happen again when somebody
> doesn't realize the code is called with the rq lock held, and it's going
> to be a bitch to debug.

Yeah, we get the printk vs rq->lock thing on a regular basis; the xtime
lock is new.

If the NMI watchdog works, it's rather easy to debug.

> I just don't think this is maintainable.

I'm afraid I'll have to agree.

How about I use the lockdep infrastructure to check if printk() is
invoked while holding either the xtime or rq lock, and then avoid calling
wake_up_klogd()? That way, we at least get sane debug output when the
lock debugging infrastructure is enabled.
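
To make the idea concrete, here is a rough userspace sketch of the check,
not kernel code. Every sketch_* name is hypothetical, and the per-lock
"held" flag is only a stand-in for whatever held-lock tracking lockdep
would actually provide. The message is always emitted; only the klogd
wakeup is skipped when one of the two locks is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for lockdep's held-lock tracking (assumption). */
struct sketch_lock {
	const char *name;
	bool held;
};

static struct sketch_lock sketch_rq_lock = { "rq->lock", false };
static struct sketch_lock sketch_xtime_lock = { "xtime_lock", false };

/* Counts simulated klogd wakeups so the behaviour can be observed. */
static int sketch_klogd_wakeups;

static void sketch_lock_acquire(struct sketch_lock *l) { l->held = true; }
static void sketch_lock_release(struct sketch_lock *l) { l->held = false; }

/* Waking klogd is only safe when neither problematic lock is held. */
static bool sketch_wakeup_is_safe(void)
{
	return !sketch_rq_lock.held && !sketch_xtime_lock.held;
}

/* Simulated wakeup; the real one would enqueue klogd on some rq. */
static void sketch_wake_up_klogd(void)
{
	sketch_klogd_wakeups++;
}

/*
 * Emit the message unconditionally, then wake klogd only if that cannot
 * recurse into a lock the caller already holds.
 * Returns true if the wakeup was issued, false if it was skipped.
 */
static bool sketch_printk(const char *msg)
{
	fputs(msg, stderr);
	if (!sketch_wakeup_is_safe())
		return false;
	sketch_wake_up_klogd();
	return true;
}
```

Calling sketch_printk() between sketch_lock_acquire(&sketch_xtime_lock)
and the matching release leaves the wakeup counter untouched, while a
call outside that window increments it. Since the check would only exist
with lock debugging enabled, this helps debugging and does not fix the
underlying problem.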

As for removing the printk(), Thomas, do you see any other sane way to
relay that information?