Re: [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead

From: Jan Kara
Date: Wed Feb 06 2013 - 18:02:42 EST


On Tue 05-02-13 20:05:48, Steven Rostedt wrote:
> [ I sent this in a reply to another thread, but wanted a bit more attention to it ]
>
> To prevent deadlocks with doing a printk inside the scheduler,
> printk_sched() was created. The issue is that printk has a console_sem
> that it can grab and release. The release does a wake up if there's a
> task pending on the sem, and this wake up grabs the rq lock that is
> held in the scheduler. This leads to a possible deadlock if the wake up
> uses the same rq as the one with the rq lock held already.
>
> What printk_sched() does is to save the printk write in a per cpu buffer
> and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> set, the printk() is done against the buffer.
>
> There's a couple of issues with this approach.
>
> 1) If two printk_sched()s are called before the tick, the second one
> will overwrite the first one.
>
> 2) The temporary buffer is 512 bytes and is per cpu. This is quite a
> bit of space wasted for something that is seldom used.
>
> In order to remove this, the printk_sched() can use the printk
> buffer instead, and delay the console_trylock()/console_unlock() to the
> tick.
>
> Because printk_sched() would then be taking the logbuf_lock, the
> logbuf_lock must not be held while doing anything that may call into the
> scheduler functions, which includes wake ups. Unfortunately, printk()
> also has a console_sem that it uses, and on release, the
> up(&console_sem) may do a wake up of any pending waiters. This must be
> avoided while holding the logbuf_lock.
>
> Luckily, there's not many places that do the unlock, or hold the
> logbuf_lock. By moving things around a little, the console_sem can be
> released without ever holding the logbuf_lock, and we can safely have
> printk_sched() use the printk buffer directly.
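
For anyone following along, the difference between the two schemes
described above can be sketched in userspace roughly as below. This is
only an illustration, not kernel code: slot_buf, log_buf, LOG_SIZE and
all the function names are made up for the example, and there is no
locking at all. Only the 512 byte slot size and the "write now, do the
console work on the tick" idea come from the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SLOT_SIZE 512   /* per-cpu buffer size mentioned above */
#define LOG_SIZE  4096  /* hypothetical size of the shared log */

/* Scheme A: one message slot per cpu, flushed on the tick. */
struct slot_buf {
	char text[SLOT_SIZE];
	int pending;	/* stands in for PRINTK_PENDING_SCHED */
};

/* Scheme B: append to the shared log, defer only the console work. */
struct log_buf {
	char text[LOG_SIZE];
	size_t used;
	int pending;
};

static void slot_write(struct slot_buf *b, const char *msg)
{
	assert(b && msg);
	/* A second call before the tick overwrites the first message. */
	snprintf(b->text, sizeof(b->text), "%s", msg);
	b->pending = 1;
}

static void log_write(struct log_buf *b, const char *msg)
{
	size_t len;

	assert(b && msg);
	len = strlen(msg);
	if (b->used + len + 1 > sizeof(b->text))
		return;	/* sketch only: drop the message on overflow */
	memcpy(b->text + b->used, msg, len + 1);
	b->used += len;
	b->pending = 1;
}

/* Count newline-terminated messages in a buffer. */
static int count_lines(const char *s)
{
	int n = 0;

	for (; *s; s++)
		if (*s == '\n')
			n++;
	return n;
}

/* "Tick": return how many messages would reach the console. */
static int slot_tick(struct slot_buf *b)
{
	int n;

	if (!b->pending)
		return 0;
	n = count_lines(b->text);
	b->pending = 0;
	return n;
}

static int log_tick(struct log_buf *b)
{
	int n;

	if (!b->pending)
		return 0;
	n = count_lines(b->text);
	b->used = 0;
	b->text[0] = '\0';
	b->pending = 0;
	return n;
}
```

With two writes before a tick, the slot variant hands only the second
message to the console, while the log variant keeps both.
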

So after quite some experiments and some hair tearing I have a patch that
uses PRINTK_PENDING_OUTPUT and makes the machine survive my heavy-printk
test. The first patch I attach is actually a small improvement of your
patch which I think can be folded into it. I was also wondering whether we
still need printk_needs_cpu(). I left it in since I don't know of a
better way of keeping at least one CPU ticking. But maybe others do?

The second patch then makes use of PRINTK_PENDING_OUTPUT to handle the
printing when console_unlock() would take too long. If you wonder whether
the last_printing_cpu in printk_tick() is necessary: it is. Without it
we keep printing on one CPU and the machine complains, loses drives,
etc. (I guess I should add this as a comment somewhere in the code.)
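
To illustrate the handoff idea only (this is not the code from the
attached patch), here is a rough userspace sketch. It assumes a backlog
counter standing in for PRINTK_PENDING_OUTPUT and a fixed batch per tick;
struct printk_state, printk_tick_sim(), NR_CPUS_SIM and BATCH are
hypothetical names made up for the example.

```c
#include <assert.h>

#define NR_CPUS_SIM 4	/* hypothetical number of CPUs */
#define BATCH       8	/* hypothetical number of lines printed per tick */

struct printk_state {
	int pending_lines;	/* nonzero stands in for PRINTK_PENDING_OUTPUT */
	int last_printing_cpu;	/* -1 while nobody has printed yet */
};

/* Simulated tick on @cpu; returns how many lines this CPU printed. */
static int printk_tick_sim(struct printk_state *st, int cpu)
{
	int n;

	assert(st && cpu >= 0 && cpu < NR_CPUS_SIM);

	if (st->pending_lines == 0)
		return 0;
	/* This CPU printed the previous batch: leave the rest to another one. */
	if (cpu == st->last_printing_cpu)
		return 0;

	n = st->pending_lines < BATCH ? st->pending_lines : BATCH;
	st->pending_lines -= n;
	st->last_printing_cpu = cpu;
	return n;
}
```

The point of the skip is that no CPU handles two batches in a row, so a
long backlog gets spread over several CPUs instead of pinning one. Note
that the sketch would stall on a single CPU; it ignores that case.
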

Anyway, what do you guys think about this version?

Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR