Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

From: Sergey Senozhatsky
Date: Thu Dec 10 2015 - 09:54:54 EST


Hello,

*** in this email and in every later email ***
Sorry if I messed up the Cc list or message-ids. It's surprisingly
hard to jump into a loop that has never been in your inbox. It took
some `googling' effort.

I haven't tested the patch set yet; I have only 'ported' it to linux-next.
As a first step I reverted 073696a8bc7779b ("printk: do cond_resched() between
lines while outputting to consoles"), but that change comes back in later in
the series. I can send out the updated series (off list is OK).

> Currently, console_unlock() prints messages from kernel printk buffer to
> console while the buffer is non-empty. When serial console is attached,
> printing is slow and thus other CPUs in the system have plenty of time
> to append new messages to the buffer while one CPU is printing. Thus the
> CPU can spend unbounded amount of time doing printing in console_unlock().
> This is especially serious problem if the printk() calling
> console_unlock() was called with interrupts disabled.
>
> In practice users have observed a CPU can spend tens of seconds printing
> in console_unlock() (usually during boot when hundreds of SCSI devices
> are discovered) resulting in RCU stalls (CPU doing printing doesn't
> reach quiescent state for a long time), softlockup reports (IPIs for the
> printing CPU don't get served and thus other CPUs are spinning waiting
> for the printing CPU to process IPIs), and eventually a machine death
> (as messages from stalls and lockups append to printk buffer faster than
> we are able to print). So these machines are unable to boot with serial
> console attached. Also during artificial stress testing SATA disk
> disappears from the system because its interrupts aren't served for too
> long.
>
> This patch implements a mechanism where after printing specified number
> of characters (tunable as a kernel parameter printk.offload_chars), CPU
> doing printing asks for help by waking up one of dedicated kthreads. As
> soon as the printing CPU notices kthread got scheduled and is spinning
> on print_lock dedicated for that purpose, it drops console_sem,
> print_lock, and exits console_unlock(). Kthread then takes over printing
> instead. This way no CPU should spend printing too long even if there
> is heavy printk traffic.
>
> Signed-off-by: Jan Kara <jack@xxxxxxx>
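
To make the hand-over rule described above a bit more concrete, here is a
rough userspace model of the decision the printing CPU makes after each line.
It is only a sketch, not the code from the patch: `struct printer_state',
`enum print_action' and account_line() are hypothetical names, and treating
offload_chars == 0 as "offloading disabled" is an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the printing CPU's bookkeeping (not kernel code). */
struct printer_state {
    size_t offload_chars;  /* models printk.offload_chars; 0 = never offload (assumption) */
    size_t printed;        /* characters printed by the current owner */
    bool helper_woken;     /* a helper kthread has already been asked for help */
    bool helper_spinning;  /* a helper is spinning on print_lock */
};

enum print_action {
    PRINT_CONTINUE,     /* keep printing */
    PRINT_WAKE_HELPER,  /* limit reached: wake one helper, keep printing */
    PRINT_HAND_OVER     /* helper is spinning: drop the locks and leave */
};

/* Account for one printed line of `len' characters and decide what to do. */
static enum print_action account_line(struct printer_state *st, size_t len)
{
    assert(st != NULL);

    st->printed += len;

    /* Below the limit, or offloading disabled: nothing to do. */
    if (st->offload_chars == 0 || st->printed < st->offload_chars)
        return PRINT_CONTINUE;

    /* A helper is already spinning, so printing can be handed over. */
    if (st->helper_spinning) {
        st->printed = 0;
        st->helper_woken = false;
        st->helper_spinning = false;
        return PRINT_HAND_OVER;
    }

    /* Ask for help once, then keep printing until the helper shows up. */
    if (!st->helper_woken) {
        st->helper_woken = true;
        return PRINT_WAKE_HELPER;
    }

    return PRINT_CONTINUE;
}
```

The point the model tries to capture is that the printing CPU never sleeps
waiting for the helper: it keeps printing and only leaves once the helper is
known to be spinning on the lock.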

I think we'd better use a raw_spin_lock for print_lock; apart from that, it
seems that we don't re-init it in zap_locks(). So I ended up with the following
patch on top of yours (to be folded):

- use raw_spin_lock
- do not forget to re-init `print_lock' in zap_locks()
---
kernel/printk/printk.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index d986599..2a86ff1 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -85,7 +85,7 @@ EXPORT_SYMBOL_GPL(console_drivers);
* we can spin on it when some other thread wants to take over printing to
* console.
*/
-static DEFINE_SPINLOCK(print_lock);
+static DEFINE_RAW_SPINLOCK(print_lock);

/*
* Number of printing threads spinning on print_lock. Can go away once
@@ -1516,6 +1516,7 @@ static void zap_locks(void)
/* If a crash is occurring, make sure we can't deadlock */
raw_spin_lock_init(&logbuf_lock);
/* And make sure that we print immediately */
+ raw_spin_lock_init(&print_lock);
sema_init(&console_sem, 1);
}

@@ -2311,7 +2312,7 @@ void console_unlock(void)
console_cont_flush(text, sizeof(text));
again:
retry = false;
- spin_lock_irqsave(&print_lock, flags);
+ raw_spin_lock_irqsave(&print_lock, flags);
for (;;) {
struct printk_log *msg;
size_t ext_len = 0;
@@ -2410,7 +2411,7 @@ skip:
* succeeds in getting console_sem (unless someone else takes it and
* then he'll be responsible for printing).
*/
- spin_unlock_irqrestore(&print_lock, flags);
+ raw_spin_unlock_irqrestore(&print_lock, flags);

/*
* In case we cannot trylock the console_sem again, there's a new owner
@@ -2773,9 +2774,9 @@ static int printing_task(void *arg)
* want to sleep once we got scheduled to make sure we take
* over printing without depending on the scheduler.
*/
- spin_lock_irqsave(&print_lock, flags);
+ raw_spin_lock_irqsave(&print_lock, flags);
atomic_dec(&printing_tasks_spinning);
- spin_unlock_irqrestore(&print_lock, flags);
+ raw_spin_unlock_irqrestore(&print_lock, flags);
if (console_trylock())
console_unlock();
preempt_enable();
--
2.6.3
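
For illustration, the zap_locks() hunk above can be modelled in userspace
roughly like this. It is a toy sketch under stated assumptions, not kernel
code: `struct toy_lock' and the toy_*() helpers are hypothetical stand-ins
built on C11 atomics. The idea is that a lock left held by the crashed context
can never be taken again unless the crash path re-initializes it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a raw spinlock (hypothetical, userspace only). */
struct toy_lock {
    atomic_bool locked;
};

/* Put the lock into the unlocked state, whatever state it was in. */
static void toy_lock_init(struct toy_lock *l)
{
    assert(l != NULL);
    atomic_store(&l->locked, false);
}

/* Returns true if the lock was free and is now held by the caller. */
static bool toy_trylock(struct toy_lock *l)
{
    assert(l != NULL);
    return !atomic_exchange(&l->locked, true);
}

/*
 * Model of zap_locks(): on a crash, forcibly reset every lock the
 * printing path needs, including the new print lock.
 */
static void toy_zap_locks(struct toy_lock *logbuf, struct toy_lock *print)
{
    toy_lock_init(logbuf);
    toy_lock_init(print);
}
```

If the second toy_lock_init() call is left out, a crash while the print lock
is held leaves every later toy_trylock() on it failing, which is the deadlock
the extra raw_spin_lock_init() is meant to avoid.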
