Re: [PATCH v2 1/1] printk: suppress rcu stall warnings caused by slow console devices

From: Wander Costa
Date: Fri Nov 12 2021 - 09:42:57 EST


On Thu, Nov 11, 2021 at 10:42 PM Sergey Senozhatsky
<senozhatsky@xxxxxxxxxxxx> wrote:
>
> On (21/11/11 16:59), Wander Lairson Costa wrote:
> >
> > If we have a reasonably large dataset to flush in the printk ring
> > buffer in the presence of a slow console device (like a serial port
> > with a low baud rate configured), the RCU stall detector may report
> > warnings.
> >
> > This patch suppresses RCU stall warnings while flushing the ring buffer
> > to the console.
> >
> [..]
> > +extern int rcu_cpu_stall_suppress;
> > +
> > +static void rcu_console_stall_suppress(void)
> > +{
> > +	if (!rcu_cpu_stall_suppress)
> > +		rcu_cpu_stall_suppress = 4;
> > +}
> > +
> > +static void rcu_console_stall_unsuppress(void)
> > +{
> > +	if (rcu_cpu_stall_suppress == 4)
> > +		rcu_cpu_stall_suppress = 0;
> > +}
> > +
> > /**
> > * console_unlock - unlock the console system
> > *
> > @@ -2634,6 +2648,9 @@ void console_unlock(void)
> > 	 * and cleared after the "again" goto label.
> > 	 */
> > 	do_cond_resched = console_may_schedule;
> > +
> > +	rcu_console_stall_suppress();
> > +
> > again:
> > 	console_may_schedule = 0;
> >
> > @@ -2645,6 +2662,7 @@ void console_unlock(void)
> > 	if (!can_use_console()) {
> > 		console_locked = 0;
> > 		up_console_sem();
> > +		rcu_console_stall_unsuppress();
> > 		return;
> > 	}
> >
> > @@ -2716,8 +2734,10 @@ void console_unlock(void)
> >
> > 		handover = console_lock_spinning_disable_and_check();
> > 		printk_safe_exit_irqrestore(flags);
> > -		if (handover)
> > +		if (handover) {
> > +			rcu_console_stall_unsuppress();
> > 			return;
> > +		}
> >
> > 		if (do_cond_resched)
> > 			cond_resched();
> > @@ -2738,6 +2758,8 @@ void console_unlock(void)
> > 	retry = prb_read_valid(prb, next_seq, NULL);
> > 	if (retry && console_trylock())
> > 		goto again;
> > +
> > +	rcu_console_stall_unsuppress();
> > }
>
> Maybe we can just start touching watchdogs from the printing routine?
>
Hrm, console_unlock() is called from vprintk_emit() [0] with preemption
disabled, and it already has logic to call cond_resched() when
possible [1].

[0] https://elixir.bootlin.com/linux/latest/source/kernel/printk/printk.c#L2244
[1] https://elixir.bootlin.com/linux/latest/source/kernel/printk/printk.c#L2719
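
To make the point concrete, here is a rough userspace sketch (not kernel
code) of how the do_cond_resched gate behaves. The names
model_console_flush() and model_cond_resched() are made up for
illustration, and it assumes the printk path enters console_unlock()
with console_may_schedule cleared, so the flush loop never yields there.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the gate in console_unlock(). */
static int resched_calls;	/* counts simulated cond_resched() calls */

/* Stand-in for cond_resched(); only records that we would have yielded. */
static void model_cond_resched(void)
{
	resched_calls++;
}

/*
 * Flush `records` records. Yield after each record only if the caller
 * was allowed to schedule when it took the console lock. Returns the
 * number of times the loop yielded.
 */
static int model_console_flush(int records, bool may_schedule)
{
	/* Snapshot once, as console_unlock() does before the "again" label. */
	bool do_cond_resched = may_schedule;
	int i;

	resched_calls = 0;
	for (i = 0; i < records; i++) {
		/* The record would be written to the slow console here. */
		if (do_cond_resched)
			model_cond_resched();
	}
	return resched_calls;
}
```

With may_schedule set to false (the case I believe applies when coming
from vprintk_emit()), the loop flushes every record without yielding
once, which is the window where the stall detector can fire.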