Re: [RFC PATCH 1/2] rcu: sysctl: Panic on RCU Stall

From: Josh Triplett
Date: Tue May 31 2016 - 15:23:39 EST


On Tue, May 31, 2016 at 12:18:27PM -0700, Josh Triplett wrote:
> On Tue, May 31, 2016 at 04:07:32PM -0300, Daniel Bristot de Oliveira wrote:
> > It is not always easy to define the cause of an RCU stall just by
> > analysing the RCU stall messages, mainly when the problem is caused
> > by the indirect starvation of rcu threads. For example, when preempt_rcu
> > is not awakened due to the starvation of a timer softirq.
> >
> > We have been hard coding panic() in the RCU stall functions for
> > some time while testing the kernel-rt. But this is not possible in
> > some scenarios, like when supporting customers.
> >
> > This patch implements the sysctl kerner.panic_on_rcu_stall. If
> > set to 1, the system will panic() when an RCU stall takes place,
> > enabling the capture of a vmcore. The vmcore provides a way to analyze
> > all kernel/tasks states, helping out to point to the culprit and the
> > solution for the stall.
> >
> > The kerner.panic_on_rcu_stall sysctl is disabled by default.
>
> s/kerner/kernel/ (here and in the previous paragraph).
>
> Also, even though it's only two lines, please consider creating a static
> function wrapping the if and panic, to avoid duplication.
>
> With those changes,
> Reviewed-by: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
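
To make the wrapper suggestion concrete, here is a minimal userspace sketch, not the patch itself. The helper name panic_on_rcu_stall(), the mock_panic() stand-in for the kernel's panic(), and the two report_stall_*() call sites are all hypothetical names chosen for illustration; only sysctl_panic_on_rcu_stall mirrors the sysctl described above.

```c
#include <stdio.h>

/* Stand-in for the proposed sysctl; 0 (disabled) by default. */
static int sysctl_panic_on_rcu_stall;

/*
 * Userspace stand-in for the kernel's panic(), which never returns.
 * Here it only counts calls so the sketch can be run and checked.
 */
static int mock_panic_calls;

static void mock_panic(const char *msg)
{
	mock_panic_calls++;
	fprintf(stderr, "panic: %s\n", msg);
}

/* Hypothetical helper: the one place that holds the "if + panic" pair. */
static void panic_on_rcu_stall(void)
{
	if (sysctl_panic_on_rcu_stall)
		mock_panic("RCU Stall");
}

/*
 * Hypothetical stand-ins for two stall-reporting paths that would
 * otherwise each carry their own copy of the check.
 */
static void report_stall_other_cpu(void)
{
	/* ... stall messages would be printed here ... */
	panic_on_rcu_stall();
}

static void report_stall_this_cpu(void)
{
	/* ... stall messages would be printed here ... */
	panic_on_rcu_stall();
}
```

With the check in one static function, a later change to the condition or the panic message touches a single spot instead of every reporting path.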

Sorry, realized something else a moment after sending: I don't think
this will build if you use the tiny RCU implementation. That
implementation *does* support tracing, and if you enable tracing,
you'll have CONFIG_RCU_STALL_COMMON=y, but you won't build tree.c where
the variable definition lives. So, the sysctl code will reference a
variable that doesn't exist.
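
One possible way out, sketched in userspace C with #define stand-ins for the Kconfig symbols: keep the variable definition under the same symbol that guards the sysctl entry, in code that is built for every RCU flavor. Where that definition would actually live in the tree is an assumption here, and panic_on_rcu_stall_sysctl_data() is a hypothetical stand-in for the sysctl table's reference to the variable.

```c
/* Stand-in for Kconfig: tiny RCU with tracing enabled sets this symbol. */
#define CONFIG_RCU_STALL_COMMON 1
/* CONFIG_TREE_RCU is deliberately left undefined: tree.c is not built. */

/*
 * Definition placed in code shared by all RCU flavors (assumed location),
 * guarded by the same symbol as its user below.
 */
#ifdef CONFIG_RCU_STALL_COMMON
int sysctl_panic_on_rcu_stall;	/* 0: disabled by default */
#endif

/*
 * sysctl-side user. Because it sits under the same guard as the
 * definition, it can never reference a variable that was not built.
 */
#ifdef CONFIG_RCU_STALL_COMMON
extern int sysctl_panic_on_rcu_stall;

static int *panic_on_rcu_stall_sysctl_data(void)
{
	return &sysctl_panic_on_rcu_stall;
}
#endif
```

If the definition were instead wrapped in #ifdef CONFIG_TREE_RCU, this configuration would leave the reference in panic_on_rcu_stall_sysctl_data() unresolved, which is the failure described above.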

- Josh Triplett