Re: [PATCH 2/2] timer: really raise softirq if there is irq_work todo

From: Steven Rostedt
Date: Fri Jan 31 2014 - 12:08:09 EST


On Fri, 31 Jan 2014 15:34:05 +0100
Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:

> from looking at the code, it seems that the softirq is only raised (in
> the !base->active_timers case) if we have also an expired timer
> (time_before_eq() is true). This patch ensures that the timer softirq is
> also raised in the !base->active_timers && no timer expired.

A couple of things. If there are no active timers, we do not need to
check for expired timers. The base may then hold only a deferrable
timer, which does not need to fire while the system is idle. Raising
the softirq for it will just re-introduce the problems that other
people have been seeing.

The bug that I found is the case where there *are* active timers, but
they have not expired yet. Why is this a problem? Because in that case
we do not check if there is irq_work to be done. That means the irq_work
will have to wait till the timer expires, and since RCU depends on this,
that can take a while. I've had a synchronize_sched() take up to 5
seconds to complete due to this!
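
To make the missed case concrete, here is a small user-space sketch of
the decision, not kernel code. The function names and the three boolean
inputs are made up for illustration: they stand in for
irq_work_needs_cpu(), base->active_timers and the time_before_eq()
check, and the model assumes the RT case with the base->lock trylock
succeeding.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the check before the fix. Returns true if the
 * timer softirq would be raised on this tick.
 */
static bool raises_softirq_old(bool irq_work_pending, bool active_timers,
			       bool next_timer_expired)
{
	if (!active_timers) {
		if (!irq_work_pending)
			return false;	/* the "goto out" path */
	}
	/* irq_work is not looked at again past this point */
	return next_timer_expired;
}

/* Hypothetical model with the irq_work check done first. */
static bool raises_softirq_new(bool irq_work_pending, bool active_timers,
			       bool next_timer_expired)
{
	if (irq_work_pending)
		return true;
	if (!active_timers)
		return false;
	return next_timer_expired;
}

typedef bool (*raise_fn)(bool, bool, bool);

/* Count input combinations where irq_work is pending but no softirq is raised. */
static int count_missed_irq_work(raise_fn fn)
{
	int missed = 0;

	for (int active = 0; active <= 1; active++)
		for (int expired = 0; expired <= 1; expired++)
			if (!fn(true, active, expired))
				missed++;
	return missed;
}
```

In this model the old check drops irq_work whenever the next timer has
not expired, and the reordered check never does. With no irq_work
pending, both versions give the same answer.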


The real fix is the following:

timer/rt: Always raise the softirq if there's irq_work to be done

It was previously discovered that some systems would hang on boot up
with a previous version of 3.12-rt. This was due to RCU using irq_work,
and RT defers the irq_work to a softirq. But if there are no active
timers, the softirq will not be raised, and RCU work will not get done,
causing the system to hang. The fix was to check whether there was
irq_work to be done when there were no active timers, and if so, to
raise the softirq.

But this fix was not 100% correct. It left out the case where there are
active timers that have not expired yet. In that case the softirq is
not raised even if there is irq_work to be done.

If there is irq_work to be done, then we must raise the timer softirq
regardless of whether there are active timers or whether they have
expired. The softirq can handle those cases. But we can never ignore
irq_work.

As it is only PREEMPT_RT_FULL that requires irq_work to be done in the
softirq, we can pull the check out of the active_timers condition, and
make the code a bit cleaner by having the irq_work check separate, and
put it in with the other #ifdef CONFIG_PREEMPT_RT_FULL code. If there is
irq_work to be done, there's no need to check the active timers or if
they are expired. Just raise the timer softirq and be done with it.
Otherwise, we can do the timer checks just like we do with non -rt.

Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx>

diff --git a/kernel/timer.c b/kernel/timer.c
index 106968f..426d114 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1461,18 +1461,20 @@ void run_local_timers(void)
 	 * the timer softirq.
 	 */
 #ifdef CONFIG_PREEMPT_RT_FULL
+	/* On RT, irq work runs from softirq */
+	if (irq_work_needs_cpu()) {
+		raise_softirq(TIMER_SOFTIRQ);
+		return;
+	}
+
 	if (!spin_do_trylock(&base->lock)) {
 		raise_softirq(TIMER_SOFTIRQ);
 		return;
 	}
 #endif
-	if (!base->active_timers) {
-#ifdef CONFIG_PREEMPT_RT_FULL
-		/* On RT, irq work runs from softirq */
-		if (!irq_work_needs_cpu())
-#endif
-			goto out;
-	}
+
+	if (!base->active_timers)
+		goto out;
 
 	/* Check whether the next pending timer has expired */
 	if (time_before_eq(base->next_timer, jiffies))