Re: Timer refuses to expire

From: Paul E. McKenney
Date: Tue Dec 05 2017 - 18:37:52 EST


On Mon, Dec 04, 2017 at 09:42:08AM -0800, Paul E. McKenney wrote:
> On Fri, Dec 01, 2017 at 10:25:29AM -0800, Paul E. McKenney wrote:
> > Hello, Anna-Maria,
> >
> > It turned out that one set of problems was due to NO_HZ_FULL issues,
> > which are addressed with a pair of patches. However, there is still one
> > case of a timer refusing to expire in rcutorture's TREE01 test scenario.
> > The takes from two to four hours to trigger, and the most recent one
> > gives the following diagnostic (see patch at the very end of this email):
> >
> > [13166.127458] schedule_timeout: Waylayed timer base->clk: 0x100c40004 jiffies: 0x100c4524e base->next_expiry: 0x100c40004 timer->flags: 0x1000003 timer->expires 0x100c40003 idx: 4 idx_now: ea base->pending_map 000000000000001000000000001080000000000000040000000002000000000000000000000000000000000000040000000000000000000000000000000000000000000000000000
> >
> > The message appears any time a timer for less than five jiffies takes
> > more than eight seconds to expire. In all cases, this expiry did not
> > happen naturally, but rather via an unsolicited wakeup from the RCU CPU
> > stall code. If I am interpreting this message correctly, base->clk
> > has advanced past this timer, and the timer is correctly flagged in
> > base->pending_map. This seems like part of the problem because the
> > timer's timeout was only three jiffies. However, base->clk has not
> > advanced for more than 20 seconds, which seems like another problem.
> >
> > What additional diagnostics would be helpful? I can capture data
> > at the beginning of the schedule_timeout in the timer_list structure,
> > and of course can print more information at the time of the wakeup.
>
> And on the off-chance that the following messages from this weekend's
> runs are at all helpful. One interesting point is that starting at
> time 4731.872311, there are repeated enqueues to CPU 5's timer wheel,
> but CPU 5's ->clk does not advance. Things seem to advance at
> time 4891.721190.

Another layer on the onion... For at least some of the failures, there
is a stalled CPU-hotplug operation. This of course can mean that the
timers are stuck on the partially offlined CPU. So I dumped the stack
of the task that is taking the CPU offline. Please see below for console
output and patch.

I am considering doing an unsolicited wakeup of the task doing the
hotplug operation, but I am not convinced that the entirely of the CPU
hotplug code is willing to put up with that sort of thing.

Thanx, Paul

------------------------------------------------------------------------

[ 1035.692679] schedule_timeout: Waylayed timer base->clk: 0x1000ae76c jiffies: 0x1000b39cd base->next_expiry: 0x1000ae76c timer->flags: 0xb000007 timer->expires 0x1000ae76b idx: 2c idx_now: dd base->pending_map 000010000000000002000800000000000002000000000000000000000020092000000000000000000000000080000000000000000000000000000000000000000000000000000000
[ 1035.692689] Torture onoff task state:
[ 1035.692691] torture_onoff D14584 837 2 0x80000000
[ 1035.692734] Call Trace:
[ 1035.692742] ? __schedule+0x33c/0x6f0
[ 1035.692745] ? _raw_spin_lock_irq+0xf/0x30
[ 1035.692748] schedule+0x37/0x90
[ 1035.692750] schedule_timeout+0x20d/0x490
[ 1035.692755] ? preempt_count_add+0x51/0x90
[ 1035.692758] wait_for_completion+0x95/0x100
[ 1035.692763] ? wake_up_q+0x60/0x60
[ 1035.692783] cpuhp_kick_ap+0x29/0x70
[ 1035.692799] cpuhp_kick_ap_work+0x34/0x130
[ 1035.692807] _cpu_down+0x17e/0x1f0
[ 1035.692811] do_cpu_down+0x30/0x50
[ 1035.692834] torture_offline+0x76/0x140
[ 1035.692851] torture_onoff+0xef/0x1c0
[ 1035.692854] kthread+0xf0/0x130
[ 1035.692857] ? torture_kthread_stopping+0x70/0x70

------------------------------------------------------------------------

commit 8069a4a1a0634055c43b50a37cb08237eea97dd5
Author: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
Date: Tue Dec 5 13:59:06 2017 -0800

EXP: timer: Dump torture-test online/offline task state

Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>

diff --git a/include/linux/torture.h b/include/linux/torture.h
index 66272862070b..ac62d21c9688 100644
--- a/include/linux/torture.h
+++ b/include/linux/torture.h
@@ -102,4 +102,6 @@ void _torture_stop_kthread(char *m, struct task_struct **tp);
#define torture_preempt_schedule()
#endif

+void schedule_timeout_set_task2dump(struct task_struct *t);
+
#endif /* __LINUX_TORTURE_H */
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 23af27461d8c..0a67df4cdeb6 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1724,6 +1724,16 @@ static void process_timeout(struct timer_list *t)
wake_up_process(timeout->task);
}

+static struct task_struct *schedule_timeout_task2dump;
+static DEFINE_MUTEX(schedule_timeout_task2dump_mutex);
+void schedule_timeout_set_task2dump(struct task_struct *t)
+{
+ mutex_lock(&schedule_timeout_task2dump_mutex);
+ WRITE_ONCE(schedule_timeout_task2dump, t);
+ mutex_unlock(&schedule_timeout_task2dump_mutex);
+}
+EXPORT_SYMBOL_GPL(schedule_timeout_set_task2dump);
+
/**
* schedule_timeout - sleep until timeout
* @timeout: timeout value in jiffies
@@ -1808,6 +1818,14 @@ signed long __sched schedule_timeout(signed long timeout)
for (i = 0; i < WHEEL_SIZE / sizeof(base->pending_map[0]) / 8; i++)
pr_cont("%016lx", base->pending_map[i]);
pr_cont("\n");
+ if (READ_ONCE(schedule_timeout_task2dump)) {
+ mutex_lock(&schedule_timeout_task2dump_mutex);
+ if (schedule_timeout_task2dump) {
+ pr_info("Torture onoff task state:\n");
+ sched_show_task(schedule_timeout_task2dump);
+ }
+ mutex_unlock(&schedule_timeout_task2dump_mutex);
+ }
}
del_singleshot_timer_sync(&timer.timer);

diff --git a/kernel/torture.c b/kernel/torture.c
index 37b94012a3f8..48d5f03c644a 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -238,6 +238,7 @@ int torture_onoff_init(long ooholdoff, long oointerval)
if (onoff_interval <= 0)
return 0;
ret = torture_create_kthread(torture_onoff, NULL, onoff_task);
+ schedule_timeout_set_task2dump(onoff_task);
#endif /* #ifdef CONFIG_HOTPLUG_CPU */
return ret;
}
@@ -251,6 +252,7 @@ static void torture_onoff_cleanup(void)
#ifdef CONFIG_HOTPLUG_CPU
if (onoff_task == NULL)
return;
+ schedule_timeout_set_task2dump(NULL);
VERBOSE_TOROUT_STRING("Stopping torture_onoff task");
kthread_stop(onoff_task);
onoff_task = NULL;