[PATCH] tick/broadcast-hrtimer: Prevent the timer device on broadcast duty CPU from being disabled

From: Yu Liao
Date: Tue Dec 19 2023 - 03:29:39 EST


It was found that running the LTP hotplug stress test on a aarch64
system could produce rcu_sched stall warnings.

The issue is the following:

CPU1 (owns the broadcast hrtimer) CPU2

tick_broadcast_enter()
//shut down local timer device
...
tick_broadcast_exit()
//exits with tick_broadcast_force_mask set,
timer device remains disabled

initiates offlining of CPU1
take_cpu_down()
//CPU1 shuts down and does
not send broadcast IPI anymore
takedown_cpu()
hotplug_cpu__broadcast_tick_pull()
//move broadcast hrtimer to this CPU
clockevents_program_event()
bc_set_next()
hrtimer_start()
//does not call hrtimer_reprogram()
to program timer device if expires
equals dev->next_event, so the timer
device remains disabled.

CPU2 takes over the broadcast duty but local timer device is disabled,
causing many CPUs to become stuck.

Fix this by calling tick_program_event() to reprogram the local timer
device in this scenario.

Signed-off-by: Yu Liao <liaoyu15@xxxxxxxxxx>
---
kernel/time/tick-broadcast-hrtimer.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/time/tick-broadcast-hrtimer.c b/kernel/time/tick-broadcast-hrtimer.c
index e28f9210f8a1..6a4a612581fb 100644
--- a/kernel/time/tick-broadcast-hrtimer.c
+++ b/kernel/time/tick-broadcast-hrtimer.c
@@ -42,10 +42,22 @@ static int bc_shutdown(struct clock_event_device *evt)
*/
static int bc_set_next(ktime_t expires, struct clock_event_device *bc)
{
+ ktime_t next_event = this_cpu_ptr(&tick_cpu_device)->evtdev->next_event;
+
/*
- * This is called either from enter/exit idle code or from the
- * broadcast handler. In all cases tick_broadcast_lock is held.
- *
+ * This can be called from CPU offline operation to move broadcast
+ * assignment. If tick_broadcast_force_mask is set, the CPU local
+ * timer device may be disabled. And hrtimer_reprogram() will not
+ * called if the timer is not the first expiring timer. Reprogram
+ * the cpu local timer device to ensure we can take over the
+ * broadcast duty.
+ */
+ if (tick_check_broadcast_expired() && expires >= next_event)
+ tick_program_event(next_event, 1);
+
+ /*
+ * This is called from enter/exit idle code, broadcast handler or
+ * CPU offline operation. In all cases tick_broadcast_lock is held.
* hrtimer_cancel() cannot be called here neither from the
* broadcast handler nor from the enter/exit idle code. The idle
* code can run into the problem described in bc_shutdown() and the
--
2.33.0