Re: [PATCH] tick/common: Align tick period during sched_timer setup.

From: Luiz Capitulino
Date: Thu Jun 15 2023 - 10:15:44 EST




On 2023-06-15 05:18, Sebastian Andrzej Siewior wrote:




From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

The tick period is aligned very early while the first clock_event_device
is registered. The system runs in periodic mode and switches later to
one-shot mode if possible.

The next wake-up event is programmed based on aligned value
(tick_next_period) but the delta value, that is used to program the
clock_event_device, is computed based on ktime_get().

With the subtracted offset, the devices fires in less than the exacted
time frame. With a large enough offset the system programs the timer for
the next wake-up and the remaining time left is too little to make any
boot progress. The system hangs.

Move the alignment later to the setup of tick_sched timer. At this point
the system switches to oneshot mode and a highres clocksource is
available. It safe to update tick_next_period ktime_get() will now
return accurate (not jiffies based) time.

[bigeasy: Patch description + testing].

Reported-by: Mathias Krause <minipli@xxxxxxxxxxxxxx>
Reported-by: "Bhatnagar, Rishabh" <risbhat@xxxxxxxxxx>
Fixes: e9523a0d81899 ("tick/common: Align tick period with the HZ tick.")
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Link: https://lore.kernel.org/5a56290d-806e-b9a5-f37c-f21958b5a8c0@xxxxxxxxxxxxxx
Link: https://lore.kernel.org/12c6f9a3-d087-b824-0d05-0d18c9bc1bf3@xxxxxxxxxx


I've tested this against 5.10.184 (which is where it reproduces quick
for me):

Tested-by: Luiz Capitulino <luizcap@xxxxxxxxxx>

---
kernel/time/tick-common.c | 11 +----------
kernel/time/tick-sched.c | 13 ++++++++++++-
2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 65b8658da829e..b85f2f9c32426 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -218,19 +218,10 @@ static void tick_setup_device(struct tick_device *td,
* this cpu:
*/
if (tick_do_timer_cpu == TICK_DO_TIMER_BOOT) {
- ktime_t next_p;
- u32 rem;

tick_do_timer_cpu = cpu;

- next_p = ktime_get();
- div_u64_rem(next_p, TICK_NSEC, &rem);
- if (rem) {
- next_p -= rem;
- next_p += TICK_NSEC;
- }
-
- tick_next_period = next_p;
+ tick_next_period = ktime_get();
#ifdef CONFIG_NO_HZ_FULL
/*
* The boot CPU may be nohz_full, in which case set
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 52254679ec489..42c0be3080bde 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -161,8 +161,19 @@ static ktime_t tick_init_jiffy_update(void)
raw_spin_lock(&jiffies_lock);
write_seqcount_begin(&jiffies_seq);
/* Did we start the jiffies update yet ? */
- if (last_jiffies_update == 0)
+ if (last_jiffies_update == 0) {
+ u32 rem;
+
+ /*
+ * Ensure that the tick is aligned to a multiple of
+ * TICK_NSEC.
+ */
+ div_u64_rem(tick_next_period, TICK_NSEC, &rem);
+ if (rem)
+ tick_next_period += TICK_NSEC - rem;
+
last_jiffies_update = tick_next_period;
+ }
period = last_jiffies_update;
write_seqcount_end(&jiffies_seq);
raw_spin_unlock(&jiffies_lock);
--
2.40.1