Re: [PATCH 0/5 v2] Fixes for leapsecond expiring early ABS_TIME CLOCK_REALTIME timers

From: John Stultz
Date: Mon Jun 15 2015 - 14:55:24 EST


On Mon, Jun 15, 2015 at 6:46 AM, Prarit Bhargava <prarit@xxxxxxxxxx> wrote:
> On 06/15/2015 09:10 AM, Prarit Bhargava wrote:
>> I'm testing this on a few more large CPU count systems. So far, so good ...
>>
>
> John, unfortunately I'm seeing failures across a lot of the large systems
> dev/early-leap-timer as of this AM
>
[snip]
> [root@intel-chiefriver-04 ~]# ./leap-a-day -s
> Setting time to speed up testing
> This runs continuously. Press ctrl-c to stop
>
> Setting time to Mon Jun 15 19:59:50 2015
> Scheduling leap second for Mon Jun 15 20:00:00 2015
> Setting timer for Mon Jun 15 20:00:00 2015
> Mon Jun 15 19:59:57 2015 + 18177 us (0) TIME_INS
> Mon Jun 15 19:59:57 2015 + 526181 us (0) TIME_INS
> Mon Jun 15 19:59:58 2015 + 37418 us (0) TIME_INS
> Mon Jun 15 19:59:58 2015 + 546474 us (0) TIME_INS
> Mon Jun 15 19:59:59 2015 + 55724 us (0) TIME_INS
> Mon Jun 15 19:59:59 2015 + 566970 us (0) TIME_INS
> Mon Jun 15 19:59:59 2015 + 6013 us (1) TIME_OOP - TIMER FIRED
> Error: Early timer expiration!
> Error: Incorrect NTP state?
> Mon Jun 15 19:59:59 2015 + 6040 us (1) TIME_OOP
> Mon Jun 15 19:59:59 2015 + 517240 us (1) TIME_OOP
> Mon Jun 15 20:00:00 2015 + 28504 us (1) TIME_WAIT
> Mon Jun 15 20:00:00 2015 + 539745 us (1) TIME_WAIT
> Mon Jun 15 20:00:01 2015 + 50978 us (1) TIME_WAIT
> Mon Jun 15 20:00:01 2015 + 560360 us (1) TIME_WAIT
> Mon Jun 15 20:00:02 2015 + 69952 us (1) TIME_WAIT

Hrm.. I'm unable so far to reproduce this.

Attached is a simple trace debug patch, would you mind running the
test, and when the failure occurs send me the output from
cat /sys/kernel/debug/tracing/trace (along with the test output for
the failure).

(Do make sure you have tracing enabled in your kernel config)

thanks
-john
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 5d67ffb..b613a27 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -545,9 +545,13 @@ EXPORT_SYMBOL_GPL(pvclock_gtod_unregister_notifier);
static inline void tk_update_leap_state(struct timekeeper *tk)
{
tk->next_leap_ktime = ntp_get_next_leap();
- if (tk->next_leap_ktime.tv64 != KTIME_MAX)
+ if (tk->next_leap_ktime.tv64 != KTIME_MAX) {
/* Convert to monotonic time */
+ trace_printk("Setting leap %lld (from %lld - %lld)\n",
+ ktime_sub(tk->next_leap_ktime, tk->offs_real).tv64,
+ tk->next_leap_ktime.tv64, tk->offs_real.tv64);
tk->next_leap_ktime = ktime_sub(tk->next_leap_ktime, tk->offs_real);
+ }
}

/*
@@ -1978,8 +1982,10 @@ ktime_t ktime_get_update_offsets_now(unsigned int *cwsseq, ktime_t *offs_real,
}

/* Handle leapsecond insertion adjustments */
- if (unlikely(base.tv64 >= tk->next_leap_ktime.tv64))
+ if (unlikely(base.tv64 >= tk->next_leap_ktime.tv64)) {
+ trace_printk("Inserting leap %lld >= %lld\n", base.tv64, tk->next_leap_ktime.tv64);
*offs_real = ktime_sub(tk->offs_real, ktime_set(1, 0));
+ }

} while (read_seqcount_retry(&tk_core.seq, seq));