[RFC v3 0/3] adapt clockevents frequencies to mono clock

From: Nicolai Stange
Date: Wed Jul 13 2016 - 09:01:28 EST


This series splits off the debatable, non-x86-related parts
of the "avoid double timer interrupt with nohz and Intel TSC" v2 series
(http://lkml.kernel.org/g/20160710193047.18320-1-nicstange@xxxxxxxxx).

The goal is to make the clockevents core take the dynamic frequency
adjustments of the monotonic clock into account.

My first attempt, [4/4] ("kernel/time/clockevents: compensate for
monotonic clock's dynamic frequency"), raised concerns with regard to
performance (way too much "math in the CE programming path"). Thomas
Gleixner asked me to provide an initial patch doing the necessary
adjustments on the clockevents devices' frequencies instead.

Here it is.
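
To illustrate what "adjusting a clockevent device's frequency" amounts to,
here is a minimal userspace sketch. It is not the code from the patches:
the helper name ce_mult_mono() and its parameters are hypothetical, and it
assumes that the clocksource's raw and mono ->mult values share the same
->shift. Under that assumption, a duration measured in mono nanoseconds
corresponds to (cs_mult_raw / cs_mult_mono) times as many raw nanoseconds,
so the device's ns-to-cycles factor can be scaled by that ratio once,
outside of the programming path.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * ce_mult:      the clockevent device's ns -> cycles factor (raw time)
 * cs_mult_raw:  the clocksource's unadjusted cycles -> ns factor
 * cs_mult_mono: the clocksource's dynamically adjusted cycles -> ns factor
 *
 * Returns ce_mult scaled so that it can be applied to mono nanoseconds.
 */
static uint32_t ce_mult_mono(uint32_t ce_mult, uint32_t cs_mult_raw,
			     uint32_t cs_mult_mono)
{
	uint64_t tmp;

	/* Nothing sensible to scale by; fall back to the raw factor. */
	if (!cs_mult_mono)
		return ce_mult;

	tmp = (uint64_t)ce_mult * cs_mult_raw;
	tmp += cs_mult_mono / 2;	/* round to nearest */
	tmp /= cs_mult_mono;

	/* Saturate rather than truncate if the result does not fit. */
	return tmp > UINT32_MAX ? UINT32_MAX : (uint32_t)tmp;
}
```

With equal raw and mono factors the result is ce_mult itself. If the mono
clock runs faster than the raw one (cs_mult_mono > cs_mult_raw), the result
is smaller than ce_mult, i.e. fewer device cycles per mono nanosecond.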

Known issues:
- The way the export of the mono and raw clock's ->mult from timekeeping
to clockevents is done is ugly. I'd rather make tk_core non-static. But
for this POC, it's fine.

- The patchset assumes that a clockevent device's ->mult is changed after
registration only through calls to clockevents_update_freq().
For a handful of non-x86 drivers this isn't the case.

- ->min_delta_ns and ->max_delta_ns vs ->mult_mono:
In clockevents_program_event(), we had
    delta = min(delta, (int64_t) dev->max_delta_ns);
    delta = max(delta, (int64_t) dev->min_delta_ns);
    clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
The dev->mult is replaced with the dynamically adjusted dev->mult_mono
by this series. That's problematic since, as I understand it,
->max_delta_ns in particular is a hard limit preventing the clockevent
device's counter from being programmed with values larger than its width
allows for.
If ->mult_mono happens to be only slightly larger than ->mult, the
comparison of delta against the ->mult based ->max_delta_ns can pass
although the final clc might actually be larger than allowed.
I think what we really want to have at this place is a check of clc
against the already present ->min_delta_cycles and ->max_delta_cycles.

The problem with this approach is that many drivers (~40) initialize
->min_delta_ns and ->max_delta_ns (typically with clockevent_delta2ns())
but not the ->*_delta_cycles members. My suggestion at this point would
be to convert them. This makes ->max_delta_ns obsolete right away.

->min_delta_ns is still needed in order to set the ->next_event in
clockevents_program_min_delta() though. My claim is that
->min_delta_ns can be safely replaced with 0 in
clockevents_program_min_delta(), i.e. that
    dev->next_event = ktime_add_ns(ktime_get(), delta);
can be replaced with
    dev->next_event = ktime_get();
Reasoning:
1. ->next_event is consumed only from __clockevents_update_freq():
    clockevents_program_event(dev, dev->next_event, false);
2. Whichever of the two values clockevents_program_min_delta() stores in
dev->next_event, the above clockevents_program_event() will hit the
min limit and thus reprogram with the min delta again.

Another place where ->min_delta_ns is used is
clockevents_increase_min_delta(). But this one should be invoked rarely
and it would probably be OK to derive the needed value from
->min_delta_cycles by means of the math-intensive clockevent_delta2ns().

Summarizing: if all clockevent drivers are converted to set
->min_delta_cycles and ->max_delta_cycles, then it might be possible to
get rid of ->min_delta_ns and ->max_delta_ns.
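
The ->max_delta_ns problem described above can be reproduced with a small
userspace sketch. Everything in it is a simplified stand-in: struct
ce_sketch and both helpers are hypothetical names, not kernel code, and
unlike clockevents_program_event() they do not reject non-positive deltas.
The first helper clamps in nanoseconds and then converts with ->mult_mono,
as this series currently does; the second converts first and then clamps
against the ->*_delta_cycles limits, as suggested.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct clock_event_device. */
struct ce_sketch {
	uint32_t mult;			/* ns -> cycles, raw frequency */
	uint32_t mult_mono;		/* ns -> cycles, mono-adjusted */
	uint32_t shift;
	uint64_t min_delta_ns;		/* derived from ->mult */
	uint64_t max_delta_ns;		/* derived from ->mult */
	uint64_t min_delta_cycles;	/* hardware limit */
	uint64_t max_delta_cycles;	/* hardware limit */
};

/* Clamp in ns against the ->mult based limits, then convert. */
static uint64_t clc_clamp_ns(const struct ce_sketch *dev, int64_t delta)
{
	if (delta > (int64_t)dev->max_delta_ns)
		delta = (int64_t)dev->max_delta_ns;
	if (delta < (int64_t)dev->min_delta_ns)
		delta = (int64_t)dev->min_delta_ns;

	return ((uint64_t)delta * dev->mult_mono) >> dev->shift;
}

/* Convert first, then clamp the result in cycles. */
static uint64_t clc_clamp_cycles(const struct ce_sketch *dev, int64_t delta)
{
	uint64_t clc;

	if (delta < 0)
		delta = 0;

	/*
	 * Assumes the product fits into 64 bits; real code would have
	 * to bound delta before multiplying.
	 */
	clc = ((uint64_t)delta * dev->mult_mono) >> dev->shift;

	if (clc > dev->max_delta_cycles)
		clc = dev->max_delta_cycles;
	if (clc < dev->min_delta_cycles)
		clc = dev->min_delta_cycles;

	return clc;
}
```

For example, with mult = 1024, mult_mono = 1025, shift = 10 and both max
limits set to 1000000, a requested delta of 2000000 ns passes the
ns-based clamp as 1000000 ns but is converted to 1000976 cycles, which
exceeds ->max_delta_cycles. The cycles-based clamp returns 1000000.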

Applicable to linux-next 20160712. The patches depend on each other in the
order given.

Nicolai Stange (3):
  kernel/time/clockevents: initial support for mono to raw time
    conversion
  kernel/time/clockevents: make setting of ->mult and ->mult_mono atomic
  kernel/time/timekeeping: inform clockevents about freq adjustments

include/linux/clockchips.h | 1 +
kernel/time/clockevents.c | 81 +++++++++++++++++++++++++++++++++++++++++---
kernel/time/tick-broadcast.c | 5 +--
kernel/time/tick-internal.h | 2 ++
kernel/time/timekeeping.c | 11 ++++++
5 files changed, 94 insertions(+), 6 deletions(-)

--
2.9.0