Re: [PATCH] [RFC] timerfd: add TFD_NOTIFY_CLOCK_SET to watch for clock changes

From: Jamie Lokier
Date: Wed Dec 01 2010 - 05:44:37 EST


Lennart Poettering wrote:
> On Tue, 23.11.10 19:22, Alexander Shishkin (virtuoso@xxxxxxxxx) wrote:
>
> > Certain userspace applications (like "clock" desktop applets or cron or
> > systemd) might want to be notified when some other application changes
> > the system time. There are several reasons for this that I know of:
> > - avoiding periodic wakeups to poll time changes;
> > - rearming CLOCK_REALTIME timers when said changes happen;
> > - changing system timekeeping policy for system-wide time management
> > programs;
> > - keeping guest applications/operating systems running in emulators
> > up to date.
> >
> > This is another attempt to approach notifying userspace about system
> > clock changes. The other one is using an eventfd and a syscall [1]. In
> > the course of discussing the necessity of a syscall for this kind of
> > notification, it was suggested that this functionality can be achieved
> > via timers [2] (and timerfd in particular [3]). This idea got quite
> > some support [4], [5], [6] and some vague criticism [7], so I decided
> > to try and go a bit further with it.
>
> I agree with Kay, this is pretty much exactly what we want for
> systemd. (Assuming that the time jump due to system suspend is
> propagated to userspace like any other time jump with this path).

I hope the time jump due to suspend is *not* propagated in the same
way to userspace :-)

What I'd like to see:

1. Time jump due to the system clock being stepped: Notification.

   This is *not* a change in real time. It means the clock was
   corrected/changed. No physical time passed.

2. Time jump due to suspend/resume: Different notification.

   This *is* a change in real time. Physical time passed.

3. Time drift corrections: As now, no notification, it's just
   the clock being regulated.

To signal the difference between 1 and 2, there ought to be some way
for userspace to determine how much of the clock delta corresponds
with physical time, by reading some sort of "monotonic" clock :-)
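
As a rough userspace sketch of that idea (not part of the patch): take
paired readings of CLOCK_REALTIME and CLOCK_MONOTONIC, and after a
notification subtract the monotonic delta from the wall-clock delta.
The struct and function names below are hypothetical, made up for
illustration. Only clock_gettime() is a real interface here.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Paired reading of the wall clock and the monotonic clock, in ns.
 * (Hypothetical helper type, not an existing kernel or libc API.) */
struct clock_sample {
    int64_t realtime_ns;
    int64_t monotonic_ns;
};

static int64_t timespec_to_ns(struct timespec ts)
{
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Read both clocks back to back. Returns 0 on success, -1 on failure. */
static int clock_sample_take(struct clock_sample *s)
{
    struct timespec rt, mono;

    if (clock_gettime(CLOCK_REALTIME, &rt) != 0 ||
        clock_gettime(CLOCK_MONOTONIC, &mono) != 0)
        return -1;

    s->realtime_ns = timespec_to_ns(rt);
    s->monotonic_ns = timespec_to_ns(mono);
    return 0;
}

/* Part of the wall-clock delta that is NOT backed by time elapsed on
 * the monotonic clock, i.e. how far CLOCK_REALTIME jumped relative to
 * it. Positive means a forward jump, negative a backward one. */
static int64_t clock_step_ns(const struct clock_sample *before,
                             const struct clock_sample *after)
{
    int64_t wall_delta = after->realtime_ns - before->realtime_ns;
    int64_t mono_delta = after->monotonic_ns - before->monotonic_ns;

    return wall_delta - mono_delta;
}
```

A program would call clock_sample_take() once at startup and again
when it learns of a clock change, then pass both samples to
clock_step_ns(). The two clocks are not read atomically, so the result
carries a small error. The result is only as good as the monotonic
clock underneath it, which is the problem described next.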

CLOCK_MONOTONIC is unsuitable because it stops at suspend. Maybe it
should stay that way. But maybe not - programs using CLOCK_MONOTONIC
usually want to trigger timeouts etc. based on real elapsed time, and
after suspend/resume, it's quite reasonable to want to trigger all of
a program's short timeouts immediately. Indeed, some userspace
network protocol implementations may currently behave *incorrectly*
across suspend/resume, especially those using clock times to validate
their caches, *because* CLOCK_MONOTONIC doesn't count the time spent
suspended.
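
To illustrate the cache case, here is a minimal sketch of an expiry
check against CLOCK_MONOTONIC timestamps. The struct and function are
hypothetical, not taken from any real program.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache entry stamped with a CLOCK_MONOTONIC reading. */
struct cache_entry {
    int64_t stored_mono_ns; /* monotonic time when the entry was stored */
    int64_t ttl_ns;         /* intended lifetime of the entry */
};

/* True once at least ttl_ns has elapsed on the monotonic clock. */
static bool cache_entry_expired(const struct cache_entry *e,
                                int64_t now_mono_ns)
{
    return now_mono_ns - e->stored_mono_ns >= e->ttl_ns;
}
```

If an entry with a 30 second lifetime is stored, the machine runs for
5 seconds, suspends for an hour and resumes, the monotonic clock has
advanced by only 5 seconds, so the check still reports the entry as
fresh even though it is an hour old.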

So maybe CLOCK_MONOTONIC should be changed to include elapsed time
during suspend/resume, and CLOCK_MONOTONIC_RAW could remain as it is,
for programs that want that?

That, plus this proposed patch, would signal the difference between 1
and 2 above nicely.

-- Jamie