Re: [patch 54/55] timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC[_RAW]

From: Peter Zijlstra
Date: Mon Jul 14 2014 - 04:38:13 EST


On Fri, Jul 11, 2014 at 01:45:19PM -0000, Thomas Gleixner wrote:
> Tracers want a correlated time between the kernel instrumentation and
> user space. We really do not want to export sched_clock() to user
> space, so we need to provide something sensible for this.
>
> Using separate data structures with a non-blocking sequence count
> based update mechanism allows us to do that. The data structure
> required for the readout has a sequence counter and two copies of the
> timekeeping data.
>
> On the update side:
>
> tkf->seq++;
> smp_wmb();
> update(tkf->base[0], tk);
> tkf->seq++;
> smp_wmb();
> update(tkf->base[1], tk);
>
> On the reader side:
>
> do {
> seq = tkf->seq;
> smp_rmb();
> idx = seq & 0x01;
> now = now(tkf->base[idx]);
> smp_rmb();
> } while (seq != tkf->seq);
>
> So if NMI hits the update of base[0] it will use base[1] which is
> still consistent. In case of CLOCK_MONOTONIC this can result in
> slightly wrong timestamps (a few nanoseconds) across an update. Not a
> big issue for the intended use case.

But it breaks monotonicity, right? ;-)
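
To make the concern concrete, here is a rough single-threaded sketch of one
way it could happen, assuming the update lowers the clock frequency (mult).
Everything in it is hypothetical and simplified: struct tk_base, struct
tk_fast, base_to_ns() and tk_fast_read() are stand-ins for illustration and
not the code from the patch, the barriers are only noted in comments, and the
mult change is hugely exaggerated so the effect is visible in small numbers.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-copy timekeeping data. */
struct tk_base {
	uint64_t cycle_last;	/* counter value at the last update */
	uint64_t base_ns;	/* nanoseconds at cycle_last */
	uint32_t mult;		/* cycles -> ns multiplier */
	uint32_t shift;		/* cycles -> ns shift */
};

/* Sequence counter plus two copies, as in the quoted description. */
struct tk_fast {
	unsigned int seq;
	struct tk_base base[2];
};

static uint64_t base_to_ns(const struct tk_base *b, uint64_t cycles)
{
	return b->base_ns + (((cycles - b->cycle_last) * b->mult) >> b->shift);
}

/* Reader side; 'cycles' stands in for reading the clocksource. */
static uint64_t tk_fast_read(const struct tk_fast *tkf, uint64_t cycles)
{
	unsigned int seq;
	uint64_t now;

	do {
		seq = tkf->seq;
		/* smp_rmb() would go here */
		now = base_to_ns(&tkf->base[seq & 0x01], cycles);
		/* smp_rmb() would go here */
	} while (seq != tkf->seq);

	return now;
}

/*
 * Returns 1 if a read taken in the middle of an update (the "NMI") yields
 * a later time than a read taken one cycle after the update completed.
 */
static int nmi_sees_time_go_backwards(void)
{
	struct tk_base old = { .cycle_last = 0, .base_ns = 0,
			       .mult = 1100, .shift = 10 };
	struct tk_base upd = { .cycle_last = 1000, .base_ns = 0,
			       .mult = 1000, .shift = 10 };
	struct tk_fast tkf = { .seq = 0 };
	uint64_t t_nmi, t_after;

	tkf.base[0] = old;
	tkf.base[1] = old;

	/* The new base continues from where the old one was at cycle 1000. */
	upd.base_ns = base_to_ns(&old, upd.cycle_last);

	tkf.seq++;				/* readers now pick base[1] */
	tkf.base[0] = upd;
	t_nmi = tk_fast_read(&tkf, 2000);	/* NMI here: stale base[1] */
	tkf.seq++;				/* readers now pick base[0] */
	tkf.base[1] = upd;
	t_after = tk_fast_read(&tkf, 2001);	/* later read, updated data */

	return t_after < t_nmi;
}
```

With these made-up numbers the NMI read extrapolates from the stale base[1]
using the old, larger mult, so it lands ahead of the later read that uses
the updated data. A realistic frequency adjustment would be far smaller, so
the step would be on the order of the few nanoseconds mentioned above.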

Also, what happens when TSC is not available as a clocksource? There's
still a metric ton of hardware (including the latest generation HSW)
that has fucked firmware/TSC.

