On 4/1/13 12:29 PM, John Stultz wrote:
Any chance a decision can be reached in time for 3.10? Seems like the
simplest option is the perf event based ioctl.
I'm still not sold on the CLOCK_PERF posix clock. The semantics are
still too hand-wavy and implementation specific.
While I'd prefer perf to export some existing semi-sane time domain
(using interpolation if necessary), I realize the hardware constraints
and performance optimizations make this unlikely (though I'm
disappointed I've not seen any attempt or proof point that it won't work).
Thus if we must expose this kernel detail to userland, I think we should
be careful about how publicly we expose such an interface, as it has the
potential for misuse and eventual user-land breakage.
But perf_clock timestamps are already exposed to userland. This new API -- be it a posix clock or an ioctl -- just allows retrieval of a timestamp outside of a generated event.
So while having a perf specific ioctl is still exposing what I expect
will be non-static kernel internal behavior to userland, it at least
exposes it in a less generic fashion, which is preferable to me.
The next point of conflict is likely whether the ioctl method will be
sufficient given performance concerns. Something I'd be interested in
hearing about from the folks pushing this. Right now it seems any method
is preferable to not having an interface - but I want to make sure
that's really true.
For example, if the ioctl interface is really too slow, it's likely folks
will end up using periodic perf ioctl samples and interpolating using
normal vdso clock_gettime() timestamps.
The performance/speed depends on how often it is called. I have no idea what Stephane's use case is but for me it is to correlate perf_clock timestamps to timeofday. In my perf-based daemon that tracks process schedulings, I update the correlation every 5-10 minutes.
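
To make the correlation step concrete, here is a minimal sketch in Python. It is not the daemon's actual code. Since perf_clock cannot be read from userland without the interface under discussion, `read_perf_clock_ns` is a hypothetical stand-in that returns CLOCK_MONOTONIC instead; the other function names are likewise made up for illustration. The idea is to bracket one perf_clock read between two realtime reads, take the midpoint as the matching wall-clock time, and reuse the resulting offset until the next refresh.

```python
import time


def read_perf_clock_ns():
    # Hypothetical stand-in for a perf_clock read. CLOCK_MONOTONIC is used
    # here only so the sketch runs; it is NOT the perf clock.
    return time.clock_gettime_ns(time.CLOCK_MONOTONIC)


def sample_offset_ns(read_perf=read_perf_clock_ns):
    """Return (offset_ns, window_ns) for one correlation sample.

    offset_ns is what must be added to a perf timestamp to get realtime.
    window_ns is the width of the bracketing window, a rough bound on the
    uncertainty of this sample.
    """
    before = time.clock_gettime_ns(time.CLOCK_REALTIME)
    perf = read_perf()
    after = time.clock_gettime_ns(time.CLOCK_REALTIME)
    wall = before + (after - before) // 2  # midpoint of the bracket
    return wall - perf, after - before


def perf_to_wall_ns(perf_ts_ns, offset_ns):
    # Apply the most recent offset to a timestamp taken from a perf event.
    return perf_ts_ns + offset_ns


if __name__ == "__main__":
    offset, window = sample_offset_ns()
    print("offset_ns:", offset, "bracket_ns:", window)
```

With a refresh every 5-10 minutes the cost of a single sample hardly matters; the open question is how far the two clocks drift apart between refreshes.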
If that is acceptable, then why not invert the solution and just have
perf injecting periodic CLOCK_MONOTONIC timestamps into the log, then
have perf report fast, but less-accurate sched_clock deltas from that
CLOCK_MONOTONIC boundary.
Something similar to that approach has been discussed as well, i.e., add a realtime clock event and have it injected into the stream, e.g.,
https://lkml.org/lkml/2011/2/27/158
But there are cons to this approach -- e.g., you need that first event generated that tells you the realtime-to-perf_clock correlation, and you don't want to have to scan an unknown length of events looking for the first one to get the correlation, only to back up and process the events.
And an ioctl to generate that first event was shot down as well...
https://lkml.org/lkml/2011/3/1/174
https://lkml.org/lkml/2011/3/2/186
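
The first-event problem with the injected-timestamp approach can be sketched as follows. This is an illustration only: the event tuples are a made-up format, not the perf.data layout, and `resolve_stream` is a hypothetical helper. A "sync" event carries a perf_clock timestamp plus the reference clock value taken at the same moment; every other event carries only a perf_clock timestamp.

```python
def resolve_stream(events):
    """Convert perf timestamps to reference-clock time using sync events.

    events: iterable of (kind, perf_ns, value) tuples, where kind is
    "sync" (value is the reference clock in ns) or "sample" (value is an
    arbitrary payload). Returns (resolved, pending): resolved is a list of
    (reference_ns, payload); pending holds samples that could not be
    converted because no sync event was ever seen.
    """
    offset = None
    pending = []   # samples seen before the first sync event
    resolved = []
    for kind, perf_ns, value in events:
        if kind == "sync":
            offset = value - perf_ns
            # "Back up": samples buffered before the first sync can only
            # be converted now that a correlation exists.
            for p_ns, payload in pending:
                resolved.append((p_ns + offset, payload))
            pending.clear()
        elif offset is None:
            pending.append((perf_ns, value))
        else:
            resolved.append((perf_ns + offset, value))
    return resolved, pending


if __name__ == "__main__":
    stream = [
        ("sample", 100, "a"),
        ("sample", 150, "b"),
        ("sync", 200, 5000),
        ("sample", 260, "c"),
    ]
    print(resolve_stream(stream))
```

Everything before the first sync event has to be buffered for an unbounded stretch, which is why a way to force that first event at the start of the session keeps coming up.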