Re: [RFC PATCH 1/3] Unified trace buffer

From: Jeremy Fitzhardinge
Date: Thu Sep 25 2008 - 21:17:43 EST


Ingo Molnar wrote:
> hm, i'm not sure you've thought through this delta record idea.
>
> Take a system that goes idle thousands of times a second. (it's easy -
> just some networking workload)
>
> Now take a tracer that just wants to emit a trace entry every now and
> then. Once or twice a second or so.
>
> Note that suddenly you have thousands of totally artificial 'delta' time
> records between two real events, and have to post-process all your way
> up between these events to get to the real timeline.
>
> ... it is totally impractical and costly.
>

No, as I said: "You just need to emit the current
tsc+frequency+wallclock time before you emit any more delta records
after the frequency change."

When an event which affects the tsc occurs, like a frequency change or
pause, set a flag. When you're next about to emit a delta, check the
flag and emit new timing parameters first (or instead of the delta).
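
A minimal sketch of the flag scheme, in C. Every name here (struct rec,
struct trace_buf, tb_mark_dirty, tb_emit) is hypothetical and not taken
from the patch; the fixed-size array stands in for whatever the real
buffer is, and the caller is assumed to supply the current
tsc/frequency/wallclock values. It takes the "instead" variant: the
timing-parameter record carries an absolute tsc, so it replaces the delta.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout, for illustration only. */
enum rec_type { REC_TIMEBASE, REC_DELTA };

struct rec {
    enum rec_type type;
    uint64_t tsc;       /* absolute tsc (TIMEBASE) or tsc delta (DELTA) */
    uint64_t freq_khz;  /* TIMEBASE only */
    uint64_t gtod_ns;   /* TIMEBASE only */
};

struct trace_buf {
    struct rec recs[64];
    size_t n;
    uint64_t last_tsc;
    bool timebase_dirty;  /* start as true so the first record is a TIMEBASE */
};

/* Called from frequency-change or pause hooks: records nothing, sets a flag. */
static void tb_mark_dirty(struct trace_buf *b)
{
    b->timebase_dirty = true;
}

/* Called when a real event is traced. Returns 0 on success, -1 if full. */
static int tb_emit(struct trace_buf *b, uint64_t tsc,
                   uint64_t freq_khz, uint64_t gtod_ns)
{
    struct rec *r;

    if (b->n == sizeof(b->recs) / sizeof(b->recs[0]))
        return -1;
    r = &b->recs[b->n++];

    if (b->timebase_dirty) {
        /* Timing parameters changed since the last record: emit them whole. */
        r->type = REC_TIMEBASE;
        r->tsc = tsc;
        r->freq_khz = freq_khz;
        r->gtod_ns = gtod_ns;
        b->timebase_dirty = false;
    } else {
        /* Nothing changed, so the tsc must not have gone backwards. */
        assert(tsc >= b->last_tsc);
        r->type = REC_DELTA;
        r->tsc = tsc - b->last_tsc;
        r->freq_khz = 0;
        r->gtod_ns = 0;
    }
    b->last_tsc = tsc;
    return 0;
}
```

With this shape, thousands of idle transitions between two real events
cost thousands of flag writes but at most one extra record.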

> and then i havent even mentioned some other big complications:
>
> - the numeric errors that mount up over thousands of delta events
> - the memory overhead over thousands of entries
>
No, you only need to emit records as needed.

> - the fact that cpufreq and PLL changes are rarely atomic and that the
> TSC can flip-flop between two frequencies.
>

You need to know the frequency at the time you sample the tsc, and you
need to know when the frequency changes. If you don't, you can't use
the tsc for time, regardless of whether you process it immediately or
post-process it.

> ... and the moment you accept the fact that the GTOD _has_ to be mixed
> into it, all the rest follows pretty much automatically: either you
> store the (GTOD,freq,TSC) triple and post-process that absolute
> timestamp, and you accept the in-memory cost of that and do
> post-processing, or you compress that triple in situ and store the
> result only.
>

Right. You store (GTOD,freq,tsc) every time you need that information,
and then interpolate with the tsc while you know it's monotonic.

Unless your tsc is completely screwed, the (GTOD,freq,tsc) triple is
going to be stored at a fairly low frequency, and won't fill your event
buffer very much (though it might be a large proportion of your recorded
events if you're only recording stuff at a low frequency).
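
The interpolation step could look roughly like this. Again a sketch
under assumptions: struct timebase and tsc_to_ns are made-up names, the
frequency is assumed to be stored in kHz and the wallclock time in
nanoseconds, and the tsc is assumed to be monotonic since the triple was
recorded.

```c
#include <stdint.h>

/* Hypothetical in-memory form of one (GTOD,freq,tsc) triple. */
struct timebase {
    uint64_t gtod_ns;   /* wallclock time when the triple was recorded */
    uint64_t freq_khz;  /* tsc frequency in effect from that point on */
    uint64_t tsc;       /* tsc value when the triple was recorded */
};

/*
 * Convert a later tsc sample to wallclock nanoseconds.
 *
 * cycles / (freq_khz * 1000) seconds == cycles * 1000000 / freq_khz ns.
 * The multiply overflows 64 bits once cycles exceeds about 1.8e13, which
 * is hours at GHz rates; a real implementation would want a mul/shift
 * pair or a wider intermediate.
 */
static uint64_t tsc_to_ns(const struct timebase *tb, uint64_t tsc)
{
    uint64_t cycles = tsc - tb->tsc;

    return tb->gtod_ns + cycles * 1000000ULL / tb->freq_khz;
}
```

Post-processing then only has to remember the most recent triple and
apply it to each following tsc sample until the next triple shows up.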

> [ You will then also want some fall-back implementation for CPUs that
> have no TSCs, and for architectures that have no default
> implementation - something jiffies based

Well, whatever the best timer the platform has. And maybe it's already
processed into real time, in which case you just emit raw deltas and
never worry about updating the timing parameters.

> . And you want some special hooks for paravirt, as always. ]
>
Yeah. The scheme relies on a cpu's tsc remaining the cpu's tsc.

> I.e. you will end up having something quite close to
> cpu_clock()/sched_clock().
>
> _Or_ if you manage to get any better than that for tracing then please
> tell us about it because we want to apply those concepts to
> cpu_clock()/sched_clock() too :-)

Well, it's what Xen does already for time. It works well.

J