Re: [RFC PATCH 1/3] Unified trace buffer

From: Thomas Gleixner
Date: Fri Sep 26 2008 - 10:10:27 EST


On Thu, 25 Sep 2008, Steven Rostedt wrote:
> On Fri, 26 Sep 2008, Ingo Molnar wrote:
> >
> > Firstly they need a low-frequency (10khz-100khz) shared clock line
> > across all CPUs. A single line - and since it's low frequency it could
> > be overlaid on some existing data line and filtered out. That works
> > across NUMA nodes as well and physics allows it to be nanosec accurate
> > up to dozens of meters or so.
>
> Can this possibly be true? I mean, light travels only one foot every
> nanosecond. Can it really keep nanosecond accuracy up to dozens of meters
> away? If you send the same signal to CPU1 that is 1 foot away, as well as
> send it to CPU2 that is 2 feet away. CPU2 will get that signal at least 1
> nanosec after CPU1 receives it.
>
> Of course if the hardware is smart enough to know this topology, then it
> could account for these delays in traffic.

Yup, it can be done. Ingo's proposal reminds me of a project we did in
1999 for distributed computing.

|------------|
|master clock|
|1 MHz       |===========[slave 1]=========[slave 2] ..........
|sync pulse  |
|1 Hz        |
|------------|

The slave implementation was a simple PLL driven by the master clock
with an output frequency of 1GHz. The microseconds counter was driven
by the 1MHz clock and the nanoseconds part by the PLL clock.

The nanoseconds counter was implemented so it stopped counting at 999
and it was reset to 0 when the 1MHz pulse came in. That made sure that
the PLL inaccuracy was corrected with every 1MHz pulse.
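The counter behaviour can be modelled in a few lines. This is only a
toy sketch, not the original hardware design: the class and function
names are made up for illustration, and the PLL is simulated as a
plain number of ticks per master pulse.

```python
class SlaveClock:
    """Toy model of the slave counters (all names are hypothetical)."""

    def __init__(self):
        self.usec = 0  # driven by the 1 MHz master clock
        self.nsec = 0  # driven by the local 1 GHz PLL, range 0..999

    def pll_tick(self):
        # Stop counting at 999, so a fast PLL cannot run past the
        # microsecond boundary before the master pulse arrives.
        if self.nsec < 999:
            self.nsec += 1

    def master_pulse(self):
        # The 1 MHz edge defines the boundary: advance the microseconds
        # and reset the nanoseconds, discarding any PLL error.
        self.usec += 1
        self.nsec = 0

    def now_ns(self):
        return self.usec * 1000 + self.nsec


def run(clock, pll_ticks_per_pulse, pulses):
    """Feed `pulses` master pulses, with a fixed number of PLL ticks
    between two consecutive pulses (1000 for a perfect PLL)."""
    for _ in range(pulses):
        for _ in range(pll_ticks_per_pulse):
            clock.pll_tick()
        clock.master_pulse()
    return clock.now_ns()
```

Whether the simulated PLL runs slightly fast (1003 ticks per pulse) or
slightly slow (997), the error never accumulates beyond one
microsecond period.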

When a slave attached itself to the system, it waited for the 1Hz
sync pulse, queried the absolute time from the master via the network
and synced itself to the next 1Hz sync pulse.
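The attach sequence might look roughly like the sketch below. It rests
on two assumptions that are not spelled out above: the network query
completes well within one second, and the master answers with the
whole second that started at the last sync pulse. FakeMaster and
attach() are hypothetical names used only for this illustration.

```python
class FakeMaster:
    """Stand-in for the master clock (hypothetical, for illustration)."""

    def __init__(self, start_second):
        self.second = start_second

    def sync_pulses(self):
        # Each next() on this generator models waiting for one 1 Hz pulse.
        while True:
            self.second += 1
            yield self.second

    def query_seconds(self):
        # Models the (slow, jittery) network query for absolute time.
        return self.second


def attach(sync_pulses, query_seconds):
    """Return the seconds value the slave loads at its sync edge."""
    next(sync_pulses)          # 1. wait for a 1 Hz sync pulse
    seconds = query_seconds()  # 2. ask the master which second this is
    next(sync_pulses)          # 3. wait for the next sync pulse
    return seconds + 1         # 4. that edge starts the following second
```

The network only has to deliver the right second; the precise edge
comes from the sync pulse, so network latency does not affect the
accuracy as long as the reply arrives before the next pulse.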

The signal runtime was compensated by the position of the node in the
topology, i.e. the distance to the master clock. That's simple math.
Signal runtime is known for a given wiring / PCB layout. So you just
have to know the distance.

For twisted pair the signal speed is ~ 0.6 * c =~ 180000 km/s.

So a node which is 1m away from the master gets the signal delayed by:
0.001 km / 180000 km/s ~= 5.6 nsec

That's constant and easy to account for. You can get into the +/-
10nsec accuracy range across a large distributed system that way.
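The delay compensation is simple enough to write down. A minimal
sketch, assuming c is rounded to 300000 km/s and a velocity factor of
0.6 as in the twisted pair figure above; the function names are
hypothetical, and real wiring would need its own measured factor.

```python
C_KM_PER_S = 300_000.0  # speed of light, rounded
VELOCITY_FACTOR = 0.6   # twisted pair figure used above


def cable_delay_ns(distance_m, velocity_factor=VELOCITY_FACTOR):
    """Signal runtime in nanoseconds over `distance_m` metres of cable."""
    speed_m_per_s = velocity_factor * C_KM_PER_S * 1000.0
    return distance_m / speed_m_per_s * 1e9


def compensated_ns(local_ns, distance_m):
    """The pulse reaches the slave late by the cable delay, so the
    slave's raw counter lags the master by that constant amount."""
    return local_ns + cable_delay_ns(distance_m)
```

For a node 1m from the master this gives roughly 5.6 nsec; for a node
30m away it is roughly 167 nsec. Both are fixed offsets determined by
the wiring, not something that has to be measured at runtime.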

The sad part is that we talked to AMD/Intel about this back then and
they both thought it would be a nice idea for cross socket synced
counters and could be easily implemented in hardware. Then they went
off and did the sh*t which we have to deal with right now.

Thanks,

tglx