RE: Direct rdtsc call side-effect

From: David Laight
Date: Tue Jun 06 2023 - 04:24:26 EST


From: H. Peter Anvin
> Sent: 05 June 2023 17:32
...
> The TSC is certainly not perfect; partly because, ironically enough, it
> was introduced just *before* out of order and power management entered
> the x86 world.

Another issue is that the crystal used for the CPU clock won't be
that accurate (in terms of ppm error), and will have significant
temperature drift.
OTOH the crystal in the traditional x86 motherboard 'clock' chip
is (meant to be) designed for long-term accuracy.
While reading the TSC is a lot faster, there ought to have been
some kind of PLL to continuously adjust the measured TSC frequency
to keep it synchronised with the timer chip.
(Instead kernels end up writing the drifted TSC based time back to
the timer chip during shutdown.)
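
A minimal sketch of the kind of continuous adjustment described above,
in software rather than hardware. It is a first-order frequency-tracking
loop (an exponentially weighted correction), which is simpler than a
true PLL and is not what any kernel actually does. The names
tsc_freq_est, tsc_freq_update, tsc_delta_to_ns and the gain parameter
are all hypothetical, and the reference interval is assumed to come from
the more accurate timer chip.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state for a software frequency-tracking loop. */
struct tsc_freq_est {
    double est_hz;  /* current estimate of the TSC frequency */
    double gain;    /* fraction of the error applied per update, 0 < gain <= 1 */
};

/*
 * Nudge the frequency estimate towards what was just measured.
 * tsc_delta: TSC ticks counted over the interval.
 * ref_ns:    length of the same interval according to the reference clock.
 */
static void tsc_freq_update(struct tsc_freq_est *e,
                            uint64_t tsc_delta, uint64_t ref_ns)
{
    assert(ref_ns != 0);
    double measured_hz = (double)tsc_delta * 1e9 / (double)ref_ns;

    e->est_hz += e->gain * (measured_hz - e->est_hz);
}

/* Convert a TSC interval to nanoseconds using the current estimate. */
static uint64_t tsc_delta_to_ns(const struct tsc_freq_est *e,
                                uint64_t tsc_delta)
{
    return (uint64_t)((double)tsc_delta * 1e9 / e->est_hz);
}
```

With a small gain the estimate follows slow temperature drift of the
CPU crystal while smoothing out jitter in individual reference reads;
a larger gain tracks faster but passes more of that jitter through.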

> It is no secret that it has been slow to catch up. It was easy to put a
> counter in; it is a *lot* harder to make it work in all the possible
> scenarios in the power-managed, out-of-order world.

That rather depends on what you mean by 'work' :-)

> It is one of my personal pet projects in the architecture work to push
> to get that last distance; we are not yet there.

For performance measurements possibly what you want is a simple
clock counter which is dependent on a register.
So it has pretty much zero overhead but is guaranteed to execute after
some other instruction without really affecting the pipeline.

IIRC the x86 performance counters aren't dependent on anything
so they tend to execute much earlier than you want.
OTOH rdtsc is likely to be synchronising and affect what follows.
ISTR using rdtsc to wait for instructions to complete and then
the performance clock counter to see how long it took.
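
A minimal sketch of ordering a TSC read against the code being timed,
assuming GCC or Clang on x86-64 and the __rdtsc() and _mm_lfence()
intrinsics from <x86intrin.h>. The Intel SDM describes RDTSC itself as
not serializing, so this sketch puts LFENCE on both sides of the read;
that is a common workaround, not the register-dependent counter
suggested above. The helper names tsc_fenced and time_adds are
hypothetical.

```c
#include <stdint.h>
#include <x86intrin.h>  /* GCC/Clang: __rdtsc(), _mm_lfence() */

/* Read the TSC with fences on both sides of the read. */
static inline uint64_t tsc_fenced(void)
{
    _mm_lfence();            /* earlier instructions complete first */
    uint64_t t = __rdtsc();
    _mm_lfence();            /* later instructions wait for the read */
    return t;
}

/* Time a simple dependent add loop; the result is in TSC ticks. */
static uint64_t time_adds(unsigned int n)
{
    volatile uint64_t sink = 0;  /* volatile keeps the loop from being removed */
    uint64_t t0 = tsc_fenced();

    for (unsigned int i = 0; i < n; i++)
        sink += i;

    uint64_t t1 = tsc_fenced();
    return t1 - t0;
}
```

The result counts TSC ticks, not core clock cycles, so it will not
match a performance-counter cycle count when the core frequency
differs from the TSC rate. Reading the cycle counter itself (rdpmc)
from user space needs it to be enabled by the kernel, which is not
shown here.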

David
