Re: [RT] read_tsc: ACK! TSC went backward! Unsynced TSCs?

From: thockin
Date: Mon Nov 28 2005 - 12:25:42 EST


On Mon, Nov 28, 2005 at 07:05:54AM -0500, Steven Rostedt wrote:
> With -rt20 on the AMD64 x2, I'm getting a crap load of these:
>
> read_tsc: ACK! TSC went backward! Unsynced TSCs?
>
> So bad that the system won't even boot (at least I won't wait long enough
> to let it finish).

The kernel's use of TSC is wholly incorrect. TSCs can ramp up and down
and *do* vary between nodes as well as between cores within a node. As
things stand you really cannot compare TSCs between CPU cores at all, yet
the kernel assumes one global TSC in at least a few places.

If you have any sort of power management enabled on a K8 (including the
'hlt' C1 state), you *will* get hosed.

We got into a situation where one CPU's TSC had somehow fallen behind the
other's because that CPU was idle for a while. Suddenly gettimeofday() was
only giving me HZ granularity: successive calls would return the exact
same timeval, even when made as much as 1 ms apart.

What happened was that last_tsc had been set from the higher-TSC CPU,
while the gettimeofday code for TSC was running on the lower-TSC CPU. The
code recognized that current tsc < last tsc and set current = last. As
long as I was running on the laggy CPU, time stood still for bursts. Then
if I bounced CPUs it would shoot forward.
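
To make the failure mode concrete, here is a rough user-space model of it
in C. This is only a sketch, not the kernel's actual code: the struct, the
function names and both constants (a 2 GHz TSC and HZ=1000) are made up
for illustration. It shows a tick recording last_tsc from one CPU and a
reader on a CPU whose TSC is behind getting clamped to the tick time.

```c
#include <stdint.h>

/* Hypothetical constants, chosen only for illustration. */
#define CYCLES_PER_USEC 2000ULL  /* assumes a 2 GHz TSC */
#define USEC_PER_TICK   1000ULL  /* assumes HZ=1000, i.e. 1 ms per tick */

/* One global timebase shared by all CPUs (the problematic assumption). */
struct timebase {
    uint64_t last_tsc;   /* TSC value sampled at the last timer tick */
    uint64_t wall_usec;  /* wall-clock time at the last timer tick */
};

/* Timer tick: records the TSC of whichever CPU took the interrupt. */
static void timer_tick(struct timebase *tb, uint64_t tsc_now)
{
    tb->last_tsc = tsc_now;
    tb->wall_usec += USEC_PER_TICK;
}

/*
 * gettimeofday-style read: tick time plus the TSC delta since the tick.
 * If the reading CPU's TSC is behind last_tsc, the delta is clamped to
 * zero, so the result stays stuck at the tick time.
 */
static uint64_t read_time_usec(const struct timebase *tb, uint64_t tsc_now)
{
    if (tsc_now < tb->last_tsc)
        tsc_now = tb->last_tsc;  /* "set current = last" */
    return tb->wall_usec + (tsc_now - tb->last_tsc) / CYCLES_PER_USEC;
}
```

If the tick is taken with TSC 5000000 and the reader's CPU is around
4000000, every read returns the tick time until that CPU's TSC passes
5000000 or the next tick arrives, which is the HZ granularity described
above. A read from the CPU that is ahead then jumps forward.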

Switching to HPET for timing made it all go away, but (at least as of
2.6.11) it was horribly broken.

Tim