Re: [1/3] Bugfix: Don't use the TSC in sched_clock if unstable

From: Andi Kleen
Date: Sun Mar 04 2007 - 18:28:55 EST


On Sunday 04 March 2007 19:33, Guillaume Chazarain wrote:
> 2007/3/4, Andi Kleen <ak@xxxxxxx>:
>
> > On what hardware?
>
> Pentium M 798 MHz -> 2GHz

An older one (Banias), I assume? (Please post /proc/cpuinfo.)

Assuming that:

> > And how many frequency transitions do you have per second?
>
> 10 in a kernel compile exhibiting audio skips.
>
> Anyway, the "Clocksource tsc unstable (delta = -263211549 ns)" line

That's because the new clocksource code doesn't correctly update
its state on CPU frequency scaling. That's another problem, but independent
of this one. sched_clock doesn't rely on the clocksource state for this;
it uses its own conversion functions.

> in the dmesg I attached to the previous mail confirms IMHO that my
> TSC is too unstable for scheduling purpose.

No, it shouldn't be. There must be a bug somewhere.

Your CPU doesn't have a P-state invariant TSC, but we should be able
to generate nanoseconds from the TSC at any point in time using the known
frequency, which is updated in arch/i386/kernel/tsc.c:time_cpufreq_notifier().
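
As a rough userspace sketch of that conversion (not the kernel code itself):
the scale computation mirrors set_cyc2ns_scale() from the patch below, while
the multiply-and-shift body of cycles_2_ns(), the uint64_t types and the
assert are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CYC2NS_SCALE_FACTOR 10 /* 2^10, same constant as in tsc.c */

/* Userspace stand-in for the kernel's cyc2ns_scale (assumed 64-bit here). */
static uint64_t cyc2ns_scale;

/*
 * ns per cycle is 10^6 / cpu_khz. Store it as a fixed-point value
 * scaled by 2^CYC2NS_SCALE_FACTOR so the hot path avoids a division.
 */
static void set_cyc2ns_scale(uint64_t cpu_khz)
{
	assert(cpu_khz != 0); /* illustration only: guard the division */
	cyc2ns_scale = (UINT64_C(1000000) << CYC2NS_SCALE_FACTOR) / cpu_khz;
}

/* Assumed conversion: multiply by the scale, then drop the 2^10 factor. */
static uint64_t cycles_2_ns(uint64_t cyc)
{
	return (cyc * cyc2ns_scale) >> CYC2NS_SCALE_FACTOR;
}
```

With cpu_khz = 2000000 (2 GHz) the scale is 512, so 2000000 cycles convert
to 1000000 ns. With cpu_khz = 798000 the integer division rounds the scale
down to 1283, so the result is slightly below the exact value.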

Can you run with this debug patch and send me the output? If it stops
logging before you see the problem, increase MAX in the patch.

-Andi

Index: linux/arch/i386/kernel/tsc.c
===================================================================
--- linux.orig/arch/i386/kernel/tsc.c
+++ linux/arch/i386/kernel/tsc.c
@@ -85,9 +85,20 @@ static unsigned long cyc2ns_scale __read

 #define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */
 
+enum {
+	MAX = 500,
+};
+
+static int cnt;
+
 static inline void set_cyc2ns_scale(unsigned long cpu_khz)
 {
-	cyc2ns_scale = (1000000 << CYC2NS_SCALE_FACTOR)/cpu_khz;
+	long new = (1000000 << CYC2NS_SCALE_FACTOR)/cpu_khz;
+	if (cnt < MAX) {
+		printk("cyc2ns_scale %lx -> %lx\n", cyc2ns_scale, new);
+		cnt++;
+	}
+	cyc2ns_scale = new;
 }
 
 static inline unsigned long long cycles_2_ns(unsigned long long cyc)
@@ -101,6 +112,8 @@ static inline unsigned long long cycles_
 unsigned long long sched_clock(void)
 {
 	unsigned long long this_offset;
+	static u64 last_offset;
+	static long last_conv;
 
 	if (unlikely(custom_sched_clock))
 		return (*custom_sched_clock)();
@@ -116,7 +129,17 @@ unsigned long long sched_clock(void)
 	rdtscll(this_offset);
 
 	/* return the value in ns */
-	return cycles_2_ns(this_offset);
+	this_offset = cycles_2_ns(this_offset);
+
+	if (cnt < MAX && this_offset <= last_offset) {
+		printk("sched_clock backward %llx(%lx)->%llx(%lx)\n",
+			last_offset, last_conv, this_offset, cyc2ns_scale);
+		cnt++;
+	}
+	last_offset = this_offset;
+	last_conv = cyc2ns_scale;
+
+	return this_offset;
 }
 
 static unsigned long calculate_cpu_khz(void)
