Re: Regression in 2.6.27 caused by commit bfc0f59

From: Larry Finger
Date: Mon Sep 01 2008 - 15:36:36 EST


Thomas Gleixner wrote:
On Mon, 1 Sep 2008, Larry Finger wrote:
Thomas Gleixner wrote:
The critical differences in the dmesg output between the "good" and "bad"
results indicate a factor of 2 difference in the clock speed, and are shown
below:
+Clocksource tsc unstable (delta = 500037272 ns)
-Clocksource tsc unstable (delta = 83950402 ns)
In both cases the TSC is ahead of the pm_timer, which suggests that the
pm_timer is behaving strangely.

Can you please disable the pm_timer (in the kernel config,
unfortunately there is no command line option for that) for a test and
provide the relevant output of dmesg?
It took a while to figure out how to kill the pm_timer. I finally did it by
changing the default to no rather than yes. I also reset the bisection and
compiled a full -rc4 kernel.

What I hope is the relevant output of dmesg is below. The clock rate is
correctly determined, and the b43legacy errors are gone.

Hmm. Haven't seen that before, but it confirms what I guessed from
your previous dmesg information. I wonder why you did not observe
strange behaviour with older kernel versions. I don't mean the
b43legacy errors, which might be caused by the wrongly calibrated TSC,
but even those kernels should show strange behaviour vs. time.

Can you please provide the output of an older "working" kernel version
from:

# cat /sys/devices/system/clocksource/clocksource0/current_clocksource

after the TSC was set to unstable. It should say acpi_pm.

If that's the case please run
# time sleep 60

in a shell, provide the output, and verify it against a stopwatch known to
be reasonably accurate.

Then do the same on the current mainline with pm_timer disabled. The
current clocksource should be either jiffies or tsc.
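
For anyone repeating the check above, it can be scripted roughly as follows. This is only a sketch: the helper names read_clocksource and drift_ratio are made up for illustration, and the stopwatch reading still has to be taken by hand, because the kernel's own clock cannot be used to check itself.

```python
from pathlib import Path

# sysfs directory quoted in the message above
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(name="current_clocksource", base=CLOCKSOURCE_DIR):
    """Return the contents of a clocksource sysfs file, or None if unreadable."""
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        return None


def drift_ratio(kernel_seconds, stopwatch_seconds):
    """Ratio of kernel-reported elapsed time to externally measured time.

    1.0 means the kernel clock runs at the right rate; a value near 2.0
    or 0.5 would match a factor-of-2 error in the clock speed.
    """
    if stopwatch_seconds <= 0:
        raise ValueError("stopwatch reading must be positive")
    return kernel_seconds / stopwatch_seconds


if __name__ == "__main__":
    print("current clocksource:   ", read_clocksource())
    print("available clocksources:", read_clocksource("available_clocksource"))
    # Example: "time sleep 60" reported 60.0 s, the stopwatch showed 60.3 s.
    print("drift ratio:", drift_ratio(60.0, 60.3))
```

The 60.3 s figure in the example is invented; substitute the values actually observed.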

Both the openSUSE 2.6.22 kernel and the one with the pm_timer disabled return "pit". I don't think pm_timer had ever been used until the commit in question.

The timed sleep is as accurate as I can measure.

I put in some test prints. The value of pm2 is zero when the else branch of the "if (hpet)" is entered; however, pm1 is 15768471. When we reach the do_div(tsc2, tsc1) statement, tsc2 is zero, which I think means that the two calls to tsc_read_refs() are returning the same junk value.
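
To show why a zero tsc2 at that point implies identical TSC reads, here is a rough user-space sketch of the arithmetic. It is not the kernel code: the function names are hypothetical, and the constants (a 24-bit PM timer running at 3579545 Hz) are assumptions based on the usual ACPI PM timer parameters.

```python
PM_TIMER_HZ = 3_579_545   # assumed nominal ACPI PM timer frequency
PM_TIMER_WRAP = 1 << 24   # assumed 24-bit PM timer counter


def pm_delta_ns(pm1, pm2):
    """Elapsed nanoseconds between two PM timer reads, allowing one wrap."""
    if pm2 < pm1:
        # A smaller second reading is treated as a counter overflow.
        pm2 += PM_TIMER_WRAP
    return (pm2 - pm1) * 1_000_000_000 // PM_TIMER_HZ


def tsc_khz(tsc1, tsc2, ref_ns):
    """TSC frequency in kHz from a TSC delta measured over ref_ns nanoseconds."""
    if ref_ns <= 0:
        raise ValueError("reference interval must be positive")
    return (tsc2 - tsc1) * 1_000_000 // ref_ns


if __name__ == "__main__":
    # PM timer values from the test prints: pm1 = 15768471, pm2 = 0.
    ref_ns = pm_delta_ns(15768471, 0)
    print("reference interval (ns):", ref_ns)
    # Two identical TSC reads (arbitrary placeholder value) give a zero
    # numerator, so the computed frequency is zero whatever the interval.
    print("tsc kHz:", tsc_khz(123456, 123456, ref_ns))
```

In this sketch pm2 = 0 is handled as a wrapped counter, so the reference interval is still positive, and the zero result comes entirely from the two TSC values being equal.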

Larry