Re: [BUG] APM resume breakage from 2.6.18-rc1 clocksource changes

From: john stultz
Date: Mon Jul 10 2006 - 14:18:05 EST


On Mon, 2006-07-10 at 20:08 +0200, Pavel Machek wrote:
> Hi!
>
> > > >> I've traced the cause of this problem to the i386 time-keeping
> > > >> changes in kernel 2.6.17-git11. What happens is that:
> > > >> - The kernel autoselects TSC as my clocksource, which is
> > > >> reasonable since it's a PentiumII. 2.6.17 also chose the TSC.
> > > >> - Immediately after APM resumes (arch/i386/kernel/apm.c line
> > > >> 1231 in 2.6.18-rc1) there is an interrupt from the PIT,
> > > >> which takes us to kernel/timer.c:update_wall_time().
> > > >> - update_wall_time() does a clocksource_read() and computes
> > > >> the offset from the previous read. However, the TSC was
> > > >> reset by HW or BIOS during the APM suspend/resume cycle and
> > > >> is now smaller than it was at the prevous read. On my machine,
> > > >> the offset is 0xffffffd598e0a566 at this point, which appears
> > > >> to throw update_wall_time() into a very very long loop.
> > > >
> > > >Huh. It seems you're getting an interrupt before timekeeping_resume()
> > > >runs (which resets cycle_last). I'll look over the code and see if I can
> > > >sort out why it works w/ ACPI suspend, but not APM, or if the
> > > >resume/interrupt-enablement bit is just racy in general.
> > >
> > > I forgot to mention this, but I had a debug printk() in apm.c
> > > which showed that irqs_disabled() == 0 at the point when APM
> > > resumes the kernel.
> >
> > So it seems possible that the timer tick will be enabled before the
> > timekeeping resume code runs. I'm not sure why this isn't seen w/ ACPI
> > suspend/resume, as I think they're using the same
> > sysdev_class .suspend/.resume bits.
>
> ACPI actually keeps interrupts disabled, always.
>
> APM can only keep interrupts disabled on non-IBM machines, presumably
> due to BIOS problems.
>
> Could we get some sanity check into looping function? If timesource
> goes backwards, at least somehow reporting it would be nice...

Yep, I'm working on a debug patch (similar to the paranoid timekeeping
debugging option in earlier versions of the TOD patch) that will spit
out warnings when we see unusual behavior: large numbers of lost ticks,
timer ticks arriving too early, settimeofday being call, etc.

thanks
-john

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/