I built a new 2.3.3 kernel (Makefile still says 2.3.2...I guess
somebody forgot to update it?) with your patches. As you suspected:
May 19 21:33:46 vheissu kernel: recover_lost_timer: lost 1 tick from 080c1fb8
May 19 21:33:46 vheissu kernel: recover_lost_timer: lost 1 tick from 40218baf
May 19 21:33:46 vheissu kernel: recover_lost_timer: lost 1 tick from 40218ba8
May 19 21:33:46 vheissu kernel: recover_lost_timer: lost 1 tick from 401c8360
May 19 21:33:46 vheissu kernel: recover_lost_timer: lost 1 tick from 401c7fbe
May 19 21:33:46 vheissu kernel: recover_lost_timer: lost 1 tick from c0107997
May 19 21:33:46 vheissu kernel: recover_lost_timer: lost 1 tick from 401c82d6
May 19 21:33:46 vheissu kernel: recover_lost_timer: lost 1 tick from c0107997
The c0107997 address falls inside cpu_idle, according to System.map:
c010795c T cpu_idle
c01079ac T sys_idle
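For anyone repeating the lookup by hand, here is a rough sketch of how an address from the log maps to a System.map symbol: pick the last symbol whose start address is not above the logged address. The table holds only the two entries quoted above, and the names ksym, symtab and lookup_symbol are made up for this illustration; they are not kernel interfaces.

```c
#include <stddef.h>

/* One System.map entry: start address and symbol name (hypothetical type). */
struct ksym {
    unsigned long addr;
    const char *name;
};

/* Only the two entries quoted in this mail, sorted ascending as in System.map. */
static const struct ksym symtab[] = {
    { 0xc010795cUL, "cpu_idle" },
    { 0xc01079acUL, "sys_idle" },
};

/*
 * Return the last symbol whose start address is <= addr, or NULL if addr
 * lies below the first entry. Assumes the table is sorted by address.
 */
const struct ksym *lookup_symbol(unsigned long addr)
{
    const struct ksym *best = NULL;
    size_t i;

    for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
        if (symtab[i].addr <= addr)
            best = &symtab[i];
    }
    return best;
}
```

With this table, c0107997 resolves to cpu_idle at offset 0x3b. With only two entries, anything above c01079ac is attributed to sys_idle, so a real lookup needs the complete System.map from the same build.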
I also got this "lost 429495 ticks" (!) from a previous kernel build,
but I don't have the System.map to go with it. I wish I knew what was
going on here; clearly, some signed calculation is returning a
negative number which is then being cast to unsigned. I think this is
the original problem I was having, with the clock jumping back and forth
by about 4294 seconds (429495 ticks at HZ=100, right?):
May 19 21:24:09 vheissu kernel: recover_lost_timer: lost 429495 ticks from c0107997
May 19 21:24:09 vheissu kernel: recover_lost_timer: lost 2 ticks from c0107997
May 19 21:24:20 vheissu kernel: recover_lost_timer: lost 1 tick from c01f2ef7
May 19 21:24:20 vheissu kernel: recover_lost_timer: lost 1 tick from c0107997
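To check the arithmetic behind the signed/unsigned guess, here is a small sketch. It assumes the lost time is held as a 32-bit microsecond count and that HZ is 100 (10000 us per tick); neither is confirmed from the patch source, and ticks_from_delta_unsigned is a name invented for this example.

```c
#include <stdint.h>

#define ASSUMED_HZ 100                        /* assumption: i386 tick rate */
#define USEC_PER_TICK (1000000 / ASSUMED_HZ)  /* 10000 us per tick */

/*
 * Convert a signed microsecond delta to ticks the suspect way: the value
 * is reinterpreted as unsigned 32-bit before dividing, so a small negative
 * delta turns into a value just below 2^32.
 */
uint32_t ticks_from_delta_unsigned(int32_t delta_usec)
{
    uint32_t as_unsigned = (uint32_t)delta_usec; /* -10000 -> 4294957296 */

    return as_unsigned / USEC_PER_TICK;
}
```

Under those assumptions a delta of -10000 us (one tick backwards) comes out as 429495 ticks, the figure in the log, and 2^32 us is about 4294.97 seconds, which is in line with the size of the jumps.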
=>If the problem is ticks being lost over time, then my TSC code should
=>also tell you which piece of code masked IRQs on all CPUs for such a
=>long time, so you can optimize it.
A lot of the ticks are lost from 0x40... is that userland? How can I
find out which process? (I suspect the X server: the easiest way to
lose ticks is to scroll a Netscape X window.) Can the server turn off
IRQs on both CPUs?
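As a rough guide to reading the addresses in the log, the sketch below classifies them, assuming the default i386 layout of this kernel generation: kernel space starts at 0xc0000000 and the mmap area (where shared libraries usually land) starts at 0x40000000. Both constants are assumptions about the configuration, and classify_address is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

#define ASSUMED_PAGE_OFFSET   0xc0000000u /* assumption: 3GB/1GB user/kernel split */
#define ASSUMED_UNMAPPED_BASE 0x40000000u /* assumption: start of the mmap area */

/* Return a short description of where a 32-bit address is likely to live. */
const char *classify_address(uint32_t addr)
{
    if (addr >= ASSUMED_PAGE_OFFSET)
        return "kernel";
    if (addr >= ASSUMED_UNMAPPED_BASE)
        return "user (mmap area or stack)";
    return "user (program text, data or heap)";
}
```

By this reading, 40218baf and 401c8360 are user addresses in the mmap area, 080c1fb8 is in a program's own text, and c0107997 is in the kernel. It says nothing about which process was running; that would need the pid logged alongside the address.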
=>In my patch I implemented a recover_lost_ticks mechanism that will detect
=>a lost timer interrupt and will update xtime to take care of the lost irq.
=>This will only work with TSC enabled; if you don't have the i386 TSC you
=>will continue to lose time over time ;) and gettimeofday can't be
=>monotonic in the presence of a lost tick.
Is there any way to force gettimeofday not to return an earlier time?
Of course, I want the clock to be as accurate as possible, but the
time decrements are causing more trouble than a simple slow or fast
clock would. I've modified do_gettimeofday so that it remembers the
latest time it has returned and never returns a time earlier than
that. (The remembered time is updated in do_settimeofday so that
clock adjustments will still take effect.) This seems to help, but do
you see any problems with it?
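The idea, roughly, as a userland sketch rather than the actual kernel change: clamp_monotonic stands in for the check at the end of do_gettimeofday, and note_clock_set for the update in do_settimeofday. Both names and the last_returned variable are invented for this illustration, and there is no locking here, which an SMP kernel would need.

```c
#include <sys/time.h>

/* Latest time handed out so far (hypothetical state, not a kernel variable). */
static struct timeval last_returned;

/* Return nonzero if a is strictly earlier than b. */
static int tv_before(const struct timeval *a, const struct timeval *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec < b->tv_sec;
    return a->tv_usec < b->tv_usec;
}

/* Never hand out a time earlier than the latest one already returned. */
void clamp_monotonic(struct timeval *tv)
{
    if (tv_before(tv, &last_returned))
        *tv = last_returned;   /* time went backwards: repeat the old value */
    else
        last_returned = *tv;   /* normal case: remember the new maximum */
}

/* Called when the clock is set on purpose, so a backwards step is allowed. */
void note_clock_set(const struct timeval *tv)
{
    last_returned = *tv;
}
```

One visible side effect is that after a backwards glitch, callers see the same timestamp repeated until the real clock catches up with the remembered value.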
regards,
d.