[patch] fix for non-monotonic gettimeofday caused by broken 8254

Truxton Fulton (trux@truxton.com)
Fri, 5 Mar 1999 00:59:26 -0800 (PST)


Ok, the kernel is not to blame for my broken hardware, but I have found
a reliable way to work around my problem. I'm hoping that others
that have reported the same problem will have success with my patch.

The problem: on 386 and 486 machines, the kernel must use the 8254
hardware timer chip to interpolate time between jiffies. The timer
gives a continuously decrementing count from 0x2e9b to 0. The count is
read as two bytes in sequence: LSB then MSB from the same port (0x40).
On my 486F38X "Mother" brand vesa-local-bus motherboard, for some
unknown reason, stressing the machine can cause the alternating
sequence of LSB and MSB reads to get 1 byte out of synch, effectively
causing the counter values to get byte-swapped.

The symptom: gettimeofday() is not monotonic. It makes small jumps
backwards in time, producing a "sawtooth" progression. Programs
that calculate time deltas hang for hours or days. xclock stops
moving, netscape (aside from its other bugs) freezes, general havoc.

The solution: If we detect a count above the valid maximum (0x2e9b for
HZ=100), then do one extra read to put us back in synch. This
fix works for me. This one byte correction is only occasionally
needed on my machine (perhaps once per hour or so). It should not
noticeably decrease performance for others with non-broken timers.
It will not affect any processors other than i386 and i486.

Many thanks to Ingo Molnar and Colin Plumb for their guidance.

-Truxton

--- endorphin/linux-2.2.2/arch/i386/kernel/time.c Wed Jan 20 10:18:53 1999
+++ nicotine/linux-2.2.2/arch/i386/kernel/time.c Thu Mar 4 22:38:35 1999
@@ -156,9 +156,12 @@
* comp.protocols.time.ntp!
*/

+unsigned long dsgtoi=0 ; /* do_slow_gettimeoffset invocation counter */
+
static unsigned long do_slow_gettimeoffset(void)
{
int count;
+ int extra;

static int count_p = LATCH; /* for the first call after boot */
static unsigned long jiffies_p = 0;
@@ -180,6 +183,21 @@
jiffies_t = jiffies;

count |= inb_p(0x40) << 8;
+
+ dsgtoi++ ; /* count how many times we read the 8254 counter */
+ if(count>LATCH) /* uh-oh... */
+ {
+ extra=inb_p(0x40); /* extra read to get us back in synch */
+ printk("hardware timer out of synch! %lu:(%04X,%02X)\n",dsgtoi,count,extra) ;
+ outb_p(0x00, 0x43); /* latch the count*/
+ count = inb_p(0x40); /* read the latched count */
+ count |= inb_p(0x40) << 8;
+ if(count>LATCH) /* still not right... */
+ {
+ printk("hardware timer is hosed! %lu:(%04X)\n",dsgtoi,count);
+ count=count_p;
+ }
+ }

/*
* avoiding timer inconsistencies (they are rare, but they happen)...

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/