problems with do_slow_gettimeoffset()

From: Jason Sodergren (jason@mugwump.taiga.com)
Date: Thu Mar 30 2000 - 17:35:36 EST


Hello, everyone.

I've run into a problem with the do_slow_gettimeoffset function in kernel 2.2.14
(the code apparently hasn't changed much in newer kernels).

During calls to do_gettimeofday(), the time returned is occasionally 10 ms behind
where it should be. I've narrowed this down to what appears to be a problem
with timer underflow detection in do_slow_gettimeoffset().

Here is a copy of that function, with whitespace, comments, and the Neptune bug
workaround stripped out for the sake of brevity:

From arch/i386/kernel/time.c:
static unsigned long do_slow_gettimeoffset(void)
{
        int count;
        static int count_p = LATCH; /* for the first call after boot */
        static unsigned long jiffies_p = 0;
        unsigned long jiffies_t;
        /* timer count may underflow right here */
        outb_p(0x00, 0x43); /* latch the count ASAP */
        count = inb_p(0x40); /* read the latched count */
        jiffies_t = jiffies;
        count |= inb_p(0x40) << 8;
(1)     if( jiffies_t == jiffies_p ) {
                if( count > count_p ) {
                        outb_p(0x0A, 0x20);
                        if( inb(0x20) & 0x01 ) {
(2)                             count -= LATCH;
                        } else {
                                printk("do_slow_gettimeoffset(): hardware timer problem?\n");
                        }
                }
        } else
                jiffies_p = jiffies_t;
        count_p = count;
        count = ((LATCH-1) - count) * TICK_SIZE;
        count = (count + LATCH/2) / LATCH;
        return count;
}

The problem seems to be that underflow detection will not necessarily work the
first time this function is called while interrupts are disabled. For example,
consider this sequence of events:

- device driver interrupt occurs, ISR is entered with interrupts disabled.
- timer underflow occurs; irq0 is now pending
- device driver ISR calls do_gettimeofday(), which calls do_slow_gettimeoffset()

In the above case, the condition at (1) will be false if this is the
first call to the function during the current jiffy, because the current
jiffies value differs from the value stored on the previous call.
Therefore, count is not compensated for the timer underflow, and time
appears to jump backwards by 10 ms.
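
To put numbers on the size of the jump, here is a small user-space sketch of
the scaling done by the last two statements of the function. The values of
LATCH and TICK_SIZE are assumptions for an i386 kernel with HZ=100 (they are
not stated above), and count_to_usec() is a hypothetical helper name, not
something in the kernel.

```c
/* Assumed values for HZ=100 on i386; not taken from the post itself. */
#define LATCH     11932   /* timer counts per tick (1193180 Hz / 100) */
#define TICK_SIZE 10000   /* microseconds per tick */

/*
 * Hypothetical helper: the same scaling as the last two statements of
 * do_slow_gettimeoffset(), turning a latched count into microseconds
 * elapsed since the last tick that was accounted for in jiffies.
 */
static int count_to_usec(int count)
{
        count = ((LATCH - 1) - count) * TICK_SIZE;
        return (count + LATCH / 2) / LATCH;
}

/* Offset when an underflow that already happened is NOT compensated. */
static int usec_uncompensated(int count)
{
        return count_to_usec(count);
}

/* Offset when the underflow is detected and count -= LATCH is applied. */
static int usec_compensated(int count)
{
        return count_to_usec(count - LATCH);
}
```

With a count of 11900 latched just after the counter has wrapped,
usec_uncompensated(11900) gives 26 while usec_compensated(11900) gives 10026.
The two always differ by exactly TICK_SIZE, which under these assumptions is
the 10 ms error.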

In subsequent calls with interrupts still disabled, the check at (1) would
return true, and the correct underflow compensation would occur.

I've modified the above function as follows; this seems to correct
the problem on my test machines:

static unsigned long do_slow_gettimeoffset(void)
{
        int count;
        unsigned char irqpend;
        /* timer count may underflow right here */
        outb(0x0A, 0x20);
        outb(0x00, 0x43); /* latch the count ASAP */
(1)     irqpend = inb(0x20); /* get IRQ0 status as close to */
                             /* count latch time as possible */
        /* Slight chance that IRQ0 was set AFTER count was latched. */

        count = inb_p(0x40); /* read the latched count */
        count |= inb_p(0x40) << 8;

        if( irqpend & 0x01 ) { /* Counter underflow? */
                /* If count is small and IRQ is pending, IRQ was most likely
                   set AFTER count was latched, or an IRQ0 was lost */
(2)             if( count > 10 ) /* 10 is arbitrary */
                        count -= LATCH;
        }
        count = ((LATCH-1) - count) * TICK_SIZE;
        count = (count + LATCH/2) / LATCH;
        return count;
}

Instead of checking count and jiffies against values stored during the previous
call to the function, I'm just checking for a pending IRQ0, which I try to
check as close to the latching of count as possible. There's still the possibility
that underflow occurs right after latching, resulting in erroneous detection of
underflow; that's what the check of the latched count value at (2) tries
to address. This code seems to fix the time jump problem I've observed when using
the original code.
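
The decision made by the modified function can be looked at separately from
the port I/O. The following is only a user-space sketch of that logic:
adjust_count() and SMALL_COUNT are hypothetical names, LATCH is assumed to be
11932 (HZ=100 on i386), and the threshold of 10 is the arbitrary value from
check (2).

```c
#define LATCH       11932  /* assumed: timer counts per tick at HZ=100 */
#define SMALL_COUNT 10     /* the arbitrary threshold used at check (2) */

/*
 * Hypothetical helper mirroring the modified underflow check.
 *
 * count   - the latched timer count (counts down from LATCH-1 to 0)
 * irqpend - the byte read from port 0x20 after writing 0x0A to it;
 *           bit 0 is set while IRQ0 is pending
 *
 * Returns the count to feed into the microsecond conversion.
 */
static int adjust_count(int count, unsigned char irqpend)
{
        if (irqpend & 0x01) {
                /*
                 * A large count with IRQ0 pending means the counter
                 * wrapped before it was latched, so compensate.
                 * A very small count means the wrap most likely came
                 * just after the latch, so leave the count alone.
                 */
                if (count > SMALL_COUNT)
                        count -= LATCH;
        }
        return count;
}
```

The three cases then come out as: adjust_count(11900, 0x00) leaves the count at
11900 (no IRQ0 pending), adjust_count(11900, 0x01) returns -32 (underflow
before the latch, compensated), and adjust_count(5, 0x01) leaves the count at 5
(underflow assumed to have happened after the latch).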

It seems to me the original code is flawed. Am I missing something?
Any input is appreciated. If this IS a flaw, I'll work on the function
a bit and produce a patch.

- Jason Sodergren - jason@taiga.com - http://www.taiga.com/~jason -
          - PGPK @ http://www/taiga.com/~jason/pgp.phtml -
