Re: Important patch to fix select!

Dong Liu (dongliu@lucent.com)
Thu, 08 Jul 1999 10:46:57 -0400


Rogier Wolff wrote:
>
> Dong Liu wrote:
> > My point is that select,poll and nanosleep should use the same
> > way to calculate jiffies, so you may want change select to
> > add the extra one.
>
> Ok. I can agree with that.
>
> For the rationale behind the patch, read the comment I added to the
> top of the file in the hope that it will stay fixed....
>
> After the patch is the program that proves that there is something
> really wrong...
>
> -----------------------------------------------------------------------
>
> diff -ur linux-2.2.10.clean/fs/select.c linux/fs/select.c
> --- linux-2.2.10.clean/fs/select.c Thu Jun 17 16:19:39 1999
> +++ linux/fs/select.c Wed Jul 7 20:24:37 1999
> @@ -39,6 +39,45 @@
> * Linus noticed. -- jrs
> */
>
> +/*
> + * An important issue that people seem to forget is that if I request
> + * a timeout of xx ms, then the system should NEVER timeout BEFORE
> + * those xx ms are over. On a multitasking operating system I should
> + * always be prepared to be woken up a bit later (e.g. because another
> + * task was running, or because my pages were suddenly swapped to
> + * disk), but I can count on not being woken up earlier.
> + *
> + * (For simplicity the following uses the numbers for Intel machines.
> + * On Machines with a HZ value that is different than 100, the same
> + * applies, but at a different scale).
> + *
> + * Now, indeed when things seem to matter is when an application
> + * requests a timeout that is very small. Because the system timer
> + * runs at 100Hz, the system has to round everything to a multiple of
> + * the 10ms system clock. And due to the requirement of never timing
> + * out early, we have to round up. But there is one more issue. As we
> + * want maximum speed here, we won't use the cycle counters or
> + * something like that to determine where in the 10ms interval between
> + * two timerticks we are. So for all we know, we could be only a
> + * microsecond away from the next timertick.
> + *
> + * So therefore, besides rounding up, we need to add in one more timer
> + * tick to compensate for this.
> + *
> + * This means that an application that has something trivial to do and
> + * then sleeps for an interval < 1/Hz, will be put to sleep for two
> + * timer ticks. However we cannot tolerate that an application that
> + * takes 9.9 ms to do its work and then requests a timeout in 9.9 ms
> + * gets woken up in just 0.1 ms.
> + *
> + * Is this clear to everyone? Now please don't break this again.
> + * Thanks.
> + *
> + * -- REW.
> + *
> + */
> +
> +
> static void free_wait(poll_table * p)
> {
> struct poll_table_entry * entry;
> @@ -261,8 +300,9 @@
> if (sec < 0 || usec < 0)
> goto out_nofds;
>
> + /* See note at the top of this file. */
> if ((unsigned long) sec < MAX_SELECT_SECONDS) {
> - timeout = ROUND_UP(usec, 1000000/HZ);
> + timeout = 1 + ROUND_UP(usec, 1000000/HZ);
> timeout += sec * (unsigned long) HZ;
> }
> }
>

You should not blindly add the extra one, you should check if usec == 0
first.

Maybe your way is more correct, but my way is more useful. Suppose
you want generate a stream of UDP packets with 10 ms second length
of voice, in this case it is not very inportant if you occasionally
send a packet too fast or to slow, so I can do it like this,

while (1) {
send a packet;
usleep(10000);
}

But if use your approach, I have to rely on some other trick, like
increasing
HZ, using itimer signal or using the RTC driver.

I think the best way is to schedule for the minimun timeout, when wakeup
check to see if the timeout is really expired, if not then scheduling
for the one extra jiffie.

Dong

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/