Re: nanosleep

Markus Kuhn (
Thu, 4 Apr 1996 17:40:22 +0200 (MET DST)

> Tom Bjorkholm writes:

> > nanosleep

> Speaking of which, from nanosleep:

> if (t.tv_sec == 0 && t.tv_nsec <= 2000000L &&
>     current->policy != SCHED_OTHER) {
>         /*
>          * Short delay requests up to 2 ms will be handled with
>          * high precision by a busy wait for all real-time processes.
>          */
>         udelay((t.tv_nsec + 999) / 1000);
>         return 0;
> }

> Busy waiting for 2 ms seems like a performance hit - 2 microseconds maybe...

I know that it is ugly; that is why only root processes are allowed to
do it (only root processes can get current->policy != SCHED_OTHER).
It is intended for applications where a reliably precise delay is
more important than performance (e.g. controlling time-critical
hardware like an astronomical CCD camera, where the PC determines
the exposure time with a wait loop). And SCHED_FIFO will guarantee that
your process still has the CPU when the nanosleep is over. Remember: these
features were implemented in order to allow Linux to be used for
time-critical applications which previously required me to boot MS-DOS.
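
To make the condition in the quoted code concrete, here is a minimal
sketch in plain C that can be compiled outside the kernel. The helper
names use_busy_wait() and nsec_to_usec_roundup() are hypothetical and
not part of the kernel; they only mirror the test and the rounding in
the quoted fragment.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper mirroring the quoted kernel test: busy-wait only
 * for requests of at most 2 ms made by a real-time process, i.e. one
 * whose scheduling policy is not SCHED_OTHER.
 */
static bool use_busy_wait(const struct timespec *t, bool realtime_policy)
{
    return t->tv_sec == 0 && t->tv_nsec <= 2000000L && realtime_policy;
}

/*
 * Round nanoseconds up to whole microseconds, as the argument of the
 * udelay() call does, so the delay is never shorter than requested.
 */
static long nsec_to_usec_roundup(long nsec)
{
    assert(nsec >= 0);
    return (nsec + 999) / 1000;
}
```

A 1.5 ms request from a SCHED_FIFO process takes the busy-wait path,
while the same request from a SCHED_OTHER process, or a request of more
than 2 ms from anyone, goes through the normal timer code.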

There might be more elegant solutions possible. If you want to implement
an interrupt-on-demand timer facility for Linux: yes, please try it!

The problem is just that I am not yet sure exactly how this
interrupt-on-demand facility would use the system resources available
in a PC.

The basic idea is that you don't implement the timers based on the periodic
1/HZ s interrupt, but that the kernel keeps a priority queue of timers
that will expire soon. A hardware timer is always programmed such that
it causes an interrupt exactly at the microsecond at which the next kernel
timer expires. This gives you a busy-wait-free nanosleep() that offers
microsecond precision not only for short (< 2 ms) delays, but for ALL delays.
Good real-time operating systems provide such an interrupt-on-demand
timer implementation.
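
The priority-queue idea could be sketched roughly as follows. This is
only an illustration in portable C, not kernel code: the type
timer_queue, the functions tq_add(), tq_pop() and tq_next_delay(), and
the fixed MAX_TIMERS limit are all hypothetical names chosen for the
example. Expiry times are absolute microsecond values kept in a binary
min-heap, and tq_next_delay() yields the interval that would be
programmed into the one-shot hardware timer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TIMERS 64   /* arbitrary limit for this sketch */

struct timer_queue {
    uint64_t expires_us[MAX_TIMERS]; /* min-heap of absolute expiry times */
    size_t count;
};

static void swap_u64(uint64_t *a, uint64_t *b)
{
    uint64_t tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Insert a timer; returns 0 on success, -1 if the queue is full. */
static int tq_add(struct timer_queue *q, uint64_t expires_us)
{
    if (q->count == MAX_TIMERS)
        return -1;
    size_t i = q->count++;
    q->expires_us[i] = expires_us;
    while (i > 0) {                       /* sift up */
        size_t parent = (i - 1) / 2;
        if (q->expires_us[parent] <= q->expires_us[i])
            break;
        swap_u64(&q->expires_us[parent], &q->expires_us[i]);
        i = parent;
    }
    return 0;
}

/* Remove and return the earliest expiry time; the queue must not be empty. */
static uint64_t tq_pop(struct timer_queue *q)
{
    assert(q->count > 0);
    uint64_t earliest = q->expires_us[0];
    q->expires_us[0] = q->expires_us[--q->count];
    size_t i = 0;
    for (;;) {                            /* sift down */
        size_t left = 2 * i + 1, right = left + 1, min = i;
        if (left < q->count && q->expires_us[left] < q->expires_us[min])
            min = left;
        if (right < q->count && q->expires_us[right] < q->expires_us[min])
            min = right;
        if (min == i)
            break;
        swap_u64(&q->expires_us[min], &q->expires_us[i]);
        i = min;
    }
    return earliest;
}

/*
 * Compute the delay until the next timer expires, i.e. the value to load
 * into the one-shot hardware timer. Returns -1 if no timer is pending;
 * a delay of 0 means the earliest timer is already due.
 */
static int tq_next_delay(const struct timer_queue *q, uint64_t now_us,
                         uint64_t *delay_us)
{
    if (q->count == 0)
        return -1;
    *delay_us = q->expires_us[0] > now_us ? q->expires_us[0] - now_us : 0;
    return 0;
}
```

In the interrupt handler, one would pop every timer whose expiry time
has passed, run its action, and then reprogram the hardware with the
result of tq_next_delay(). A real kernel would also need locking and
per-timer callbacks, which this sketch leaves out.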

What do we have in the PC?

We have three timers in an Intel 8253 chip connected to the bus clock.

timer 1: implements the famous 18.2 Hz DOS clock (master clock divided
         by 65536) and, under Linux, the 100 Hz tick interrupt, which
         is used for process preemption, system clock, NTP PLL,
         itimers, kernel timers, etc.

timer 2: was historically used for DRAM refresh; I have no idea
         whether it can be used for other things.

timer 3: can produce "sound" on the speaker.

As far as I know, only timer 1 can generate interrupts.
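
The numbers for timer 1 can be checked with a little arithmetic. The
sketch below assumes the usual PC input clock of 1193182 Hz for the
8253, a value that is not stated above; the function names are
hypothetical. It shows where 18.2 Hz comes from, which divisor gives a
100 Hz tick, and how many counts a one-shot delay of a given number of
microseconds would need.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 8253 input clock on a standard PC, in Hz. */
#define PIT_CLOCK_HZ 1193182u

/* Interrupt rate produced by a given divisor (65536 is the maximum). */
static double pit_rate_hz(uint32_t divisor)
{
    assert(divisor >= 1 && divisor <= 65536);
    return (double)PIT_CLOCK_HZ / divisor;
}

/* Divisor, rounded to nearest, that approximates a periodic rate in Hz. */
static uint32_t pit_divisor_for_hz(uint32_t hz)
{
    assert(hz > 0);
    return (PIT_CLOCK_HZ + hz / 2) / hz;
}

/*
 * Counts needed for a one-shot delay of `usec` microseconds, rounded up
 * so that the interrupt never fires early. The caller has to check that
 * the result fits the 16-bit counter (at most 65536 counts).
 */
static uint64_t pit_counts_for_usec(uint64_t usec)
{
    return (usec * PIT_CLOCK_HZ + 999999u) / 1000000u;
}
```

With these assumptions, a divisor of 65536 gives about 18.2 Hz and a
100 Hz tick needs a divisor of 11932. One count lasts roughly 0.84
microseconds, and the 16-bit counter covers at most about 55 ms, so
longer delays would have to be split into several one-shot intervals.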

In addition, we have the CMOS real-time clock. It can also generate
periodic interrupts at various rates (e.g. 128 Hz would be possible)
and it has its own 32768 Hz crystal as a frequency reference. Disadvantage:
you cannot read a counter value with high precision from the CMOS clock.
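
As a rough illustration of the available rates: on the MC146818-style
clock chip used in PCs, the periodic interrupt frequency is derived from
the 32768 Hz crystal by a 4-bit rate selector, with frequency =
32768 >> (rate - 1) for selector values 3 to 15. That encoding comes
from the chip's data sheet, not from anything stated above, and the
helper name below is hypothetical.

```c
#include <assert.h>

/*
 * Return the rate selector (3..15) that yields the requested periodic
 * interrupt frequency, or -1 if the frequency is not one of the
 * power-of-two rates between 2 Hz and 8192 Hz.
 */
static int rtc_rate_select_for_hz(unsigned hz)
{
    for (int rate = 3; rate <= 15; rate++) {
        if ((32768u >> (rate - 1)) == hz)
            return rate;
    }
    return -1;
}
```

Under this encoding, 128 Hz corresponds to selector 9, and a rate such
as 100 Hz cannot be produced at all, which is why a power of two like
128 Hz is the natural choice.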

And finally, all Pentium processors have a 64-bit counter on chip
(TSC = time stamp counter). It provides the highest timing resolution
(depending on the CPU speed, down to a few ns) as it counts the CPU clock
cycles. Advantage: this counter will never overflow during
any realistic Linux uptime. Disadvantage: it cannot generate interrupts.
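
Both claims about the TSC are easy to check numerically. The sketch
below is plain arithmetic, with hypothetical function names and an
example clock rate of 100 MHz chosen only for illustration.

```c
#include <assert.h>

/* Duration of one TSC tick in nanoseconds for a given CPU clock in Hz. */
static double tsc_tick_ns(double cpu_hz)
{
    assert(cpu_hz > 0.0);
    return 1e9 / cpu_hz;
}

/* Years until a 64-bit cycle counter wraps at a given CPU clock in Hz. */
static double tsc_wrap_years(double cpu_hz)
{
    const double cycles = 18446744073709551616.0;   /* 2^64 */
    const double seconds_per_year = 365.25 * 86400.0;
    assert(cpu_hz > 0.0);
    return cycles / cpu_hz / seconds_per_year;
}
```

At 100 MHz one tick is 10 ns and the counter would take several
thousand years to wrap; even at ten times that clock rate it would
still take several centuries.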

Suggestion: We could use the 128 Hz CMOS interrupt for periodic tasks
like process preemption. This would free the 8253 timer 1 for an
interrupt-on-demand timer implementation with around one microsecond
precision. For high precision time measurement with gettimeofday()
or the POSIX equivalent clock_gettime(), the Pentium timer could be
used.

On 386 and 486 systems, the old system would have to be preserved,
because if the 8253 is no longer used for periodic interrupts, there
is no way to measure time with microsecond precision reliably. The
CMOS clock cannot be read with microsecond precision at any time,
and pre-Pentium Intel processors do not offer a 64-bit TSC.

Does all this sound reasonable?


Markus Kuhn, Computer Science student -- University of Erlangen,
Internet Mail: <> - Germany
WWW Home: <>