Re: poll() in 2.6 and beyond

From: Richard B. Johnson
Date: Wed Mar 03 2004 - 13:23:23 EST


On Tue, 2 Mar 2004, David Dillow wrote:

> On Tue, 2004-03-02 at 18:32, Richard B. Johnson wrote:
> > Yes. The code I attached earlier shows that the poll() in a driver
> > gets called (correctly), then it calls poll_wait(). Unfortunately
> > the call to poll_wait() returns immediately so that the return
> > value from the driver's poll() is whatever it was before some
> > event occurred that the driver was going to signal with
> > wake_up_interruptible().
>
> You've been handed a clue enough times now that you should understand
> that poll_wait() does not, and has never, put the process to sleep.
>
> If you can show a case where do_poll() returns stale data, then by all
> means do so. We will be happy to fix any such error in the kernel.
>
> You say do_poll() looses the status returned from your driver's poll
> method. If your driver is truly returning a nonzero status from the
> poll() method call, then a simple read of the code in do_pollfd() will
> show that the only way it looses information from that event mask is if
> your user space is not setting that event type in pollfd.events.
>
> If I were you, I'd check two things:
> 1) that your poll method is really returning a non-zero status when you
> think it is
> 2) that your user space program is really asking for all events you
> think it is
>
> I think you'll find your problem is not this well-used mechanism in the
> kernel.
>
> Dave

The very great problems that exist with poll on linux-2.6.0
are being quashed by those who just like to argue. Therefore,
I wrote some code that emulates the environment in which I
discovered the poll failure. Experts can decide whatever they
want about the inner workings of poll(). I supposed that if
`ps` showed that a task was sleeping in poll() then it must
be sleeping in poll(). So, even it that's wrong, here is
irrefutable proof that there is a problem with polling events
getting lost on 2.6.0. The attached code will execute without
errors in 2.4.24 and below.

Here is the test code running on Linux-2.4.24

Script started on Wed Mar 3 10:24:05 2004
# make
gcc -Wall -O2 -D__KERNEL__ -DMODULE -DMAJOR_NR=199 -I/usr/src/linux-2.4.24/include -c -o thingy.o thingy.c
as -o rtc_hwr.o rtc_hwr.S
ld -i -o driver.o thingy.o rtc_hwr.o
gcc -Wall -o tester -O2 tester.c
rm -f THING
mknod THING c 199 0
# insmod driver.o
# lsmod
Module Size Used by
driver 1928 0 (unused)
nfs 48008 0 (autoclean)
lockd 37308 0 (autoclean) [nfs]
sunrpc 66236 0 (autoclean) [nfs lockd]
vxidrvr 2748 0 (unused)
vximsg 5328 0 [vxidrvr]
ramdisk 5596 0 (unused)
ipchains 42204 0
nls_cp437 4376 4 (autoclean)
3c59x 28252 1
isofs 17368 0 (unused)
loop 8856 0
sr_mod 12188 0 (unused)
cdrom 27936 0 [sr_mod]
BusLogic 35832 7
sd_mod 10472 14
scsi_mod 56236 3 [sr_mod BusLogic sd_mod]
# ./tester

You have mail in /var/spool/mail/root
# exit
exit
Script done on Wed Mar 3 11:05:56 2004

This ran for about 1/2 hour ( 10:24 -> 11:05) with no
errors.

This is the same thing, running on this exact same machine, booted with
Linux-2.6.0-test8

Script started on Wed Mar 3 13:00:18 2004
# make
gcc -Wall -O2 -D__KERNEL__ -DMODULE -DMAJOR_NR=199 -I/usr/src/linux-2.6.0-test8/include -c -o thingy.o thingy.c
as -o rtc_hwr.o rtc_hwr.S
ld -i -o driver.o thingy.o rtc_hwr.o
gcc -Wall -o tester -O2 tester.c
rm -f THING
mknod THING c 199 0
# insmod driver.o
# ./tester
ERROR: (51 != 1) Lost events : 50
ERROR: (59371 != 59369) Lost events : 52
ERROR: (75501 != 75498) Lost events : 55
ERROR: (94511 != 94509) Lost events : 57
ERROR: (164660 != 164658) Lost events : 59
ERROR: (194867 != 194864) Lost events : 62
ERROR: (206892 != 206890) Lost events : 64
ERROR: (214232 != 214230) Lost events : 66
ERROR: (308340 != 308337) Lost events : 69
ERROR: (353062 != 353059) Lost events : 72
ERROR: (444541 != 444539) Lost events : 74
ERROR: (446516 != 446514) Lost events : 76
ERROR: (447906 != 447904) Lost events : 78

# exit
exit
Script done on Wed Mar 3 13:04:39 2004

It ran from 13:00 to 13:04, accumulating 78 errors.

A review of the test code shows that the driver uses IRQ8
from the RTC timer chip. The driver doesn't care if some
interrupt ticks are lost because it just increments a memory
variable every time that an interrupt occurs. After that variable
is incremented, the interrupt service routine sets a global
flag to POLLIN and sends a wake_up_interruptible() to whomever
is sleeping in poll(). Spin-locks are put around critical
sections to prevent undefined results.

In the user-mode code, the task sleeping in poll() reads
the variable using read(). The user-mode code doesn't
care what the value is, only that it must be one-greater than
the last one read. If it isn't, an error message is written
to standard error.

Observations:
If the call to mlockall() is removed in the test-code, the
behavior seems to be better, just the opposite of what one
would suspect.

The priority (nice value) doesn't seem to affect anything.

The code runs without any errors on 2.4.24, 2.4.22, 2.4.21, and
2.4.18. The only source I have on this machine for 2.6.x is
for 2.6.0-test8.


Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
Note 96.31% of all statistics are fiction.

Attachment: modules.tar.gz
Description: Binary data