2.1.127 "->timeout" bugs

Terence J. Wells (tjw30@cam.ac.uk)
Sun, 8 Nov 1998 16:58:30 +0000 (GMT)


I read linux-kernel by NNTP, so I apologise for not using a proper
reply/follow-up to the existing messages, but I felt I ought to speak
out before we get saddled with difficult-to-trace subtle bugs...

We're probably all aware that something like the following is needed
due to the task struct change:
- current->timeout = foo;
- schedule;
+ schedule_timeout(foo);

But I've noticed that various bits of the kernel used to rely on the
modified "->timeout" value after "schedule", in while loops and so on
(e.g. ftape).

So in some places a "long timeout" (or similar) is needed to store the
value returned by "schedule_timeout", so that the value is not lost
between invocations of "schedule_timeout".

So far so good -- but unfortunately "schedule" wasn't the only
function which used "->timeout": AFAICS "interruptible_sleep_on" (at
least) did also. "interruptible_sleep_on" now also has an appropriate
new version, "interruptible_sleep_on_timeout"

The trouble is, it's not being used... All the following files
contained references to "->timeout" and "interruptible_sleep_on"
before the task struct change, and now (2.1.127) _don't_ contain
"interruptible_sleep_on_timeout". In some cases this may be correct
(if "->timeout" was somehow being reset to 0 between calls in the old
version), but at least in ./net/sunrpc/sched.c, between lines 918 and
928, this is not the case. I've weeded out those files which had been
doing explicit resets (e.g. some of the sound drivers):

./arch/ppc/8xx_io/uart.c (implicit ->timeout reset)
./drivers/block/floppy.c (probably OK)
./drivers/block/md.ca (implicit ->timeout reset)
./drivers/block/swim3.c (implicit ->timeout reset)
./drivers/cdrom/cdu31a.c (not sure -- should be like mcdx?)
./drivers/cdrom/mcdx.c (not fixed yet)
./drivers/cdrom/sonycd535.c (not sure -- should be like mcdx?)
./drivers/char/cyclades.c (implicit ->timeout reset)
./drivers/char/epca.c (implicit ->timeout reset)
./drivers/char/esp.c (implicit ->timeout reset)
./drivers/char/istallion.c (implicit ->timeout reset)
./drivers/char/lp_m68k.c (implicit ->timeout reset)
./drivers/char/pcxx.c (implicit ->timeout reset)
./drivers/char/riscom8.c (implicit ->timeout reset)
./drivers/char/rocket.c (implicit ->timeout reset)
./drivers/char/serial.c (implicit ->timeout reset)
./drivers/char/serial167.c (implicit ->timeout reset)
./drivers/char/specialix.c (implicit ->timeout reset)
./drivers/char/stallion.c (implicit ->timeout reset)
./drivers/isdn/isdn_tty.c (implicit ->timeout reset)
./drivers/macintosh/macserial.c (implicit ->timeout reset)
./drivers/net/sktr.c (implicit ->timeout reset)
./drivers/sbus/char/sab82532.c (implicit ->timeout reset)
./drivers/sbus/char/zs.c (implicit ->timeout reset)
./drivers/sgi/char/sgiserial.c (implicit ->timeout reset)
./drivers/sound/dmasound.c (bad SLEEP macro?)
./include/linux/wrapper.h (still sets ->timeout?)
./net/sunrpc/sched.c (implicit ->timeout reset)

I have no idea how sensitive the code is to subtle changes in
behaviour such as this (I don't pretend to understand most of it
either), but it would perhaps be a good idea for the relevant
maintainers to check that the applied patches don't break the timing
logic -- and while they're at it, perhaps putting in "HZ" in
appropriate places would be a good idea too.

I hope someone feels this is useful and does something about it -- I
prefer not to submit patches when I have no idea what's actually going
on...

Regards,

TJW

--------------------------------------------------------------------
Terence J. Wells E-mail : tjw30@cam.ac.uk
St John's College Phone : +44.468.916617 (mobile)
Cambridge CB2 1TP Room : A3 New Court, SJC
United Kingdom Talk : tjw30@tjw30.joh.cam.ac.uk

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/