processes freezing between v2.1.130 and v2.2.1

Truxton Fulton (trux@truxton.com)
Thu, 4 Feb 1999 22:32:08 -0800 (PST)


Dear Linus et al,

There is a kernel peculiarity introduced after v2.1.130. I have been seeing
it happen with all the 2.2.0-pre releases, but it is intermittent enough
to make it hard to reproduce. What happens is that a process will block
for a long time. This happens with xload, xclock, netscape, and other
apps too.

I have two machines with identical filesystems. The problem does not
happen on my 96MB Pentium with local disks; it happens only on my 24MB
i486, which has all filesystems NFS-mounted (although it has a local
swap disk). High load average and swapping seem to trigger the bug.

Using strace on a stuck xload or xclock reveals a select() waiting on
fd 3 with a timeout of hundreds or sometimes thousands of seconds, even
though these programs were started with -update 8 and -update 1
respectively. I remember someone else on the mailing list puzzling at
netscape doing a similar select(). I can believe netscape has enough bugs
of its own to cause this bad behaviour, but what I find strange is that
other relatively simple programs like xload are doing the same thing, and
that reverting to linux v2.1.130 seems to cure the problem.

v2.2.1 contained some promising-sounding comments in the mm code about
fixing a scheduling problem, but alas v2.2.1 still has the same problem.
If I can assist in debugging this further, I will be glad to.

-Truxton

vaiva@endorphin(/tmp)>uname -a
Linux endorphin.truxton.com 2.2.1 #1 Sun Jan 31 15:37:49 PST 1999 i486 unknown
vaiva@endorphin(/tmp)>free -t
             total       used       free     shared    buffers     cached
Mem:         22628      22108        520       9908          4       8984
-/+ buffers/cache:      13120       9508
Swap:        83284      40756      42528
Total:      105912      62864      43048
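
As a quick sanity check on the numbers above, the following sketch recomputes the "-/+ buffers/cache" and "Total" rows from the "Mem" and "Swap" rows. The dictionary layout and variable names are illustrative assumptions and not part of the original report; all values are in kilobytes as printed by free.

```python
# Values copied from the free -t output above (kilobytes).
mem = {"total": 22628, "used": 22108, "free": 520,
       "shared": 9908, "buffers": 4, "cached": 8984}
swap = {"total": 83284, "used": 40756, "free": 42528}

# Memory used by applications once buffers and page cache are excluded.
used_minus_cache = mem["used"] - mem["buffers"] - mem["cached"]
# Memory that would be available if buffers and cache were dropped.
free_plus_cache = mem["free"] + mem["buffers"] + mem["cached"]

# The Total row is the column-wise sum of the Mem and Swap rows.
total = {key: mem[key] + swap[key] for key in ("total", "used", "free")}

print(used_minus_cache, free_plus_cache)
print(total)
```

The recomputed values match the printed rows. Roughly 13 MB of the 22 MB of RAM is held by processes and about half of the swap (40756 of 83284 kB) is in use, which is consistent with the swapping described in the report.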

[root@endorphin /root]# ps waux | egrep "xload|PID"
USER       PID %CPU %MEM  SIZE   RSS TTY STAT START   TIME COMMAND
root      1430  0.0  1.7  1132   400  p5 S    21:31   0:00 egrep xload |PID
vaiva      319  0.0  1.3  2464   308   1 S    23:03   0:03 xload -fn 6x10 -hl lightblue -geometry 250x50-106+0 -scale 2 -update 8
[root@endorphin /root]# strace -p 319
select(4, [3], [], [], {493, 860000} <unfinished ...>
[root@endorphin /root]#
[root@endorphin /root]# ps waux | egrep "xclock|PID"
USER       PID %CPU %MEM  SIZE   RSS TTY STAT START   TIME COMMAND
root      1436  0.0  1.7  1132   400  p5 S    21:32   0:00 egrep xclock |PID
vaiva      324  0.0  1.4  2412   332   1 S    23:03   0:01 xclock -fn 6x13 -geometry 100x100-0+0 -update 1
[root@endorphin /root]# strace -p 324
select(4, [3], [], [], {521, 30000} <unfinished ...>
[root@endorphin /root]# strace -p 324
select(4, [3], [], [], {504, 10000} <unfinished ...>
[root@endorphin /root]# strace -p 324
select(4, [3], [], [], {493, 800000} <unfinished ...>
[root@endorphin /root]# strace -p 324
select(4, [3], [], [], {479, 50000} <unfinished ...>
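
The timeout argument in each traced select() call is a {seconds, microseconds} pair. The sketch below converts the traced values to seconds and compares them with the -update interval each program was started with. The helper name select_timeout_seconds and the rule "a timeout longer than the update interval is suspicious" are assumptions made for illustration; they are not taken from strace or from the X clients themselves.

```python
import re

# Matches the {tv_sec, tv_usec} timeout in a traced select() line.
TIMEOUT_RE = re.compile(r"select\(.*\{(\d+),\s*(\d+)\}")


def select_timeout_seconds(line):
    """Return the select() timeout in seconds, or None if the line has none."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    seconds, microseconds = int(match.group(1)), int(match.group(2))
    return seconds + microseconds / 1_000_000


# (program, -update interval in seconds from the ps output, traced line)
samples = [
    ("xload", 8, "select(4, [3], [], [], {493, 860000} <unfinished ...>"),
    ("xclock", 1, "select(4, [3], [], [], {521, 30000} <unfinished ...>"),
    ("xclock", 1, "select(4, [3], [], [], {504, 10000} <unfinished ...>"),
    ("xclock", 1, "select(4, [3], [], [], {493, 800000} <unfinished ...>"),
    ("xclock", 1, "select(4, [3], [], [], {479, 50000} <unfinished ...>"),
]

for name, update_interval, line in samples:
    timeout = select_timeout_seconds(line)
    # Assumption: a timer-driven client should not ask to sleep for longer
    # than its own update interval.
    suspicious = timeout > update_interval
    print(f"{name}: timeout {timeout:.2f}s, update {update_interval}s, "
          f"suspicious={suspicious}")
```

Under that assumption every traced call is flagged: each requested timeout is several hundred seconds, against update intervals of 8 and 1 seconds.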
