2.1.116 -- No go ...

Steffen Luitz (luitz1@vxn48b.cern.ch)
Wed, 19 Aug 1998 22:32:41 +0200 (MET DST)


After booting 2.1.116 SMP on our cluster of eight dual-processor PII-266
machines (root file system on NFS, no screen, no keyboard, serial console
output only), two of them immediately locked up hard (no more network
access ...) with the following diagnostics repeating every few seconds
on the serial console:

<[c0113d83]> <[c0174f9e]> <[c0175080]> <[c014859b]> <[c011a065]>
<[c0114062]> <[c011fc74]>
wait_on_bh, CPU 1:
irq: 1 [1 0]
bh: 1 [0 0]

The trace translates into (hopefully I got it right, oh these hex
numbers!):

del_timer _rpc_wake_up _rpc_wake_up_task nfs_updatepage do_bottom_half
schedule generic_file_write
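
For anyone who wants to avoid doing the hex lookup by hand, here is a
minimal sketch of the translation step. It assumes a System.map-style
symbol table (one "address type name" entry per line) from the same
kernel build, and maps each trace address to the nearest symbol at or
below it. The symbol names and addresses in SAMPLE_MAP are hypothetical
placeholders, not taken from the 2.1.116 build above, and the helper
names (parse_system_map, resolve, resolve_trace) are made up for this
illustration.

```python
import bisect
import re


def parse_system_map(text):
    """Return a sorted list of (address, name) from System.map-style text."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Map an address to 'name+0xoffset' via the nearest symbol at or below it."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    base, name = symbols[i]
    return "%s+0x%x" % (name, address - base)


def resolve_trace(symbols, trace):
    """Resolve every <[xxxxxxxx]> entry found in a trace line."""
    hex_addrs = re.findall(r"<\[([0-9a-fA-F]{8})\]>", trace)
    return [resolve(symbols, int(h, 16)) for h in hex_addrs]


# Hypothetical symbol table, for illustration only.
SAMPLE_MAP = """\
c0100000 T example_first
c0100040 T example_second
c0100100 t example_third
"""

symbols = parse_system_map(SAMPLE_MAP)
print(resolve_trace(symbols, "<[c0100010]> <[c0100044]> <[c0100100]>"))
```

With the real System.map in place of SAMPLE_MAP, the same call can be
fed the trace lines from the serial console. The result is only
meaningful if the map matches the running kernel exactly.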

2.1.114-SMP did much better (ca. 1 crash / day / 8 machines under heavy
disk and network I/O)

Cheers

Steffen
