We just had 2.0.33 die on us. It wasn't a complete hang like people have
been reporting. It was pingable but would not launch new processes.
When I got to the box the console was logging "eth0: couldn't allocate a
sk_buff of size ...", along with a few "Couldn't get a free page"
messages. This sounds _very_ similar to what someone else reported. Of
course it has plenty of memory.
This was on 5 days of uptime under 2.0.33 + aic7xxx 5.0.7 + inode fix.
Previously we had 100+ days of uptime under 2.0.32 before we shut down for
a kernel/hardware upgrade. We now have a second SCSI card + an IDE disk.
None of this new equipment was in use at the time though.
The start of an oops was logged:
Mar 6 19:10:43 ferret kernel: general protection: 0000
Mar 6 19:10:43 ferret kernel: CPU: 0
Mar 6 19:10:44 ferret kernel: EIP: 0010:[sleep_on+96/124]
Mar 6 19:10:44 ferret kernel: EFLAGS: 00010002
Mar 6 19:10:44 ferret kernel: eax: 5bf4658d ebx: 03a7aeb8 ecx: 02ab8eb8 edx: 5bf4658d
Mar 6 19:54:03 ferret kernel: klogd 1.3-3, log source = /proc/kmsg started.
Not much, I'm sorry, but that's all I have. Additional wait-queue related
oopses were seen on the console.
I don't suspect hardware at this stage given recent discussions about
possible wait queue corruptions in 2.0.33, and the similar nature of this
crash to another one reported.
For now we are running on the same kernel. If troubles persist we will
drop to 2.0.32 again.