Re: 2.4.24 SMP lockups

From: Arkadiusz Miskiewicz
Date: Fri Jan 09 2004 - 17:21:43 EST


On Friday 09 of January 2004 22:04, Simon Kirby wrote:
> 'lo all,
>
> We've had about 6 cases of this now, across 4 separate boxes.
I had several such cases with 2.4.23 on two separate boxes. First is dual PIII
1GHz Intel SRMK2 platform -
http://www.intel.com/support/motherboards/server/srmk2/ with 1,5GB RAM,
reiserfs as filesystem, scsi disks and Adaptec AIC-7899P U160/m controller.

Second was UP PIII 500MHz machine on some Intel BX mainboard, 256MB RAM, ext3
as filesystem, software raid 5 on scsi disks using aic7xxx (Adaptec AIC-7892B
U160/m).

First one was locking up few times per day (pretty big load), second one maybe
once per day/two (lower load).

Both machines are working _fine_ with 2.4.21 kernel (one was also using 2.4.22
for some time and no problems occured).

kernels on both machines were exactly the same (just copied) but using
different modules - kernel was compiled using 2.95.4 (3+some parts from 2.95
branch of gcc cvs)

> These boxes are all dual CPU, and the failure case shows up suddenly with
> no warning. Sysreq-P works, but only reports from one CPU no matter how
> many times I try. In normal operation, every machine distributes all
> IRQs across both CPUs, and Sysreq-P reports from both CPUs.
Similar here - but sometimes even sysrq wasn't working (on second machine).

> Even on boxes with nmi_watchdog=1, nothing is reported from the NMI
> watchdog.
Exactly same here.
append=" console=tty0 console=ttyS0,9600n81 panic=60 nmi_watchdog=1"

I was thinking that maybe that's due to some problem in aic7xxx driver and
updated it on one machine to latest available version (these in kernel are
very old) but that didn't help.

> Simon-

--
Arkadiusz Miśkiewicz CS at FoE, Wroclaw University of Technology
arekm.pld-linux.org AM2-6BONE, 1024/3DB19BBD, arekm(at)ircnet, PLD/Linux
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/