Re: Strange problem with a Linux server

Henning P. Schmiedehausen (hps@tanstaafl.de)
10 Oct 1998 13:45:19 +0200


weis@math.hws.edu (Dominik Weis) writes:

>We had a Linux server that lost all the network connections and nobody
>was able to connect to it. I was not at the server when it happened, but
>another person rebooted the server and
>after it restarted it worked again. We did not change anything on the
>server for more than two weeks. It worked really fine. The strange part is
>I have nothing in the logs about errors and there were no error messages
>on the console.

Turn off the screen blanker and you will almost certainly (99%) see:

Deadlock detected by CPU#<insert your favourite CPU>
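
For reference, a minimal sketch of turning the blanker off from a script.
It assumes a Linux virtual console that honours the "ESC [ 9 ; n ]"
sequence described in console_codes(4), which is the same effect as running
"setterm -blank 0". The function names are illustrative and not part of
any existing tool.

```python
import sys

def blank_timeout_sequence(minutes=0):
    """Build the Linux console escape sequence that sets the blank timeout.

    A value of 0 disables blanking. The 0..60 range is an assumption
    borrowed from what setterm accepts.
    """
    if not 0 <= minutes <= 60:
        raise ValueError("blank timeout must be between 0 and 60 minutes")
    return "\033[9;%d]" % minutes

def disable_console_blanking(stream=sys.stdout):
    """Write the 'never blank' sequence to the given console stream."""
    stream.write(blank_timeout_sequence(0))
    stream.flush()
```

Calling disable_console_blanking() while logged in on the text console
(not in an xterm or over a remote login) should leave the last kernel
message visible after a hang.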

I tried to run a server with high network traffic and high SCSI traffic
on 2.0.x SMP (an all.all news server), and the machine crashed again and
again with your symptoms in less than three days.

(two Adaptec 2940 UW controllers, 512 MB RAM, 333 MHz Intel PII CPU,
three 9 gig UW SCSI disks, one 4 gig Ultra SCSI Disk, 100 MBit tulip
Ethernet running on a Cisco Catalyst 29xx switch)

The machine runs 2.0.36pre12 together with the new aic7xxx driver from
Doug Ledford (sp?).

I ripped out one of the CPUs and now:

1:29pm up 24 days, 23:37, 1 user, load average: 4.04, 3.66, 3.42

This is the machine in idle state. Under real load it goes well into
the twenties.
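
If you want to keep an eye on those numbers, here is a rough sketch that
pulls the three load averages out of an uptime line. It assumes the line
looks like the one quoted above; parse_load_averages is a made-up helper
name, not an existing utility.

```python
import re

# Matches the "load average: 4.04, 3.66, 3.42" tail of an uptime line.
_LOAD_RE = re.compile(
    r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)"
)

def parse_load_averages(uptime_line):
    """Return the 1, 5 and 15 minute load averages as a tuple of floats."""
    match = _LOAD_RE.search(uptime_line)
    if match is None:
        raise ValueError("no load averages found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())
```

Fed the uptime line above, it returns the three values 4.04, 3.66 and 3.42.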

Any further questions? Linux 2.0.x SMP is not yet ready for high
traffic, high load production use.

Remove one of your CPUs and you will have no more problems.

Kind regards
Henning

-- 
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen --             hps@tanstaafl.de
TANSTAAFL! Consulting - Unix, Internet, Security      

Hutweide 15            Fon.: 09131 / 50654-0     "There ain't no such
D-91054 Buckenhof      Fax.: 09131 / 50654-20     thing as a free Linux"

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/