vmstat: zero cs | system debug

From: Nico Schottelius
Date: Tue Nov 23 2004 - 08:38:53 EST


Hello!

Some minutes ago I had the following problem:

- a fileserver with many processes in 'D' state
- vmstat shows zero context switches
- more or less zero access to the system
- system load of ~150 (many smbd processes with 'D')
- raid looked fine
- no errors in dmesg
- no mysterious processes
- system not hacked (as it's in our company LAN)
- after rebooting it seems to work
- the initialization of the Adaptec AHA-2940U/UW/D / AIC-7881U took
quite a long time, but it works
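
For the many 'D' state processes in the list above, a minimal sketch of how
they could be listed together with the kernel function they sleep in. It
assumes a Linux /proc with per-process "stat" and "wchan" files; the function
names parse_stat and uninterruptible are made up for this illustration and
are not part of any tool.

```python
import os


def parse_stat(line):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of on whitespace.
    left, right = line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    return int(pid), comm, right.split()[0]


def uninterruptible(proc="/proc"):
    """Yield (pid, comm, wchan) for every process currently in state 'D'."""
    if not os.path.isdir(proc):
        return  # assumption: nothing to report on systems without /proc
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited while we were looking
        if state != "D":
            continue
        try:
            with open(os.path.join(proc, entry, "wchan")) as f:
                wchan = f.read().strip()
        except OSError:
            wchan = "?"  # wchan not readable or not provided by this kernel
        yield pid, comm, wchan


if __name__ == "__main__":
    for pid, comm, wchan in uninterruptible():
        print(pid, comm, wchan)
```

If most of the smbd processes report the same wait channel, that would at
least hint at which subsystem (disk, SCSI, filesystem, network) they are
stuck in.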

My questions:

- what could I have done to find out the problem?
-> I did ps axu, vmstat, cat /proc/mdstat, free, netstat -an
- do zero context switches mean that one process is using
the CPU completely? If so, why was I able to start vmstat?
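
To cross-check the context switch figure independently of vmstat, a rough
sketch that samples the cumulative "ctxt" counter in /proc/stat twice and
computes a rate. As far as I know this is the counter vmstat's "cs" column is
derived from; the helper names read_ctxt and ctxt_rate and the 5 second
interval are arbitrary choices for this example.

```python
import time


def read_ctxt(stat_text):
    """Return the cumulative context switch count from /proc/stat content."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no ctxt line found")


def ctxt_rate(before, after, seconds):
    """Context switches per second between two samples."""
    return (after - before) / seconds


def snapshot(path="/proc/stat"):
    with open(path) as f:
        return read_ctxt(f.read())


if __name__ == "__main__":
    interval = 5  # seconds between the two samples (arbitrary)
    try:
        first = snapshot()
        time.sleep(interval)
        second = snapshot()
        print("context switches/s:", ctxt_rate(first, second, interval))
    except OSError:
        print("/proc/stat is not available on this system")
```

A rate that really stays at zero while this script itself gets scheduled
would suggest a problem with the counter or its display rather than a fully
occupied CPU.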

- what to do to fix the problem?
-> tried killall -9 smbd apache ...

- is there some TFM for reading about "Linux system analyzing"?

Thanks for any answer,

Nico
