spinlock on Alpha ES40

From: Andrew Pochinsky (avp@honti.mit.edu)
Date: Wed Jun 21 2000 - 16:35:47 EST


Hi,

 I'm having a strange problem running SMP-enabled kernels on
Compaq's ES40. The problem appears on all kernels from 2.2.12 to
2.2.16 and seems to manifest itself when the system is reasonably
loaded. Here is the problem's manifestation:

From time to time, there is a burst of messages, both on the console
and in /var/log/messages, which look like this:

Jun 21 15:58:56 es40-001 kernel: fault.c:43 spinlock stuck in main2.x at fffffc000032a360(3) owner main4.x at fffffc000032a360(1) fault.c:43
Jun 21 15:58:56 es40-001 kernel: fault.c:43 spinlock grabbed in main2.x at fffffc000032a360(3) 1202 ticks
Jun 21 16:00:12 es40-001 kernel: fault.c:43 spinlock stuck in main1.x at fffffc000032a360(3) owner main3.x at fffffc000032a360(0) fault.c:43
Jun 21 16:00:12 es40-001 kernel: sched.c:30 spinlock stuck in sshd at fffffc000032d2e4(2) owner main3.x at fffffc000032a360(0) fault.c:43
Jun 21 16:00:12 es40-001 kernel: fault.c:43 spinlock grabbed in main1.x at fffffc000032a360(3) 875 ticks
Jun 21 16:00:12 es40-001 kernel: sched.c:30 spinlock grabbed in sshd at fffffc000032d2e4(2) 876 ticks
Jun 21 16:00:23 es40-001 kernel: fault.c:43 spinlock stuck in main1.x at fffffc000032a360(2) owner main3.x at fffffc000032a360(1) fault.c:43
Jun 21 16:00:23 es40-001 kernel: fault.c:43 spinlock stuck in main2.x at fffffc000032a360(3) owner main3.x at fffffc000032a360(1) fault.c:43
Jun 21 16:00:23 es40-001 kernel: fault.c:43 spinlock grabbed in main1.x at fffffc000032a360(2) 869 ticks
Jun 21 16:00:23 es40-001 kernel: fault.c:43 spinlock grabbed in main2.x at fffffc000032a360(3) 867 ticks
Jun 21 16:00:29 es40-001 kernel: fault.c:43 spinlock stuck in main4.x at fffffc000032a360(2) owner main1.x at fffffc000032a360(3) fault.c:43
Jun 21 16:00:29 es40-001 kernel: fault.c:43 spinlock grabbed in main4.x at fffffc000032a360(2) 837 ticks
Jun 21 16:01:27 es40-001 kernel: fault.c:43 spinlock stuck in main2.x at fffffc000032a360(3) owner main3.x at fffffc000032a360(2) fault.c:43
Jun 21 16:01:27 es40-001 kernel: fault.c:43 spinlock grabbed in main2.x at fffffc000032a360(3) 841 ticks
Jun 21 16:01:29 es40-001 kernel: fault.c:43 spinlock stuck in main1.x at fffffc000032a360(3) owner main4.x at fffffc000032a360(2) fault.c:43
Jun 21 16:01:29 es40-001 kernel: fault.c:43 spinlock grabbed in main1.x at fffffc000032a360(3) 847 ticks
Jun 21 16:01:41 es40-001 kernel: fault.c:43 spinlock stuck in main2.x at fffffc000032a360(1) owner main4.x at fffffc000032a360(0) fault.c:43
Jun 21 16:01:41 es40-001 kernel: fault.c:43 spinlock grabbed in main2.x at fffffc000032a360(1) 837 ticks
Jun 21 16:02:15 es40-001 kernel: fault.c:43 spinlock stuck in main2.x at fffffc000032a360(1) owner main4.x at fffffc000032a360(0) fault.c:43
Jun 21 16:02:15 es40-001 kernel: fault.c:43 spinlock grabbed in main2.x at fffffc000032a360(1) 837 ticks
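
To see which lock sites dominate a burst like the one above, the "spinlock stuck" lines can be tallied by waiting site and owning site. The following is only a rough sketch: the regular expression is inferred from the sample lines, the reading of the number in parentheses as a CPU number is an assumption, and the names STUCK and tally_stuck are hypothetical helpers, not part of any kernel tool.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpt; field meanings are assumptions:
#   <site> spinlock stuck in <task> at <addr>(<cpu?>) owner <task> at <addr>(<cpu?>) <owner site>
STUCK = re.compile(
    r"(?P<site>\S+:\d+) spinlock stuck in (?P<waiter>\S+) "
    r"at (?P<waddr>[0-9a-f]+)\((?P<wcpu>\d+)\) "
    r"owner (?P<owner>\S+) at (?P<oaddr>[0-9a-f]+)\((?P<ocpu>\d+)\) "
    r"(?P<osite>\S+:\d+)"
)


def tally_stuck(lines):
    """Count 'spinlock stuck' messages per (waiting site, owning site) pair."""
    counts = Counter()
    for line in lines:
        m = STUCK.search(line)
        if m:  # 'spinlock grabbed' lines and unrelated lines are skipped
            counts[(m["site"], m["osite"])] += 1
    return counts


# Three lines copied from the excerpt above.
sample = [
    "Jun 21 16:00:12 es40-001 kernel: fault.c:43 spinlock stuck in main1.x at fffffc000032a360(3) owner main3.x at fffffc000032a360(0) fault.c:43",
    "Jun 21 16:00:12 es40-001 kernel: sched.c:30 spinlock stuck in sshd at fffffc000032d2e4(2) owner main3.x at fffffc000032a360(0) fault.c:43",
    "Jun 21 16:00:12 es40-001 kernel: fault.c:43 spinlock grabbed in main1.x at fffffc000032a360(3) 875 ticks",
]

for (site, owner_site), n in tally_stuck(sample).items():
    print(f"{n} stuck: waiter at {site}, owner at {owner_site}")
```

Feeding it the whole of /var/log/messages instead of the sample list would show whether the fault.c:43 site is the owner in most of the reports.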

Sometimes the machine goes completely catatonic and has to be
reset. Less often, the system actually crashes. My estimate is that
this lockup happens once in about a fortnight; out of the 10 machines
we are running, there is somewhat less than one failure per day.
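
The two figures in the estimate above are consistent if the fortnight is read as the interval between lockups on a single machine. That reading is an assumption, and failures_per_day is a hypothetical helper used only to show the arithmetic.

```python
def failures_per_day(machines, days_between_lockups):
    """Expected fleet-wide failures per day, assuming each machine
    locks up independently about once every days_between_lockups days."""
    return machines / days_between_lockups


# 10 machines, roughly one lockup per machine per fortnight (14 days)
print(round(failures_per_day(10, 14), 2))
```

The result is about 0.7, in line with "somewhat less than one failure per day".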

Details about the machines: four 21264 CPUs at 666MHz, 1GB memory,
9GB SCSI disk on sym53c895, 2GB swap space. The problem seems to exist
only when SMP is enabled in the kernel. The spinlock gets stuck in
various places, including fault.c, sched.c, open.c, read_write.c, etc.

Has anyone seen the same problem, or does anyone have any suggestions?

Thanks,
--andrew

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/


