libata command timeout on AMD64 (VIA mobo)

From: Denis Vlasenko
Date: Tue May 24 2005 - 03:47:22 EST


Unfortunately, this happens on a box I don't have physical access to,
but I will try to supply as much info as I can.

An AMD64 box mysteriously dies from time to time.

Basically, problems start with:

"12:58:24 kernel: ata1: command 0x25 timeout, stat 0xd0 host_stat 0x1"

and after several occurrences of those,

"13:03:59 kernel: Slab corruption: start=c1949460, len=344"

appears. The last syslog message is timestamped 13:05:59. I couldn't ssh into
the box after this, but it still answers ping and TCP connections still get
established. I guess any process that touches the disk is stuck in D state forever.
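
For reference, here is a rough sketch of how the bytes in the timeout line
quoted above can be decoded. It assumes that "stat" is the ATA status register
and that "host_stat" is the bus-master DMA (BMDMA) status register. The bit
names follow the usual ATA/BMDMA definitions; the helper names (decode_bits,
decode_timeout_line) are made up for this illustration and are not part of
libata.

```python
import re

# ATA status register bits (assumption: "stat" is this register).
ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data
    0x02: "IDX",   # index
    0x01: "ERR",   # error
}

# Bus-master DMA status bits (assumption: "host_stat" is this register).
BMDMA_STATUS_BITS = {
    0x80: "SIMPLEX",
    0x40: "DRV1_DMA_CAPABLE",
    0x20: "DRV0_DMA_CAPABLE",
    0x04: "INTR",
    0x02: "ERR",
    0x01: "ACTIVE",
}

# A few ATA command opcodes, only for labelling the output.
ATA_COMMANDS = {
    0x25: "READ DMA EXT",
    0x35: "WRITE DMA EXT",
    0xC8: "READ DMA",
    0xCA: "WRITE DMA",
}

LINE_RE = re.compile(
    r"ata(\d+): command (0x[0-9a-fA-F]+) timeout, "
    r"stat (0x[0-9a-fA-F]+) host_stat (0x[0-9a-fA-F]+)"
)


def decode_bits(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table.items() if value & mask]


def decode_timeout_line(line):
    """Decode one 'command ... timeout' syslog line; None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    port = int(m.group(1))
    cmd, stat, host_stat = (int(g, 16) for g in m.groups()[1:])
    return {
        "port": port,
        "command": ATA_COMMANDS.get(cmd, "unknown"),
        "stat": decode_bits(stat, ATA_STATUS_BITS),
        "host_stat": decode_bits(host_stat, BMDMA_STATUS_BITS),
    }


if __name__ == "__main__":
    sample = "12:58:24 kernel: ata1: command 0x25 timeout, stat 0xd0 host_stat 0x1"
    print(decode_timeout_line(sample))
```

Under those assumptions, command 0x25 is READ DMA EXT, stat 0xd0 has BSY, DRDY
and DSC set, and host_stat 0x1 has only the ACTIVE bit set. That would mean the
drive was still busy and the DMA engine was still running when the timeout
fired. Note that while BSY is set, the other status bits are not guaranteed to
be meaningful.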

Attached tarball contains gory details.
--
vda

Attachment: report.tar.gz
Description: application/tgz