libata errors in 2.6.15.1 ICH6 AHCI (SATA drive WD740GD)

From: Kalin KOZHUHAROV
Date: Fri Jan 27 2006 - 00:14:39 EST


Hi there.

I am posting this again while still trying to diagnose the problem.
It is a DIY box with an Asus P5GDC-V Deluxe motherboard with a Marvell
88E8053 Gigabit Ethernet controller (for info see [1]) and a WD740GD
(10k RPM) hard disk.

The onboard NIC was not found by the in-kernel driver, so I used a patch to
the sk98lin binary driver and later tried sky2, both with intermittent
success. Now I have an r8169 NIC, have disabled the onboard one in the BIOS,
and have installed a new vanilla linux-2.6.15.1.

After some time (30 minutes to 3 days) the machine dies: first the disk
goes, then some partitions are remounted read-only by the kernel, and
finally everything is dead (no response to ping or keyboard).

What I get in the dmesg is this:
...
[ 23.464209] hub 5-0:1.0: USB hub found
[ 23.464221] hub 5-0:1.0: 8 ports detected
[ 25.819331] r8169: eth0: link up
[13091.397797] ata1: handling error/timeout
[13091.397805] ata1: port reset, p_is 0 is 0 pis 0 cmd 4017 tf d0 ss 113 se 0
[13091.397823] ata1: status=0x50 { DriveReady SeekComplete }
[13091.397828] sda: Current: sense key=0x0
[13091.397831] ASC=0x0 ASCQ=0x0
[13091.481534] ata1: port reset, p_is 40000001 is 1 pis 0 cmd 4017 tf 471 ss 113 se 0
[13091.481542] ata1: translated ATA stat/err 0x71/04 to SCSI SK/ASC/ASCQ 0xb/00/00
[13091.481544] ata1: status=0x71 { DriveReady DeviceFault SeekComplete Error }
[13091.481549] ata1: error=0x04 { DriveStatusError }
...
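
For reference, the register values in the two "port reset" lines can be
decoded by hand. Below is a minimal sketch in Python, not anything taken
from libata itself. It assumes that "tf" is the AHCI task file data value
(status in the low byte, error in the next byte) and that "ss" is the SATA
SStatus register. The function names are hypothetical; the status labels
reuse the ones printed in the log, and the remaining ones follow the usual
ATA bit meanings (BSY, DRQ, CORR, IDX).

```python
import re

# ATA Status register bits, most significant first.
# "DriveReady", "DeviceFault", "SeekComplete" and "Error" are the labels
# seen in the log; the other names are assumed from the usual ATA meanings.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]


def split_tf(tf):
    """Split a task file data value into (status, error) bytes."""
    return tf & 0xFF, (tf >> 8) & 0xFF


def decode_sstatus(ss):
    """Split SStatus into its DET, SPD and IPM 4-bit fields."""
    return {"det": ss & 0xF, "spd": (ss >> 4) & 0xF, "ipm": (ss >> 8) & 0xF}


def parse_port_reset(line):
    """Extract the hex fields of a 'port reset' log line as integers."""
    fields = re.findall(r"\b(p_is|is|pis|cmd|tf|ss|se) ([0-9a-f]+)", line)
    return {name: int(value, 16) for name, value in fields}


if __name__ == "__main__":
    line = ("ata1: port reset, p_is 40000001 is 1 pis 0 "
            "cmd 4017 tf 471 ss 113 se 0")
    regs = parse_port_reset(line)
    status, error = split_tf(regs["tf"])
    print(hex(status), decode_status(status), hex(error))
    print(decode_sstatus(regs["ss"]))
```

With the second reset line, "tf 471" splits into status 0x71 and error
0x04, which agrees with the "translated ATA stat/err 0x71/04" line that
follows it. In the first reset line, "tf d0" has the Busy bit set with no
error byte. "ss 113" decodes to DET=3, SPD=1, IPM=1, which as far as I can
tell means the link itself was still established at that point.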

The full dmesg can be found under [1] as 2.6.15.1-K01_P4_server.3.dmesg

I checked the drive (on the same machine) both with smartctl and with the
boot floppy I downloaded from the WD support site (Data Lifeguard Tools).
Neither reported anything bad (yes, I checked the status after the test).

The filesystem (reiserfs) gets a filesystem check on every boot, but so far
no corruption has occurred as far as I can see.

As always, the usual questions are:

What is the cause of this? Bug?

What can I do to better diagnose it?

Is any additional info helpful (see [1])?

Dmesg and other hardware info can be found here:
[1]: http://linux.tar.bz/reports/oopses/char/

Kalin.
--
|[ ~~~~~~~~~~~~~~~~~~~~~~ ]|
+-> http://ThinRope.net/ <-+
|[ ______________________ ]|
