Re: SATA 3112 errors on 2.6.7

From: Ricky Beam
Date: Sat Jun 19 2004 - 16:42:44 EST


On Sat, 19 Jun 2004, R. J. Wysocki wrote:
>Anyway, it looks like a pattern is forming which smells bad to me.

The pattern formed a long time ago... there are several web pages dedicated
to the failure that is SATA. (1.5GHz signal on an unshielded cable? WTF
were they thinking?)

>Apparently, we have:
>1) A serious error condition that occurs on Seagate SATA drives connected to
>Silicon Image controllers.

Not all drives, and not always Seagate. The Seagate drive that fails
most often (always?) is FW 3.05, while the other 3 are FW 3.18.

>2) As of today we can say that it only occurs on Seagate drives (Ricky, do I
>remember correctly that you see faulty behavior of such drives with a 3ware
>RAID?).

The 3ware card has 250G Maxtor drives. The drive that fails has different
firmware than the other three. It fails irrespective of which port it's
on. And the PowerMax diag fails recalibration 50% of the time.
(FW YAR51EW0 works -- 3 on the 3ware, and 2 in Dells; YAR51BW0 fails.)

>Afterwards, the drive blocks its SATA bus in a "busy" mode and cannot be
>accessed by any means (ie. hardware reset is necessary).

Actually, I think it's the controller that's borked. There's no way to
"hardware reset" a SATA drive without powering down the system, which I'm
not doing. And the same problems do not happen in Windows using Silicon
Image's driver. According to FreeBSD developers, Silicon Image's hardware
is, point blank, broken.

>4) The most "reliable" way to trigger this condition is to copy a lot of data
>(eg. 2 GB) to the drive in one shot.

Any sustained burst write will eventually lock up the channel. It can take
seconds or hours. It is as if a packet is being dropped during the DMA
transfer, or the DMA is completing without an ack.
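
For anyone who wants to try reproducing this, here is a minimal sketch of
such a sustained write, assuming Python is available on the test box. The
function name, file path, and chunk size are illustrative choices only, not
part of any tool mentioned in this thread. It writes in fixed-size chunks,
syncs after each one, and tracks the slowest chunk; on an affected channel
it would be expected to stall or fail with an I/O error at some point.

```python
import os
import time


def sustained_write(path, total_bytes, chunk_bytes=1 << 20):
    """Write total_bytes to path in chunk_bytes pieces, fsync after each.

    Hypothetical helper for illustration only.
    Returns (bytes_written, slowest_chunk_seconds).
    """
    chunk = b"\xa5" * chunk_bytes
    written = 0
    slowest = 0.0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            start = time.monotonic()
            view = memoryview(chunk)[:n]
            # os.write may write fewer bytes than requested, so loop.
            while len(view) > 0:
                done = os.write(fd, view)
                view = view[done:]
            # Push the data to the device rather than leaving it in cache.
            os.fsync(fd)
            slowest = max(slowest, time.monotonic() - start)
            written += n
    finally:
        os.close(fd)
    return written, slowest


if __name__ == "__main__":
    # Assumed path on the drive under test; 2 GB as in the report above.
    total, worst = sustained_write("/mnt/test/burst.bin", 2 * 1024 ** 3)
    print("wrote %d bytes, slowest chunk %.3f s" % (total, worst))
```

A chunk time far above the rest, or a run that never returns, would point
at the lockup; clean runs prove little, given it can take hours to trigger.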

--Ricky

