Re: SATA disks resets in a md setup

From: Tejun Heo
Date: Tue May 12 2009 - 04:25:17 EST


Vassilis Virvilis wrote:
> Ok, I changed the M/B, PSU and cables.
>
> Now the stress test survives only one SATA reset, instead of 3 or 4, before the fatal one.
>
>
> [ 1804.915319] ata1.01: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
> [ 1804.915319] ata1.01: ST-ATA: DRQ=1 with device error, dev_stat 0x0
> [ 1804.915319] ata1: SError: { PHYRdyChg }
> [ 1804.915319] ata1.01: cmd b0/d5:01:09:4f:c2/00:00:00:00:00/10 tag 0 pio 512 in
> [ 1804.915319] res 00/00:01:09:4f:c2/00:00:00:00:00/10 Emask 0x212 (ATA bus error)
> [ 1804.915319] ata1: hard resetting link
> [ 1810.279540] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
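
For reading the SErr value in the log above: the short names libata prints
in the "SError: { ... }" line correspond to individual bits of the SATA
SError register. A minimal sketch of that mapping follows; the helper name
decode_serror is made up for illustration and is not part of any kernel
tool.

```python
# Bit positions of the SATA SError register, labelled with the short names
# libata prints in its "SError: { ... }" line.
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all flags set in an SError value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items()) if value & (1 << bit)]
```

With the value from the log, decode_serror(0x10000) returns ['PHYRdyChg'],
i.e. only bit 16 is set.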

PHYRdyChg under load is very symptomatic of an inadequate power supply.
If you run "smartctl -a" on the device before and after the error,
which counters change?
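
One way to do the before/after comparison without eyeballing two reports
is sketched below. It assumes the usual "smartctl -a" attribute table
layout for ATA drives (ID#, ATTRIBUTE_NAME, FLAG, ..., RAW_VALUE); the
function names are made up for illustration, and running smartctl normally
needs root.

```python
import subprocess


def smartctl_report(device):
    """Run "smartctl -a" on device and return its text output."""
    # smartctl encodes status bits in its exit code, so a non-zero return
    # code is not treated as a failure here.
    result = subprocess.run(
        ["smartctl", "-a", device], capture_output=True, text=True
    )
    return result.stdout


def parse_attributes(report):
    """Map attribute name -> raw value string from the SMART attribute table."""
    attrs = {}
    for line in report.splitlines():
        fields = line.split()
        # Table rows start with a numeric ID and have at least 10 columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and fields[2].startswith("0x"):
            attrs[fields[1]] = " ".join(fields[9:])
    return attrs


def changed_attributes(before, after):
    """Return {name: (old_raw, new_raw)} for attributes whose raw value differs."""
    old, new = parse_attributes(before), parse_attributes(after)
    return {
        name: (old.get(name), new[name])
        for name in new
        if old.get(name) != new[name]
    }
```

Usage would be along the lines of: before = smartctl_report("/dev/sda"),
run the stress test until the reset shows up, after =
smartctl_report("/dev/sda"), then print changed_attributes(before, after).
The device path is a placeholder for whichever drive sits on the failing
port.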

If you have two PSUs around, one thing worth trying is to power up the
second PSU separately, put half of the drives on it, and see whether
the problem goes away or the pattern of failures changes. A PSU can
easily be powered up without a motherboard.

http://modtown.co.uk/mt/article2.php?id=psumod

--
tejun