Re: MD/RAID time out writing superblock

From: Chris Webb
Date: Wed Sep 16 2009 - 19:24:09 EST


Mark Lord <liml@xxxxxx> writes:

> I suspect we're missing some info from this specific failure.
> Looking back at Chris's earlier posting, the whole thing started
> with a FLUSH_CACHE_EXT failure. Once that happens, all bets are
> off on anything that follows.
>
> >Everything will be running fine when suddenly:
> >
> > ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> > ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
> > res 40/00:00:80:17:91/00:00:37:00:00/40 Emask 0x4 (timeout)
> > ata1.00: status: { DRDY }
> > ata1: hard resetting link
> > ata1: softreset failed (device not ready)
> > ata1: hard resetting link
> > ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata1.00: configured for UDMA/133
> > ata1: EH complete
> > end_request: I/O error, dev sda, sector 1465147272
> > md: super_written gets error=-5, uptodate=0
> > raid10: Disk failure on sda3, disabling device.
> > raid10: Operation continuing on 5 devices.

Hi Mark. Yes, the first timeout after a clean boot is on a 0xea
(FLUSH CACHE EXT) command every time:

[...]
ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata5.00: ATA-8: ST3750523AS, CC34, max UDMA/133
ata5.00: 1465149168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata5.00: configured for UDMA/133
scsi 4:0:0:0: Direct-Access ATA ST3750523AS CC34 PQ: 0 ANSI: 5
sd 4:0:0:0: [sde] 1465149168 512-byte hardware sectors: (750 GB/698 GiB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 4:0:0:0: [sde] 1465149168 512-byte hardware sectors: (750 GB/698 GiB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sde: sde1 sde2 sde3
sd 4:0:0:0: [sde] Attached SCSI disk
sd 4:0:0:0: Attached scsi generic sg4 type 0

[later]
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata5.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5: hard resetting link
ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata5.00: configured for UDMA/133
ata5: EH complete
sd 4:0:0:0: [sde] 1465149168 512-byte hardware sectors: (750 GB/698 GiB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
end_request: I/O error, dev sde, sector 1465147264
md: super_written gets error=-5, uptodate=0
raid10: Disk failure on sde3, disabling device.
raid10: Operation continuing on 4 devices.
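
For anyone wanting to pull the same facts out of a longer dmesg, here is a rough Python sketch that reads the first byte of the libata "cmd" line as the ATA opcode and reports how far each failed sector lies from the end of the disk. The function name summarize and the three regexes are my own illustration, not anything from libata or md. They only match the message wording shown in the log above, and other kernel versions may print these lines differently.

```python
import re

# Small subset of ATA command opcodes; extend as needed.
ATA_OPCODES = {
    0x25: "READ DMA EXT",
    0x35: "WRITE DMA EXT",
    0x60: "READ FPDMA QUEUED",
    0x61: "WRITE FPDMA QUEUED",
    0xE7: "FLUSH CACHE",
    0xEA: "FLUSH CACHE EXT",
}

# Hypothetical patterns, written against the log wording in this message only.
CMD_RE = re.compile(r"(ata\d+\.\d+): cmd ([0-9a-f]{2})/")
SIZE_RE = re.compile(r"\[(sd[a-z]+)\] (\d+) 512-byte hardware sectors")
ERR_RE = re.compile(r"I/O error, dev (sd[a-z]+), sector (\d+)")


def summarize(log):
    """Return (failed commands, failed sectors) found in a dmesg excerpt.

    commands: list of (port, command name)
    failures: list of (device, sector, sectors before end of disk or None)
    """
    commands = []
    sizes = {}
    failures = []
    for line in log.splitlines():
        m = CMD_RE.search(line)
        if m:
            opcode = int(m.group(2), 16)
            name = ATA_OPCODES.get(opcode, "unknown 0x%02x" % opcode)
            commands.append((m.group(1), name))
        m = SIZE_RE.search(line)
        if m:
            # Remember the whole-disk size in 512-byte sectors.
            sizes[m.group(1)] = int(m.group(2))
        m = ERR_RE.search(line)
        if m:
            dev, sector = m.group(1), int(m.group(2))
            size = sizes.get(dev)
            distance = size - sector if size is not None else None
            failures.append((dev, sector, distance))
    return commands, failures
```

Fed the sde excerpt above, this reports a FLUSH CACHE EXT on ata5.00 and a failed write at sector 1465147264, which is 1904 sectors (952 KiB) before the end of the 1465149168-sector disk.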

Best wishes,

Chris.