Re: RAID-5 design bug (or misfeature)

From: Alan Cox
Date: Mon May 30 2005 - 06:59:16 EST


On Llu, 2005-05-30 at 03:47, Mikulas Patocka wrote:
> > In article <Pine.LNX.4.58.0505300043540.5305@xxxxxxxxxxxxxxxxxxxxxxxx> you wrote:
> > > I think Linux should stop accessing all disks in RAID-5 array if two disks
> > > fail and not write "this array is dead" in superblocks on remaining disks,
> > > efficiently destroying the whole array.

It discovered the disks had failed because they had outstanding I/O that
failed to complete and errored. At that point your stripes *are*
inconsistent. If it didn't mark them as failed then you wouldn't know the
array was corrupted after a power restore. You can then clean it, fsck it,
restore it, and use mdadm as appropriate to restore the volume and check it.
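
To illustrate why a failed write leaves a stripe inconsistent, here is a
rough sketch of XOR parity in Python. It is a toy model, not the md driver's
code: the helper names (xor_blocks, stripe_is_consistent) and the 4-byte
blocks are made up for the example, and a real array rotates parity across
disks and works on much larger chunks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def stripe_is_consistent(data_blocks, parity):
    """A stripe is consistent when parity equals the XOR of its data blocks."""
    return xor_blocks(data_blocks) == parity

# Hypothetical stripe: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# With a consistent stripe, one missing block can be rebuilt from the rest.
rebuilt = xor_blocks([data[0], data[2], parity])

# Simulate a write that reached disk 0 while the parity update never completed.
torn = [b"ZZZZ", data[1], data[2]]
torn_consistent = stripe_is_consistent(torn, parity)

# Rebuilding block 1 from the torn stripe silently yields wrong data.
bad_rebuild = xor_blocks([torn[0], torn[2], parity])
```

In this model, rebuilt matches the original block, while the torn stripe
fails the consistency check and reconstructs garbage. Nothing in the data
itself flags that, which is why the failure has to be recorded somewhere.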

> But root disk might fail too... This way, the system can't be taken down
> by any single disk crash.

It only takes one disk in an array to short 12v and 5v due to a component
failure to total the entire disk array, and with both IDE and SCSI a
drive failure can hang the entire bus anyway.

Alan

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/