Re: FYI: RAID5 unusably unstable through 2.6.14

From: Neil Brown
Date: Wed Jan 18 2006 - 18:35:47 EST


On Wednesday January 18, lkml@xxxxxx wrote:
> Helge Hafting wrote:
> >
> > As others have shown - "mdadm" can reassemble your
> > broken raid - and it'll work well in those cases where
> > the underlying drives indeed are ok. It will fail
> > spectacularly if you have a real double fault though,
> > but then nothing short of raid-6 can save you.
>
> No, actually there are several things we *could* do,
> if only the will-to-do-so existed.

You need more than the will. You also need the ability and the time,
and all three must be combined in one person...

>
> For example, one bad sector on a drive doesn't mean that
> the entire drive has failed. It just means that one 512-byte
> chunk of the drive has failed.
>
> We could rewrite the failed area of the drive, allowing the
> onboard firmware to repair the fault internally, likely by
> remapping physical sectors. This is nothing unusual, as all
> drives these days ship from the factory with many bad sectors
> that have already been remapped to "fix" them. One or two
> more in the field is no reason to toss a perfectly good drive.

Very recent 2.6 kernels do exactly this. They don't drop a drive on a
read error, only on a write error. On a read error they generate the
data from elsewhere and schedule a write, then a re-read.
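
As a rough illustration of that read-error path, here is a minimal
sketch in Python. It is not the md driver's code. ToyDisk, xor_blocks
and read_with_repair are hypothetical names made up for this example,
parity is not rotated across disks, and marking the disk failed when
the re-read fails is an assumption of the sketch.

```python
from functools import reduce

SECTOR = 512  # bytes; the sector size mentioned in the quoted text


class ReadError(Exception):
    """Raised by a toy disk when a sector is unreadable."""


class ToyDisk:
    """Hypothetical in-memory disk; rewriting a bad sector 'remaps' it."""

    def __init__(self, sectors):
        self.sectors = list(sectors)
        self.bad = set()      # sector numbers that currently fail on read
        self.failed = False   # set when the array gives up on the disk

    def read(self, n):
        if n in self.bad:
            raise ReadError(n)
        return self.sectors[n]

    def write(self, n, data):
        self.sectors[n] = data
        self.bad.discard(n)   # models the firmware remapping on rewrite


def xor_blocks(blocks):
    """XOR equal-length byte strings together (RAID5 parity arithmetic)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def read_with_repair(disks, idx, n):
    """Read sector n of disks[idx]; on a read error, rebuild and rewrite."""
    try:
        return disks[idx].read(n)
    except ReadError:
        pass
    # Generate the data from elsewhere: XOR the same sector on every
    # other disk. A second bad disk raises ReadError here (double fault).
    data = xor_blocks([d.read(n) for i, d in enumerate(disks) if i != idx])
    disks[idx].write(n, data)            # schedule a write ...
    try:
        if disks[idx].read(n) != data:   # ... then a re-read
            raise ReadError(n)
    except ReadError:
        disks[idx].failed = True         # assumption: give up only now
    return data
```

With two data disks and one parity disk, marking a sector bad on one
disk and calling read_with_repair() returns the original data and
leaves the disk in the array. If the same sector is bad on two disks,
the ReadError propagates, which is the double-fault case described in
the quoted text.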

NeilBrown