Re: critical bugs in md raid5

From: pcg
Date: Thu Jan 27 2005 - 12:03:26 EST


On Thu, Jan 27, 2005 at 10:51:02AM +0100, Andi Kleen <ak@xxxxxx> wrote:
> The nasty part there is that it can affect completely unrelated
> data too (on a traditional disk you normally only lose the data
> that is currently being written) because of the relationship
> between stripes on different disks.

Sorry, I must be a bit dense at times. I understand that now: you meant
the case where parity is lost and you have an I/O error in other cases.
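
To make that failure mode concrete, here is a toy sketch in Python. It is
not md code: the three-disk stripe and the names xor_blocks, d0 and d1 are
made up for illustration. It assumes a crash between the data write and the
parity write, followed later by the loss of the other data disk, and shows
that the rebuilt block is wrong even though it was never being written.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, as RAID5 does for parity."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# One stripe on a hypothetical 3-disk array: two data blocks plus parity.
d0 = b"AAAA"
d1 = b"BBBB"  # unrelated data, never touched by the write below
parity = xor_blocks(d0, d1)

# A write to d0 reaches the disk, but the crash happens before the
# matching parity update, so parity still describes the old d0.
d0_new = b"CCCC"
stale_parity = parity

# Later the disk holding d1 fails and d1 is rebuilt from d0 and parity.
rebuilt_d1 = xor_blocks(d0_new, stale_parity)

# For comparison: with parity updated to match d0_new, the rebuild works.
consistent_d1 = xor_blocks(d0_new, xor_blocks(d0_new, d1))
```

With stale parity, rebuilt_d1 no longer matches d1, while the rebuild from
consistent parity does.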

> There were some suggestions in the past
> to be a bit nicer on read IO errors - often if a read fails and you rewrite
> the block from the reconstructed data the disk would allocate a new block
> and then be error free again.
>
> The problem is just that when there are user visible IO errors
> on a modern disk something is very wrong and it will likely run quickly out

Also, Linux already rewrites failed parity blocks automatically after a
crash, so whatever damage you think might be done to the disk will already
have been done on numerous occasions, as neither Linux in general nor the
raid driver in particular checks for bad blocks before rewriting (I'm not
suggesting that it should, just that Linux already rewrites failed blocks
if it doesn't know about them, and this hasn't been a particularly bad
problem).
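
For what it's worth, the "rewrite the block from the reconstructed data"
suggestion quoted above can be sketched as a toy model in Python. This is
only an illustration under stated assumptions, not what the md driver does:
ToyDisk and read_with_repair are hypothetical names, and the model simply
assumes that writing to a bad sector makes the drive remap it so that later
reads succeed.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, as RAID5 does for parity."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


class ToyDisk:
    """Hypothetical disk: reads of a bad sector fail until it is rewritten."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.bad = set()

    def read(self, n: int) -> bytes:
        if n in self.bad:
            raise OSError(f"read error on sector {n}")
        return self.blocks[n]

    def write(self, n: int, data: bytes) -> None:
        # Assumption: the drive remaps the sector on write, clearing the error.
        self.bad.discard(n)
        self.blocks[n] = data


def read_with_repair(disks, idx: int, n: int) -> bytes:
    """Read block n from disks[idx]; on a read error, rebuild the block
    from the other disks in the stripe and write it back."""
    try:
        return disks[idx].read(n)
    except OSError:
        others = [d.read(n) for i, d in enumerate(disks) if i != idx]
        data = xor_blocks(*others)
        disks[idx].write(n, data)
        return data


# A single stripe: two data disks and one parity disk.
d0 = ToyDisk([b"AAAA"])
d1 = ToyDisk([b"BBBB"])
p = ToyDisk([xor_blocks(b"AAAA", b"BBBB")])
disks = [d0, d1, p]

d1.bad.add(0)                        # simulate a read error on d1
result = read_with_repair(disks, 1, 0)
```

The sketch only covers a single failed read with an otherwise consistent
stripe; if parity is stale, the block written back would be wrong.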

--
The choice of a
-----==- _GNU_
----==-- _ generation Marc Lehmann
---==---(_)__ __ ____ __ pcg@xxxxxxxx
--==---/ / _ \/ // /\ \/ / http://schmorp.de/
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE