Re: [testcase] test your fs/storage stack (was Re: [patch] ext2/3: document conditions when reliable operation is possible)

From: Rob Landley
Date: Wed Sep 02 2009 - 18:45:47 EST


On Wednesday 02 September 2009 15:12:10 Pavel Machek wrote:
> > (2) RAID5 protects you against a single failure and your test case
> > purposely injects a double failure.
>
> Most people would be surprised that press of reset button is 'failure'
> in this context.

Apparently because most people haven't read Documentation/md.txt:

Boot time assembly of degraded/dirty arrays
-------------------------------------------

If a raid5 or raid6 array is both dirty and degraded, it could have
undetectable data corruption. This is because the fact that it is
'dirty' means that the parity cannot be trusted, and the fact that it
is degraded means that some datablocks are missing and cannot reliably
be reconstructed (due to no parity).

And so on for several more paragraphs. Perhaps the documentation needs to be
extended to note that "journaling will not help here, because the lost data
blocks render entire stripes unreconstructable"...
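
For anyone who wants the failure mode spelled out, here is a toy sketch of a
single stripe on a three-disk raid5: two data blocks plus XOR parity. It is
not the md driver's code. The names and block contents are made up for
illustration, and it assumes the simplest possible case, where a reset lands
between the data write and the parity update.

```python
# Toy model of one RAID5 stripe: two data blocks plus XOR parity.
# Hypothetical names and contents; this is not how md is implemented.

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

# Clean stripe: parity is consistent with both data blocks.
d0 = b"AAAA"
d1 = b"BBBB"
parity = xor(d0, d1)

# With consistent parity, a lost d1 can be rebuilt from the survivors.
clean_rebuild = xor(d0, parity)

# A write replaces d0, but the machine is reset before parity is rewritten.
# The stripe is now "dirty": parity still describes the old d0.
d0 = b"CCCC"

# Now the disk holding d1 drops out ("degraded"), so d1 is rebuilt
# from the new d0 and the stale parity.
rebuilt_d1 = xor(d0, parity)

print(clean_rebuild == b"BBBB")  # True
print(rebuilt_d1 == b"BBBB")     # False
```

Note that d1 was never the target of a write, so a filesystem journal has no
record that would lead it to replay or distrust that block, and the rebuild
itself raises no error. That is the "undetectable" part.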

Hmmm, I'll take a stab at it. (I'm not addressing the raid 0 issues brought
up elsewhere in this thread because I don't comfortably understand the current
state of play...)

Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds