Re: [testcase] test your fs/storage stack (was Re: [patch] ext2/3:document conditions when reliable operation is possible)

From: Ric Wheeler
Date: Fri Sep 04 2009 - 07:50:04 EST


On 09/04/2009 03:44 AM, Rob Landley wrote:
On Thursday 03 September 2009 09:14:43 jim owens wrote:
Rob Landley wrote:
I think he understands he was clueless too, that's why he investigated
the failure and wrote it up for posterity.

And Ric said do not stigmatize whole classes of A) devices, B) raid,
and C) filesystems with "Pavel says...".
I don't care what "Pavel says", so you can leave the ad hominem at the
door, thanks.
See, this is exactly the problem we have with all the proposed
documentation. The reader (you) did not get what the writer (me)
was trying to say. That does not say either of us was wrong in
what we thought was meant, simply that we did not communicate.
That's why I've mostly stopped bothering with this thread. I could respond to
Ric Wheeler's latest (what do write barriers have to do with whether or not
a multi-sector stripe is guaranteed to be atomically updated during a panic or
power failure?) but there's just no point.

The point of that post was that the failure that you and Pavel both attribute to RAID and journalled fs happens whenever the storage cannot promise to do atomic writes of a logical FS block (prevent torn pages/split writes/etc). I gave a specific example of why this happens even with simple, single disk systems.

Further, if you have the write cache enabled on your local SATA/SAS drives and do not have working barriers (as is the case with MD RAID5/6), you have a hard promise of data loss on power outage, and these split writes are not going to be the cause of your issues.

You can verify this by testing. Or, try to find people that do storage and file systems that you would listen to and ask.
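
A minimal sketch of what such a test could look like, for readers who want to try it. It is not the test case referred to in the subject line, and everything in it is an assumption for illustration: the 512-byte sector size, the 4096-byte logical block size, and the helper names make_block, write_generation and find_torn_blocks. Each sector of a block is stamped with the same generation number; a block whose sectors disagree after a crash was only partly written.

```python
import os
import struct

SECTOR = 512   # assumed device sector size
BLOCK = 4096   # assumed logical FS block size (8 sectors)


def make_block(generation):
    """Build one logical block; every sector starts with the generation stamp."""
    stamp = struct.pack("<Q", generation)
    sector = stamp + b"\0" * (SECTOR - len(stamp))
    return sector * (BLOCK // SECTOR)


def write_generation(path, generation, nblocks):
    """Overwrite nblocks blocks in place, then ask the kernel to flush them."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for i in range(nblocks):
            os.pwrite(fd, make_block(generation), i * BLOCK)
        # fsync only reaches stable media if flushes/barriers actually work
        # all the way down the stack, which is what is being tested.
        os.fsync(fd)
    finally:
        os.close(fd)


def find_torn_blocks(path):
    """Return indices of blocks whose sectors carry different generations."""
    torn = []
    with open(path, "rb") as f:
        index = 0
        while True:
            block = f.read(BLOCK)
            if len(block) < BLOCK:
                break
            stamps = {
                struct.unpack_from("<Q", block, off)[0]
                for off in range(0, BLOCK, SECTOR)
            }
            if len(stamps) > 1:
                torn.append(index)
            index += 1
    return torn
```

One way to use it: call write_generation in a loop with an increasing generation number on the storage under test, cut power mid-run, and run find_torn_blocks after reboot. Any index it reports is a block that mixes old and new sectors. A clean result from one run proves little; the test has to be repeated many times.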
The LWN article on the topic is out, and incomplete as it is I expect it's the
best documentation anybody will actually _read_.

Rob
