Re: raid is dangerous but that's secret (was Re: [patch] ext2/3:

From: George Spelvin
Date: Tue Sep 01 2009 - 07:18:17 EST

Next message: Metzger, Markus T: "[discuss] BTS overflow handling, was: [PATCH] perf_counter: Fix arace on perf_counter_ctx"
Previous message: Bart Van Assche: "Re: [PATCH] SCSI driver for VMware's virtual HBA."
In reply to: Pavel Machek: "Re: raid is dangerous but that's secret (was Re: [patch] ext2/3:"
Next in thread: NeilBrown: "Re: raid is dangerous but that's secret (was Re: [patch] ext2/3:"

>> An embedded checksum, no matter how good, can't tell you if
>> the data is stale; you need a way to distinguish versions in the pointer.

> I would disagree with that.
> If the embedded checksum is a function of both the data and the address
> of the data (in whatever address space seems most appropriate) then it can
> still verify that the data found with the checksum is the data that was
> expected.
> And storing the checksum with the data (where it is practical) means
> index blocks can be more dense so on average fewer accesses to storage
> are needed.

I must not have been clear. Originally, block 100 has contents version 1.
This includes a correctly computed checksum.

Then you write version 2 of the data there. But there's a bit error in
the address and the write goes to block 256+100 = 356. So block
100 still has the version 1 contents, complete with valid checksum.
(Yes, block 356 is now corrupted, but perhaps it's not even allocated.)

Then we go to read block 100, find a valid checksum, and return incorrect
data: the version 1 contents, when we expect and want version 2.

Basically, the pointer has to say which *version* of the data it points to,
not just the block address. Otherwise, it can't detect a missing write.
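
To make the scenario concrete, here is a minimal Python sketch of it.
Everything in it is an assumption made for illustration: the Disk class,
the fault_mask parameter (which models the bit error in the address), the
two exception names, and the choice of CRC32 over address plus data as the
embedded checksum. It is not how any real filesystem or RAID layer is
implemented.

```python
import zlib


class ChecksumError(Exception):
    """Stored checksum does not match the data at this address."""


class StaleBlockError(Exception):
    """Block is self-consistent but is not the version the pointer expects."""


def checksum(addr: int, data: bytes) -> int:
    # Embedded checksum as a function of both the address and the data.
    return zlib.crc32(addr.to_bytes(8, "little") + data)


class Disk:
    """Hypothetical block store: addr -> (data, version, checksum)."""

    def __init__(self):
        self.blocks = {}

    def write(self, addr, data, version, fault_mask=0):
        # The checksum is computed for the intended address; fault_mask
        # flips address bits on the way to the medium (the bit error).
        csum = checksum(addr, data)
        self.blocks[addr ^ fault_mask] = (data, version, csum)

    def read(self, addr, expected_version=None):
        data, version, csum = self.blocks[addr]
        if checksum(addr, data) != csum:
            raise ChecksumError(addr)
        # Only a pointer that carries a version can notice a missing write.
        if expected_version is not None and version != expected_version:
            raise StaleBlockError(addr)
        return data


disk = Disk()
disk.write(100, b"version 1", version=1)
# Bit 8 of the address flips, so this write lands in block 356 instead.
disk.write(100, b"version 2", version=2, fault_mask=256)

# Address-only pointer: the checksum is valid and stale data comes back.
print(disk.read(100))
```

Reading block 100 with expected_version=2 raises StaleBlockError, and
reading block 356 raises ChecksumError because its checksum was computed
for address 100. So the address-keyed checksum does catch the misplaced
copy, but it cannot catch the stale one.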

If density is a big issue, then including a small version field is a
possibility.
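
As a rough sketch of what a small version field buys, assume a
hypothetical 8-bit counter stored in both the pointer and the block and
incremented on every rewrite. The width and the helper names below are
my own choices for illustration. The trade-off is that such a counter
wraps around, so a missing write goes unnoticed only if the number of
lost writes is an exact multiple of 256.

```python
VERSION_BITS = 8                 # hypothetical field width
VERSION_MOD = 1 << VERSION_BITS  # counter wraps at 256


def bump(version: int) -> int:
    # Advance the small version counter, wrapping at the field width.
    return (version + 1) % VERSION_MOD


def looks_current(pointer_version: int, block_version: int) -> bool:
    # The check a reader would make after following the pointer.
    return pointer_version == block_version


def pointer_after_lost_writes(block_version: int, lost: int) -> int:
    # The pointer's version advances on each write; the block on disk
    # never receives any of them.
    version = block_version
    for _ in range(lost):
        version = bump(version)
    return version


on_disk = 41
print(looks_current(pointer_after_lost_writes(on_disk, 1), on_disk))    # False
print(looks_current(pointer_after_lost_writes(on_disk, 256), on_disk))  # True
```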
