I'm not an ext2 fs expert, but here's my 2 cents on automatically validating
disks: always keep two bitmaps on disk, both ending (this is important) with
some sort of counter. The bitmap with the higher counter value is the
'currently active' one. When the bitmap changes, increase the counter and
write the new bitmap to the blocks where the second (currently inactive)
bitmap is located. Because the counter is written last, you can always tell
which bitmap is the valid (current) one by checking the counter values: if
the write is interrupted before the counter is written, the first bitmap
stays the valid one.
You can apply this approach to almost anything else you have. You just need
two copies of 'whatever you have', each ending with a counter. Overwrite the
copy that has the lower counter value, then write its new counter. If the
counter is written successfully, that copy is now your current 'whatever you
have'; otherwise the other one stays the current one.
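
A minimal sketch of the two-copy scheme, simulated on an in-memory bytearray
rather than a real disk. The names (SLOT_SIZE, read_slot, active_slot, commit)
and the 4-byte counter are my own assumptions for illustration, and the sketch
assumes the counter itself is written atomically:

```python
import struct

SLOT_SIZE = 16                     # hypothetical payload size per copy
COUNTER = struct.Struct("<I")      # counter stored right after the payload
RECORD = SLOT_SIZE + COUNTER.size  # one copy = payload + trailing counter


def read_slot(disk, slot):
    """Return (payload, counter) for copy 0 or copy 1."""
    base = slot * RECORD
    payload = bytes(disk[base:base + SLOT_SIZE])
    (counter,) = COUNTER.unpack_from(disk, base + SLOT_SIZE)
    return payload, counter


def active_slot(disk):
    """The copy with the higher counter is the current one (ties go to 0)."""
    return 0 if read_slot(disk, 0)[1] >= read_slot(disk, 1)[1] else 1


def commit(disk, payload, fail_before_counter=False):
    """Write payload into the inactive copy, then stamp its counter last."""
    if len(payload) != SLOT_SIZE:
        raise ValueError("payload must be exactly SLOT_SIZE bytes")
    current = active_slot(disk)
    target = 1 - current
    new_counter = read_slot(disk, current)[1] + 1
    base = target * RECORD
    disk[base:base + SLOT_SIZE] = payload          # step 1: the data
    if fail_before_counter:
        return                                     # simulated power loss
    COUNTER.pack_into(disk, base + SLOT_SIZE, new_counter)  # step 2: counter


disk = bytearray(2 * RECORD)
commit(disk, b"A" * SLOT_SIZE)                            # copy 1 becomes current
commit(disk, b"B" * SLOT_SIZE, fail_before_counter=True)  # interrupted update
assert read_slot(disk, active_slot(disk))[0] == b"A" * SLOT_SIZE
```

The interrupted update scribbles over the inactive copy only, so the reader
still picks the copy holding the last completed write.
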
The counter need only take 3 values, ordered cyclically (1 > 0, 2 > 1 and,
after the wrap, 0 > 2), so 2 bits suffice.
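
The cyclic comparison might look like this. It is only a sketch; the function
names are made up, and it relies on the two counters never being more than
one step apart, which holds if every update stamps 'current + 1' onto the
inactive copy:

```python
MODULUS = 3  # the counter cycles 0 -> 1 -> 2 -> 0


def next_counter(c):
    """Counter value to stamp on the copy being written."""
    return (c + 1) % MODULUS


def is_newer(a, b):
    """True if a is exactly one step ahead of b in the cycle."""
    return a == next_counter(b)


def active_copy(counter0, counter1):
    """Index of the current copy; copy 0 wins when neither is ahead."""
    return 1 if is_newer(counter1, counter0) else 0


assert active_copy(0, 1) == 1   # 1 > 0
assert active_copy(2, 1) == 0   # 2 > 1
assert active_copy(2, 0) == 1   # wrap: 0 > 2
```

With only two values the comparison would be ambiguous (each value is 'one
step ahead' of the other), which is why a third value is needed.
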
This approach presumes that only one process is doing the updates.
> > Has anyone thought about this very much? If so, is there a mailing list or
> > archive that I can browse?
>
> I have been - its important for embedded Linux boxes. Im stuffed right now
> because I have no real way of forcing that down to disk order of writes. In
> fact if my tests are right then IDE drives are themselves re-ordering my I/O
> requests sometimes.
...which gets in the way of the algorithm described above, since it depends
on the counter reaching the disk after the data it covers.
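
From user space, the closest thing to forcing the order is probably a flush
between the two writes. A rough sketch, assuming a Unix system where
os.pwrite and os.fsync are available (write_then_stamp is a made-up name).
Note that fsync only asks the kernel to push the data out; if the drive
itself reorders or caches writes, as described above, this alone may not be
enough:

```python
import os
import struct
import tempfile


def write_then_stamp(fd, offset, payload, counter):
    """Write the payload, flush, then write the trailing counter and flush."""
    os.pwrite(fd, payload, offset)
    os.fsync(fd)   # ask for the payload to be on disk before the counter
    os.pwrite(fd, struct.pack("<I", counter), offset + len(payload))
    os.fsync(fd)   # ask for the counter to be on disk before returning


# Usage on a scratch file standing in for the device.
with tempfile.TemporaryFile() as scratch:
    write_then_stamp(scratch.fileno(), 0, b"bitmap..", 1)
```
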
> The other problem is turning off an IDE drive during a write can create
> permanent bad blocks. (Take an old 40Mb drive and yank its power a few times)
> so an fsck or cleanup has to do some kind of remap around those.
I think the proper way is for permanent bad blocks to be discovered and
remapped at access time.
Andrej