Re: writing file to disk: not as easy as it looks

From: Mikulas Patocka
Date: Wed Dec 03 2008 - 10:50:24 EST


> Yes. fsync() seeems surprisingly high on Rusty's list of broken
> interfaces classification ('impossible to use correctly').
>
> I wonder if some reasonable solution exists? Mark filesystem as failed
> on first write error is one of those (and default for ext2/3?). Did
> SGI/big unixen solve this somehow?

When OS/2 hit write error, it wrote to another location on disk and added
this to its sector remap table. It could remap both metadata and data this
way. But today it is meaningless, because the same algorithm is
implemented in disk firmware. Write errors are reported for disk
connection problems, not media problems.

For connection problems, another solution may be to retry writes
indefinitely until the admin aborts it or reconnects the disk. But I don't
know how common these recoverable disk connection errors are.

> > The reality is that most applications don't proper error checking, and
> > even fewer actually call fsync(), so if you are putting your root
> > filesytem on a 32G flash card, and it pops out easily due to hardware
> > design issues, the question of whether fsync() gets properly progated
> > to all potentially interested applications is the ***least*** of your
> > worries.

If you are running a transaction processing software, then it is a very
important worry. All the database software is written with the assumption
that when the database returns transaction committed, then the changes are
permanent.

Most of the business software can deal with the fact that the server
crashes, but can't deal with the fact that database returnes committed
status for transaction that wasn't really committed.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/