Re: Ext4 and the "30 second window of death"

From: Andreas T.Auer
Date: Tue Mar 31 2009 - 19:37:31 EST




On 01.04.2009 00:02 Alberto Gonzalez wrote:
> In fact, thinking about it, this option would be the ideal one for desktops
> and especially laptops (servers running databases are a different thing). What
> we need is that _no_ application uses fsync. The decision as to when the data
> should be written to disk should be left to the filesystem. And then the user
> can choose how often they want this to happen (every 5, 15, 30, 60...
> seconds). So if Ext4 could have a "nofsync" mount option that would disable
> fsync from applications (i.e, it wouldn't honor an fsync call), that would be
> wonderful. But then of course we have to make sure that if the kernel crashes
> (or there's a power-off, etc..), we will just lose the new data that hasn't
> been written to disk, but the old data will still be there. So maybe this
> could be achieved with mounting the filesystem with nofsync, nodelalloc?
>
>
You are always thinking about the few seconds/minutes of work you are
going to lose, but there are other situations, too.

E.g. your POP3 client receives a very important mail, saves it to disk,
uses fsync to make sure it is out and tells the server to delete it. If
you delay the fsync, you will have a long window in which the mail can
get lost instead of a minimal one. Or are there any POP3 clients that
can synchronize the mail polling with a spinning disk?
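
To make the ordering concrete, here is a minimal sketch of that
save-then-delete sequence. It is only an illustration under stated
assumptions: the `server` object with `retrieve()` and `delete()` is
hypothetical (it is not a real POP3 library API), the file naming is made
up, and the directory fsync assumes a POSIX system.

```python
import os
import tempfile


def save_durably(directory, name, data):
    """Write data to directory/name and return once it has been fsynced."""
    # Write to a temporary file first so a crash never leaves a half-written
    # file under the final name.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the kernel
            os.fsync(f.fileno())  # ask the kernel to push it to the disk
        final_path = os.path.join(directory, name)
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory as well so the rename itself is on disk (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return final_path


def fetch_and_delete(server, msg_id, maildir):
    """Delete the message on the server only after the local copy is synced."""
    data = server.retrieve(msg_id)  # hypothetical server API
    path = save_durably(maildir, f"{msg_id}.eml", data)
    server.delete(msg_id)           # reached only if the save did not raise
    return path
```

If fsync were ignored or delayed, `server.delete()` would still run right
away, and the only copy of the mail would sit in memory until the next
writeback.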

There are tasks that are not very important and should not spin up the
disk, and there are tasks that had better do so. Which tasks should or
should not spin up the disk is the preference of the user, but the
application developer has to decide globally whether or not to use
fsync(), and the filesystem can't distinguish the tasks at all, except
by whether it receives fsyncs or not.

So fine-tuning the system to the ideal disk-writing policy is really
problematic, especially given how many different people are turning knobs:
- different filesystem developers using different methods and default
behaviors, which can be changed by distros and sysadmins.
- different applications using or not using fsync() and other
methods to get the best policy for any kind of fs. Or the developers
are incompetent enough to expect features from the filesystem that are
not always given, whether trained by ext3 data=ordered, trained by
reiserfs, or simply lacking any better fs knowledge.
- different users having different preferences about which data is how
important, but usually they cannot change the fsync policy of the
applications.
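
As a rough sketch of what a user-adjustable fsync policy inside an
application could look like: the policy table, the task names and the
function below are all hypothetical and only illustrate the idea; no
application mentioned in this thread is claimed to work this way.

```python
import os

# Hypothetical per-task policy that a user could edit: True means "spin up
# the disk and sync now", False means "leave the timing to the filesystem".
SYNC_POLICY = {"mail": True, "browser-cache": False}


def write_for_task(task, path, data, policy=SYNC_POLICY):
    """Write data and fsync only if the policy marks the task as important.

    Returns True if fsync was called, False otherwise.
    """
    want_sync = policy.get(task, True)  # unknown tasks default to syncing
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                 # hand the data to the kernel
        if want_sync:
            os.fsync(f.fileno())  # force it out to the disk
    return want_sync
```

The filesystem still only sees "fsync or no fsync", but here the decision
comes from the user's table instead of being fixed by the developer.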

Andreas
