Re: imapd and synchronous writes

sct@dcs.ed.ac.uk
Mon, 18 Mar 96 18:11 GMT


Hi,

On Thu, 14 Mar 1996 12:40:37 -0500 (EST), John Gardiner Myers
<jgm+@cmu.edu> said:

> sct@dcs.ed.ac.uk writes:
>> You really can't just blindly assume synchronous directory updates.
>> Even on systems using ffs, where directories are updated synchronously
>> (currently), it is not a wise assumption, for things may change in the
>> future.

> What other option do we application developers have? Until Linux came
> along, directory updates had always been committed before the call
> returned.

Not true. FreeBSD has had the option for async (unordered) metadata
writes for some time, and ext2fs has the option for sync metadata
updates too. Modern EIDE drives with write-behind can also screw the
O/S's sync write ordering. And finally, NFS has NEVER made any
guarantees like this --- if you lose a request or ack packet for a
dirent operation under NFS, chaos can break loose and your application
cannot determine, from the return status of the system calls,
precisely what state the directory is in (although, admittedly, a
success return does normally indicate a hardware commit --- but that's
only half the problem).

> There are no facilities provided to applications to let
> them specify that a directory update needs to be committed to disk.

Not true. { int fd = open(".", O_RDONLY, 0); int rc = fsync(fd); close(fd); }

> "Sorry, you're just screwed" is not an acceptable answer.

I never said it was. If you really want that behaviour, ext2fs gives
you three ways to request it: by filesystem default, by per-directory
attribute, or explicitly on demand by the application.

> "Ulrich Windl" <Ulrich.Windl@rz.uni-regensburg.de> writes:
>> It seems that we'll need something like "opendir(..., O_SYNC)", or at

> Does that help rename(), link(), unlink(), symlink() calls? Those
> calls don't supply the DIR returned by opendir()

Ulrich's point is that open/fsync gives the application flexibility to
request a sync on any given directory. The application programmer can
decide whether or not to do that after the rename etc. It's worth
remembering that even on a sync-metadata ffs, rename() is not
guaranteed to be atomic, and you can be left with both the old and the
new dirents present after a crash.
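
As a rough sketch of doing the sync after the rename: the helper names
sync_dir() and rename_and_sync() below are made up for illustration, and
the code assumes both names live in the same directory and that the
system lets you fsync() a descriptor opened on a directory (Linux does;
others may not).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open a directory read-only and fsync() it.
 * Returns 0 on success, -1 on failure with errno preserved. */
static int sync_dir(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = fsync(fd);
    int saved = errno;      /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return rc;
}

/* Hypothetical helper: rename 'from' to 'to', then ask for the
 * containing directory 'dir' to be committed. Assumes both names
 * are entries of 'dir'. */
static int rename_and_sync(const char *from, const char *to, const char *dir)
{
    if (rename(from, to) != 0)
        return -1;
    return sync_dir(dir);
}
```

A caller would use it as rename_and_sync("tmp.123", "msg.123", ".").
If the call fails after the rename itself succeeded, the rename has
still happened; only the request to commit it failed.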

> Doesn't help applications. Do you really expect sysadmins to create a
> separate partition, mounted synchronously, for their mail spool?

No, but they can easily "chattr -R +S /var/spool/mail". If you mount
an ffs partition on /var/spool with delayed writes enabled, you have
a ffs partition on /var/spool with delayed writes enabled, you have
exactly the same problem. That comes down to a broken installation.
If you need the guarantee, you need to disable deferred metadata
writes. If you don't want to have to think about it, set sync updates
system wide via mount options.

> "Theodore Ts'o" <tytso@MIT.EDU> writes:
>> Actually, there is, but it's not portable. If you open the directory
>> using open, and then call fsync on the resulting file descriptor, you
>> will forcibly commit the directory change. This is *not* guaranteed to
>> work on all POSIX systems, and indeed it may not work on many. But it
>> will work under Linux.

> How is an application, written to compile on a broad range of unix
> systems, to know it has to take this particular set of steps?

It can try and see. The attempt is guaranteed to have no harmful side
effects. And there is always sync(), which is universally
implemented on Unix (although it is not in POSIX).
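
A minimal sketch of "try and see", with the fallback: commit_dir() is a
made-up name, and the return convention (1 for fsync on the directory,
0 for the sync() fallback) is just my choice for the example. Bear in
mind that on many systems sync() only schedules the writes and may
return before they have reached the disk.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to fsync() the directory at 'path'.
 * Returns 1 if the directory fsync succeeded, 0 if we had to
 * fall back to a system-wide sync(). */
static int commit_dir(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd >= 0) {
        int rc = fsync(fd);
        close(fd);
        if (rc == 0)
            return 1;       /* the directory fsync was accepted */
    }

    /* open() or fsync() was refused on this system: fall back. */
    sync();
    return 0;
}
```

An application can call this after each directory operation it cares
about, and can log the return value once at startup to find out which
path it is getting on the system it was built for.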

*ANY* implicit assumption by an application that every directory
manipulation is atomic over crashes is non-portable. These days,
there are simply too many cases where the old assumptions don't apply.
The problem is not restricted to Linux; at least Linux gives you a
choice. You simply can't pretend that the problem isn't there. The
real deficiency is the lack of any defined semantics in Unix/POSIX,
and the lack of any standard way for an application to request a
certain level of service with regard to directories.

Cheers,
Stephen.

--
Stephen Tweedie <sct@dcs.ed.ac.uk>
Department of Computer Science, Edinburgh University, Scotland.