Re: [RFC] ext3: per-process soft-syncing data=ordered mode

From: Valdis . Kletnieks
Date: Thu Jan 24 2008 - 16:59:19 EST


On Thu, 24 Jan 2008 23:36:00 +0300, Al Boldi said:
> data=ordered mode has proven reliable over the years, and it does this by
> ordering filedata flushes before metadata flushes. But this sometimes
> causes contention in the order of a 10x slowdown for certain apps, either
> due to the misuse of fsync or due to inherent behaviour like db's, as well
> as inherent starvation issues exposed by the data=ordered mode.

If they're misusing it, they should be fixed. There should be a limit to
how much the kernel will do to reduce the pain of doing stupid things.

> This RFC proposes to introduce a tunable which allows to disable fsync and
> changes ordered into writeback writeout on a per-process basis like this:

Well-written programs only call fsync() when they really do need the semantics
of fsync. Disabling that is just *asking* for trouble.

>From rfc2821:

6.1 Reliable Delivery and Replies by Email

When the receiver-SMTP accepts a piece of mail (by sending a "250 OK"
message in response to DATA), it is accepting responsibility for
delivering or relaying the message. It must take this responsibility
seriously. It MUST NOT lose the message for frivolous reasons, such
as because the host later crashes or because of a predictable
resource shortage.

Some people really *do* think "the CPU took a machine check and after replacing
the motherboard, the resulting fsck ate the file" is a "frivolous" reason to
lose data.

But if you want to give them enough rope to shoot themselves in the foot with,
I'd suggest abusing LD_PRELOAD to replace the fsync() glibc code instead. No
need to clutter the kernel with rope that can be (and has been) done in userspace.

Attachment: pgp00000.pgp
Description: PGP signature