Re: [patch] sys_sync livelock fix

From: Daniel Phillips (phillips@bonn-fries.net)
Date: Wed Feb 13 2002 - 10:11:51 EST


On February 13, 2002 04:46 am, Jeff Garzik wrote:
> Bill Davidsen wrote:
> >
> > On Wed, 13 Feb 2002, Alan Cox wrote:
> >
> > > > > Whats wrong with sync not terminating when there is permenantly I/O left ?
> > > > > Its seems preferably to suprise data loss
> > > >
> > > > Hard call. What do we *want* sync to do?
> > >
> > > I'd rather not change the 2.4 behaviour - just in case. For 2.5 I really
> > > have no opinion either way if SuS doesn't mind
> >
> > Alan, I think you have this one wrong, although SuS seems to have it wrong
> > as well, and if Linux did what SuS said there would be no problem.
> >
> > - What SuS seems to say is that all dirty buffers will queued for physical
> > write. I think if we did that the livelock would disappear, but data
> > integrity might suffer.
> > - sync() could be followed by write() at the very next dispatch, and it
> > was never intended to be the last call after which no writes would be
> > done. It is a point in time.
> > - the most common use of sync() is to flush data write to all files of the
> > current process. If there was a better way to do it which was portable,
> > sync() would be called less. I doubt there are processes which alluse
> > that no write will be done after sync() returns.
> > - since sync() can't promise "no new writes" why try to make it do so? It
> > should mean "write current sirty buffers" and that's far more than SuS
> > requires.
> >
> > I don't think benchmarks are generally important, but in this case the
> > benchmark reveals that we have been implementing a system call in a way
> > which not only does more than SuS requires, but more than the user
> > expects. To leave it trying to do even more than that seems to have no
> > benefit and a high (possible) cost.
>
> Yow, your message inspired me to re-read SuSv2 and indeed confirm,
> sync(2) schedules I/O but can return before completion,

I think that's just stupid and we have a duty to fix it, it's an anachronism.
The _natural_ expectation for the user is that sync means 'don't come back
until data is on disk' and any other interpretation is just apologizing for
halfway implementations.

Sync should mean: come back after all filesystem transactions submitted before
the sync are recorded, such that if the contents of RAM vanishes the data can
still be retrieved.

For dumb filesystems, this can degenerate to 'just try to write all the dirty
blocks', the traditional Linux interpretation, but for journalling filesystems
we can do the job properly.

> while fsync(2) schedules I/O and waits for completion.

Yes, right.

> So we need to implement system call checkpoint(2) ? schedule I/O,
> introduce an I/O barrier, then sleep until that I/O barrier and all I/O
> scheduled before it occurs.

How about adding: sync --old-broken-way

On this topic, it would make a lot of sense from the user's point of view to
have a way of syncing a single volume, how would we express that?

-- 
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Feb 15 2002 - 21:00:55 EST