Re: [PATCH] Remove process freezer from suspend to RAM pathway

From: Nigel Cunningham
Date: Thu Jul 05 2007 - 17:49:20 EST


Good morning!

On Thursday 05 July 2007 23:59:57 Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 15:36, Nigel Cunningham wrote:
> > On Thursday 05 July 2007 23:35:45 Rafael J. Wysocki wrote:
> > > On Thursday, 5 July 2007 14:38, Nigel Cunningham wrote:
> > > > On Thursday 05 July 2007 22:25:06 Rafael J. Wysocki wrote:
> > > > > On Thursday, 5 July 2007 01:45, Pavel Machek wrote:
> > > > > > On Tue 2007-07-03 21:32:20, Oliver Neukum wrote:
> > > > > > > Am Dienstag, 3. Juli 2007 schrieb Miklos Szeredi:
> > > > > > > > > And a further question. The freezer is not atomic. What do
you
> > do
> > > > > > > > > if a task not yet frozen calls sys_sync(), but fuse is
already
> > > > frozen?
> > > > > > > >
> > > > > > > > What do you do if a task not yet frozen writes to a pipe, on
the
> > other
> > > > > > > > end of which is a task already frozen?
> > > > > >
> > > > > > There's some difference between uninterruptible and interruptible
> > > > > > sleep I'd say.
> > > > > >
> > > > > > > > It doesn't matter. The only thing that should matter during
> > suspend
> > > > > > > > (not hibernate) is saving the state of devices to ram, and
putting
> > the
> > > > > > > > devices to sleep.
> > > > > > >
> > > > > > > Well, but you did remove sys_sync() from the freezer, which is
> > > > > > > and must be called in the hibernate path.
> > > > > >
> > > > > > Not "must". In fact, hibernation should be safe without
sys_sync(). It
> > > > > > is just user un-friendly.
> > > > >
> > > > > In fact, I'd like to remove the sys_sync() from the freezer
entirely,
> > > > because
> > > > > it just doesn't belong in there.
> > > > >
> > > > > The only advantege of having sys_sync() in freeze_processes() is
that we
> > > > > have a chance to write out everything when applications cannot
produce
> > more
> > > > > data to write, but there are filesystems which don't do that anyway
(eg.
> > > > XFS),
> > > > > so generally there's no reason to bother.
> > > >
> > > > Shouldn't XFS - and fuse - be considered to be broken? Sync should
sync
> > data
> > > > and if XFS isn't doing that, it's wrong.
> > > >
> > > > In the case of fuse, we should have a mechanism by which fuse
processes
> > can be
> > > > made to sync if they do have any pending I/O, and by which they can be
> > frozen
> > > > later than other userspace processes.
> > > >
> > > > I'd like to see the sync stay, because it improves reliability and
data
> > > > integrity in the fail-to-resume case. Calling scripts would probably
> > invoke
> > > > sync themselves if they don't already, but that's racy. As it is at
the
> > > > moment, we know userspace is stopped, so syncing isn't racy.
> > >
> > > I'd like to move the sync out of the freezer, but to call it from the
> > > suspend/hibernation code, so that we do
> > >
> > > sys_sync();
> > > error = freeze_processes();
> >
> > Yeah, I understand that. The problem then is that you're racing against
> > userspace. That's not usually a problem, but that doesn't mean it's never
a
> > problem. Try running the stress suite while testing hibernating and you'll
> > see what I mean. If something is submitting lots of I/O when you try to
> > suspend, your sync call will race against that process if it's not yet
> > frozen, and its continued activity will make your sync pointless (there'll
be
> > more unsynced data when you sys_sync call finishes). Stopping userspace
> > before syncing removes that race.
>
> Yes, that will make the suspend/hibernation less reliable in case the resume
> fails (some data, written after the sync, may be lost). However, the sync
done
> from within the freezer doesn't guarantee that there are no data lost
anyway,
> so we don't lose much by not doing it.
>
> Now, there's a question how much data may be lost, potentially, if we do the
> sync before the freezer and I don't think that's a lot.

You're missing the point. I'm arguing that a sync from within the freezer
should guarantee that there is no data loss. As I said about, XFS should be
fixed to properly sync its data, and something should be done about fuse
filesystems too.

Regards,

Nigel
--
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.

Attachment: pgp00000.pgp
Description: PGP signature