Re: [PATCH v8 0/5] fs: multigrain timestamps for XFS's change_cookie

From: Jeff Layton
Date: Mon Sep 25 2023 - 06:14:18 EST


On Mon, 2023-09-25 at 08:18 +1000, Dave Chinner wrote:
> On Sat, Sep 23, 2023 at 05:52:36PM +0300, Amir Goldstein wrote:
> > On Sat, Sep 23, 2023 at 1:46 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > >
> > > On Sat, 2023-09-23 at 10:15 +0300, Amir Goldstein wrote:
> > > > On Fri, Sep 22, 2023 at 8:15 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > >
> > > > > My initial goal was to implement multigrain timestamps on most major
> > > > > filesystems, so we could present them to userland, and use them for
> > > > > NFSv3, etc.
> > > > >
> > > > > With the current implementation however, we can't guarantee that a file
> > > > > with a coarse grained timestamp modified after one with a fine grained
> > > > > timestamp will always appear to have a later value. This could confuse
> > > > > some programs like make, rsync, find, etc. that depend on strict
> > > > > ordering requirements for timestamps.
> > > > >
> > > > > The goal of this version is more modest: fix XFS' change attribute.
> > > > > XFS's change attribute is bumped on atime updates in addition to other
> > > > > deliberate changes. This makes it unsuitable for export via nfsd.
> > > > >
> > > > > Jan Kara suggested keeping this functionality internal-only for now and
> > > > > plumbing the fine grained timestamps through getattr [1]. This set takes
> > > > > a slightly different approach and has XFS use the fine-grained attr to
> > > > > fake up STATX_CHANGE_COOKIE in its getattr routine itself.
> > > > >
> > > > > While we keep fine-grained timestamps in struct inode, when presenting
> > > > > the timestamps via getattr, we truncate them at a granularity of number
> > > > > of ns per jiffy,
> > > >
> > > > That's not good, because an mtime that the user explicitly set at
> > > > fine granularity would be truncated too, and booting kernels with
> > > > different HZ values would change the observed timestamps of files.
> > > >
> > >
> > > Thinking about this some more, I think the first problem is easily
> > > addressable:
> > >
> > > The ctime isn't explicitly settable and with this set, we're already not
> > > truncating the atime. We haven't used any of the extra bits in the mtime
> > > yet, so we could just carve out a flag in there that says "this mtime
> > > was explicitly set and shouldn't be truncated before presentation".
> > >
> >
> > I thought about this option too.
> > But note that the "mtime was explicitly set" flag needs
> > to be persisted to disk so you cannot store it in the high nsec bits.
> > At least XFS won't store those bits if you use them - they have to
> > be translated to an XFS inode flag and I don't know if changing
> > XFS on-disk format was on your wish list.
>
> Remember: this multi-grain timestamp thing was an idea to solve the
> NFS change attribute problem without requiring *any* filesystem with
> sub-jiffy timestamp capability to change their on-disk format to
> implement a persistent change attribute that matches the new
> requirements of the kernel nfsd.
>
> If we now need to change the on-disk format to support
> some whacky new timestamp semantic to do this, then people have
> completely lost sight of what problem the multi-grain timestamp idea
> was supposed to address.
>

Yep. The main impetus for all of this was to fix XFS's change attribute
without requiring an on-disk format change. If we have to rev the
on-disk format, we're probably better off plumbing in a proper i_version
counter and tossing this idea aside.

That said, I think all we'd need for this scheme is a single flag per
inode (to indicate that the mtime shouldn't be truncated before
presentation). If that's possible to do without fully revving the inode
format, then we could still pursue this. I take it that's probably not
the case though.
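
To make the flag idea concrete, here's a rough userspace sketch of the
presentation-time logic. It is not code from the patchset: struct
demo_inode, struct demo_ts, mtime_explicit, truncate_to_jiffy() and
present_mtime() are all made-up names for illustration, and HZ is passed
in as a plain parameter. It also says nothing about how the flag would
be persisted, which is the actual sticking point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for the relevant inode fields (not struct inode). */
struct demo_inode {
	int64_t mtime_sec;
	long    mtime_nsec;     /* fine-grained value kept in memory */
	bool    mtime_explicit; /* hypothetical "explicitly set" flag */
};

/* Hypothetical timestamp as it would be reported to userland. */
struct demo_ts {
	int64_t sec;
	long    nsec;
};

/* Round nsec down to a multiple of one jiffy, i.e. NSEC_PER_SEC / hz. */
static long truncate_to_jiffy(long nsec, long hz)
{
	assert(hz > 0 && hz <= NSEC_PER_SEC);
	long gran = NSEC_PER_SEC / hz;

	return nsec - (nsec % gran);
}

/*
 * Build the mtime to present via getattr: explicitly set values pass
 * through untouched, everything else is truncated to jiffy granularity.
 */
static struct demo_ts present_mtime(const struct demo_inode *inode, long hz)
{
	struct demo_ts ts = { inode->mtime_sec, inode->mtime_nsec };

	if (!inode->mtime_explicit)
		ts.nsec = truncate_to_jiffy(ts.nsec, hz);
	return ts;
}
```

With an in-memory nsec value of 123456789, the unflagged case presents
120000000 at HZ=250 but 123000000 at HZ=1000, which is the HZ-dependence
Amir pointed out. The flagged case presents 123456789 either way.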
--
Jeff Layton <jlayton@xxxxxxxxxx>