[RFC PATCH 0/3][RESEND] fs: opportunistic high-res file timestamps

From: Jeff Layton
Date: Tue Apr 11 2023 - 10:37:15 EST


(Apologies for the resend, but I didn't send this to a wide enough
distribution list originally.)

A few weeks ago, during one of the discussions around i_version, Dave
Chinner wrote this:

"You've missed the part where I suggested lifting the "nfsd sampled
i_version" state into an inode state flag rather than hiding it in
the i_version field. At that point, we could optimise away the
secondary ctime updates just like you are proposing we do with the
i_version updates. Further, we could also use that state to
decide whether we need to use high resolution timestamps when
recording ctime updates - if the nfsd has not sampled the
ctime/i_version, we don't need high res timestamps to be recorded
for ctime...."

While I don't think we can practically optimize away ctime updates
like we do with i_version, I do like the idea of using this scheme to
indicate when we need to use a high-res timestamp.

This patchset is a first stab at a scheme to do this. It declares a new
i_state flag for this purpose and adds two new vfs-layer functions to
implement conditional high-res timestamp fetching. It then converts both
tmpfs and xfs to use it.
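
A rough userspace model of the idea may help here. This is not the code
from these patches: the struct, the function names and the 4 ms tick are
all made up for illustration, and the "queried" bool merely stands in for
the new i_state flag. The guard against stepping backwards is an extra
assumption added to keep the model sane. The caller passes in the current
fine-grained time, so no real clock is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Assumed coarse clock granularity: one 4 ms tick. Illustrative only. */
#define MODEL_TICK_NSEC 4000000L

/* Hypothetical stand-in for an inode; "queried" plays the role of the flag. */
struct model_inode {
	struct timespec ctime;
	bool queried;
};

/* Round a fine-grained time down to the last tick, like a coarse clock. */
struct timespec model_coarsen(struct timespec fine)
{
	assert(fine.tv_nsec >= 0 && fine.tv_nsec < 1000000000L);
	fine.tv_nsec -= fine.tv_nsec % MODEL_TICK_NSEC;
	return fine;
}

/* Return true if a is strictly later than b. */
bool model_ts_after(struct timespec a, struct timespec b)
{
	if (a.tv_sec != b.tv_sec)
		return a.tv_sec > b.tv_sec;
	return a.tv_nsec > b.tv_nsec;
}

/* getattr side: report the timestamp and remember that it has been seen. */
struct timespec model_getattr(struct model_inode *inode)
{
	inode->queried = true;
	return inode->ctime;
}

/* Update side: "now" is the current fine-grained time. */
void model_update_ctime(struct model_inode *inode, struct timespec now)
{
	/* Only pay for full resolution if someone looked since the last update. */
	struct timespec ts = inode->queried ? now : model_coarsen(now);

	/* Assumption: never step back past a fine value stored in this tick. */
	if (model_ts_after(ts, inode->ctime))
		inode->ctime = ts;
	inode->queried = false;
}
```

In this model, a run of updates with no intervening getattr only ever
stores tick-granularity values, while the first update after a getattr
stores the full-resolution time, so an observer who sampled the old
timestamp sees it change.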

This seems to behave fine under xfstests, but I haven't yet done
any performance testing with it. I wouldn't expect it to create huge
regressions though, since we only grab a high-res timestamp on the
first update after the timestamps have been queried.

I like this scheme because we can potentially convert any filesystem to
use it. There are no special storage requirements, unlike with the
i_version field. I think it'd potentially improve NFS cache coherency
with a whole swath of exportable filesystems, and help out NFSv3 too.

This is really just a proof-of-concept. There are a number of things we
could change:

1/ We could use the top bit in the tv_sec field as the flag. That'd give
us different flags for ctime and mtime. We also wouldn't need to use
a spinlock.

2/ We could probably optimize away the high-res timestamp fetch in more
cases. Basically, always do a coarse-grained ts fetch and only fetch
the high-res ts when the QUERIED flag is set and the existing time
hasn't changed.
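
Option 2/ could be modelled roughly as below. This is a userspace sketch
with hypothetical names throughout (fake_clock, opt_inode and friends are
not kernel interfaces), and it reads "the existing time hasn't changed"
as "the coarse value would not advance the stored timestamp", which is an
interpretation rather than something the text above spells out. The fake
clock counts fine-grained fetches so the saving is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Assumed coarse clock granularity: one 4 ms tick. Illustrative only. */
#define OPT_TICK_NSEC 4000000L

/* Hypothetical clock that records how often the expensive read is used. */
struct fake_clock {
	struct timespec now;
	unsigned int fine_fetches;
};

/* Hypothetical stand-in for an inode carrying the QUERIED state. */
struct opt_inode {
	struct timespec ctime;
	bool queried;
};

/* Cheap read: the current time rounded down to the last tick. */
struct timespec fake_coarse(const struct fake_clock *clk)
{
	struct timespec ts = clk->now;

	assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
	ts.tv_nsec -= ts.tv_nsec % OPT_TICK_NSEC;
	return ts;
}

/* Expensive read: full resolution, counted. */
struct timespec fake_fine(struct fake_clock *clk)
{
	clk->fine_fetches++;
	return clk->now;
}

/* Return true if a is strictly later than b. */
bool opt_ts_after(struct timespec a, struct timespec b)
{
	if (a.tv_sec != b.tv_sec)
		return a.tv_sec > b.tv_sec;
	return a.tv_nsec > b.tv_nsec;
}

void opt_update_ctime(struct opt_inode *inode, struct fake_clock *clk)
{
	struct timespec coarse = fake_coarse(clk);

	if (opt_ts_after(coarse, inode->ctime))
		inode->ctime = coarse;		/* coarse clock moved on: enough */
	else if (inode->queried)
		inode->ctime = fake_fine(clk);	/* seen, and coarse can't show a change */
	/* otherwise: nobody has seen the stored value, so leave it alone */
	inode->queried = false;
}
```

With this shape, the fine-grained fetch only happens when a queried
timestamp would otherwise look unchanged within the same tick.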

If this approach looks reasonable, I'll plan to start working on
converting more filesystems.

One thing I'm not clear on is how widely available high res timestamps
are. Is this something we need to gate on particular CONFIG_* options?

Thoughts?

Jeff Layton (3):
  fs: add infrastructure for opportunistic high-res ctime/mtime updates
  shmem: mark for high-res timestamps on next update after getattr
  xfs: mark the inode for high-res timestamp update in getattr

 fs/inode.c                      | 40 +++++++++++++++++++++++++++++++--
 fs/stat.c                       | 10 +++++++++
 fs/xfs/libxfs/xfs_trans_inode.c |  2 +-
 fs/xfs/xfs_acl.c                |  2 +-
 fs/xfs/xfs_inode.c              |  2 +-
 fs/xfs/xfs_iops.c               | 15 ++++++++++---
 include/linux/fs.h              |  5 ++++-
 mm/shmem.c                      | 23 ++++++++++---------
 8 files changed, 80 insertions(+), 19 deletions(-)

--
2.39.2