Re: [PATCH v2 4/5] xfs: introduce XFS_DIFLAG2_IOMAP_IMMUTABLE

From: Dave Chinner
Date: Fri Aug 04 2017 - 19:46:43 EST


On Fri, Aug 04, 2017 at 01:33:12PM -0700, Darrick J. Wong wrote:
> On Thu, Aug 03, 2017 at 07:28:30PM -0700, Dan Williams wrote:
> > Add an on-disk inode flag to record the state of the S_IOMAP_IMMUTABLE
> > in-memory vfs inode flags. This allows the protections against reflink
> > and hole punch to be automatically restored on a sub-sequent boot when
> > the in-memory inode is established.
> >
> > The FS_XFLAG_IOMAP_IMMUTABLE is introduced to allow xfs_io to read the
> > state of the flag, but toggling the flag requires going through
> > fallocate(FALLOC_FL_[UN]SEAL_BLOCK_MAP). Support for toggling this
> > on-disk state is saved for a later patch.
> >
> > Cc: Jan Kara <jack@xxxxxxx>
> > Cc: Jeff Moyer <jmoyer@xxxxxxxxxx>
> > Cc: Christoph Hellwig <hch@xxxxxx>
> > Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> > Suggested-by: Dave Chinner <david@xxxxxxxxxxxxx>
> > Suggested-by: "Darrick J. Wong" <darrick.wong@xxxxxxxxxx>
> > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > ---
> > fs/xfs/libxfs/xfs_format.h | 5 ++++-
> > fs/xfs/xfs_inode.c | 2 ++
> > fs/xfs/xfs_ioctl.c | 1 +
> > fs/xfs/xfs_iops.c | 8 +++++---
> > include/uapi/linux/fs.h | 1 +
> > 5 files changed, 13 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> > index d4d9bef20c3a..9e720e55776b 100644
> > --- a/fs/xfs/libxfs/xfs_format.h
> > +++ b/fs/xfs/libxfs/xfs_format.h
> > @@ -1063,12 +1063,15 @@ static inline void xfs_dinode_put_rdev(struct xfs_dinode *dip, xfs_dev_t rdev)
> > #define XFS_DIFLAG2_DAX_BIT 0 /* use DAX for this inode */
> > #define XFS_DIFLAG2_REFLINK_BIT 1 /* file's blocks may be shared */
> > #define XFS_DIFLAG2_COWEXTSIZE_BIT 2 /* copy on write extent size hint */
> > +#define XFS_DIFLAG2_IOMAP_IMMUTABLE_BIT 3 /* set S_IOMAP_IMMUTABLE for this inode */
>
> So... the greedy part of my brain that doesn't want to give out flags2
> bits has been wondering,

FWIW, I made di_flags2 a 64 bit value in the first place precisely
so we didn't have a scarcity problem and can just give out flag
bits for enabling new functionality like this...

> what if we just didn't have an on-disk
> IOMAP_IMMUTABLE bit, and set FS_XFLAG based only on the in-core
> S_IOMAP_IMMUTABLE bit? If a program wants the immutable iomap
> semantics, they will have to code some variant on the following:
>
> fd = open(...);
> ret = fallocate(fd, FALLOC_FL_SEAL_BLOCK_MAP, 0, len...)
> if (ret) {
> printf("couldn't seal block map");
> close(fd);
> return;
> }
>
> mmap(fd...);
> /* do sensitive io operations here */
> munmap(fd...);
>
> close(fd);
>
> Therefore the cost of not having the on-disk flag is that we'll have to
> do more unshare/alloc/test/set cycles than we would if we could remember
> the iomap-immutable state across unmounts and inode reclaiming.
> However, if the data map is already ready to go, this shouldn't have a
> lot of overhead since we only have to iterate the in-core extents.
>
> Just trying to make sure we /need/ the inode flag bit. :)

IMO, fallocate() is for making permanent changes to file extents. If
this is not going to be a permanent state change but only a
runtime-while-the-inode-is-in-cache flag, then it's probably not the
right interface to use.

This also seems problematic for applications other than DAX where
the block map may be sealed, the fd closed and access handed off to
another entity for remote storage access. If the inode gets
reclaimed due to memory pressure, the system loses the fact that
that the inode has been sealed. Hence another process can come
along, re-read the inode and modify the block map because it hasn't
been sealed in this new cache life cycle.....

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx