Re: [GIT PULL] Btrfs updates for 5.18

From: Darrick J. Wong
Date: Tue Mar 22 2022 - 17:19:44 EST


On Tue, Mar 22, 2022 at 04:55:17PM -0400, Josef Bacik wrote:
> On Tue, Mar 22, 2022 at 11:23:21AM -0700, Linus Torvalds wrote:
> > On Mon, Mar 21, 2022 at 2:37 PM David Sterba <dsterba@xxxxxxxx> wrote:
> > >
> > > - allow reflinks/deduplication from two different mounts of the same
> > > filesystem
> >
> > So I've pulled this, and it looks ok, but I'm not getting the warm and fuzzies.
> >
> > In particular, I'm not seeing any commentary about different
> > filesystems for this.
> >
> > There are several filesystems that use that ->remap_file_range()
> > operation, so these relaxed rules don't just affect btrfs.
> >
> > Yes, yes, checking for i_sb matching does seem sensible, but I'd
> > *really* have liked some sign that people checked with other
> > filesystem maintainers and this is ok for all of them, and they didn't
> > make assumptions about "always same mount" rather than "always same
> > filesystem".
> >
>
> > This affects at least cifs, nfs, overlayfs and ocfs2.
>
> I had a talk with Darrick Wong about this on IRC, and his Reviewed-by is on the
> patch. This did surprise nfsd when xfstests started failing, but talking with
> Bruce he didn't complain once he understood what was going on.

FWIW, I remember talking about this with Bruce and (probably Anna too)
during a hallway BOF at the last LSFMMBPFBBQ that I went to, which was
2018(?) At the time, I think we resolved that nfs42_remap_file_range
was capable of detecting and dealing with unsupported requests, so a
direct comparison of the ->remap_file_range or ->f_op wasn't necessary
for them.

> Believe me I
> have 0 interest in getting the other maintainers upset with me by sneaking
> something by them, I made sure to run it by people first, tho I probably should
> have checked with people directly other than Darrick.

I /am/ a little curious what Steve French has to say w.r.t CIFS.

AFAICT overlayfs passes the request down to the appropriate fs
under-layer, so its correctness mostly depends on the under-layer's
implementation. But I'll let Amir or someone chime in on that. ;)

As for ocfs2, back when I added support for ->remap_file_range to ocfs2,
cross-mount reflink and dedupe worked fine, or at least as well as
anything works on ocfs2.

(XFS has always supported cross-mount remappings.)

> >
> > Adding fsdevel, and pointing to that
> >
> > - if (src_file->f_path.mnt != dst_file->f_path.mnt)
> > + if (file_inode(src_file)->i_sb != file_inode(dst_file)->i_sb)
> >
> > change in commit 9f5710bbfd30 ("fs: allow cross-vfsmount reflink/dedupe")
> >
> > And yes, there was already a comment about "Practically, they only
> > need to be on the same file system" from before that matches the new
> > behavior, but hey, comments have been known to be wrong in the past
> > too.
> >
> > And yes, I'm also aware that do_clone_file_range() already had that
> > exact same i_sb check and it's not new, but since ioctl_file_clone()
> > cheched for the mount path, I don't think you could actually reach it
> > without being on the same mount.
> >
> > And while discussing these sanity checks: wouldn't it make sense to
> > check that *both* the source file and the destination file support
> > that remap_file_range() op, and it's the same op?
> >
> > Yes, yes, it probably always is in practice, but I could imagine some
> > type confusion thing. So wouldn't it be nice to also have something
> > like
> >
> > if (dst_file->f_op != src_file->f_op)
> > goto out_drop_write;
> >
> > in there? I'm thinking "how about dedupe from a directory to a regular
> > file" kind of craziness...
> >
>
> This more fine-grained checking is handled by generic_remap_file_range_prep() to
> make sure we don't try to dedup a directory or pipe or some other nonsense.

Yes. The VFS only allows remapping between regular files.

--D

> Thanks,
>
> Josef