Re: [PATCH v2] vfs: prevent copy_file_range to copy across devices

From: Luis Henriques
Date: Thu Feb 18 2021 - 09:07:19 EST


Luis Henriques <lhenriques@xxxxxxx> writes:

> Amir Goldstein <amir73il@xxxxxxxxx> writes:
>
>> On Thu, Feb 18, 2021 at 9:42 AM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>>>
>>> Looks good:
>>>
>>> Reviewed-by: Christoph Hellwig <hch@xxxxxx>
>>>
>>> This whole idea of cross-device copie has always been a horrible idea,
>>> and I've been arguing against it since the patches were posted.
>>
>> Ok. I'm good with this v2 as well, but need to add the fallback to
>> do_splice_direct()
>> in nfsd_copy_file_range(), because this patch breaks it.
>>
>> And the commit message of v3 is better in describing the reported issue.
>
> Except that, as I said in a previous email, v2 doesn't really fix the
> issue: all the checks need to be done earlier in generic_copy_file_checks().
>
> I'll work on getting v4, based on v2 and but moving the checks and
> implementing your review suggestions to v3 (plus this nfs change).

There's something else:

The filesystems (nfs, ceph, cifs, fuse) rely on the fallback to
generic_copy_file_range() if something's wrong. And this "something's
wrong" is fs specific. For example: in ceph it is possible to offload the
file copy to the OSDs even if the files are in different filesystems as
long as these filesystems are on the *same* ceph cluster. If the copy
being done is across two different clusters, then the copy reverts to
splice. This means that the boilerplate code being removed in v2 of this
patch needs to be restored and replace by:

ret = __ceph_copy_file_range(src_file, src_off, dst_file, dst_off,
len, flags);

if (ret == -EOPNOTSUPP || ret == -EXDEV)
ret = do_splice_direct(src_file, &src_off, dst_file, &dst_off,
len > MAX_RW_COUNT ? MAX_RW_COUNT : len,
flags);
return ret;

A quick look at the other filesystems code indicate similar patterns.
Since at this point we've gone through all the syscall checks already,
calling do_splice_direct() shouldn't be a huge change. But I may be
missing something. Again. Which is quite likely :-)

Cheers,
--
Luis