Re: [PATCH v2] vfs: prevent copy_file_range to copy across devices

From: Luis Henriques
Date: Tue Feb 16 2021 - 11:43:36 EST


Amir Goldstein <amir73il@xxxxxxxxx> writes:

>> Ugh. And I guess overlayfs may have a similar problem.
>
> Not exactly.
> Generally speaking, overlayfs should call vfs_copy_file_range()
> with the flags it got from layer above, so if called from nfsd it
> will allow cross fs copy and when called from syscall it won't.
>
> There are some corner cases where overlayfs could benefit from
> COPY_FILE_SPLICE (e.g. copy from lower file to upper file), but
> let's leave those for now. Just leave overlayfs code as is.

Got it, thanks for clarifying.

>> > This is easy to solve with a flag COPY_FILE_SPLICE (or something) that
>> > is internal to kernel users.
>> >
>> > FWIW, you may want to look at the loop in ovl_copy_up_data()
>> > for improvements to nfsd_copy_file_range().
>> >
>> > We can move the check out to copy_file_range syscall:
>> >
>> > if (flags != 0)
>> > return -EINVAL;
>> >
>> > Leave the fallback from all filesystems and check for the
>> > COPY_FILE_SPLICE flag inside generic_copy_file_range().
>>
>> Ok, the diff bellow is just to make sure I understood your suggestion.
>>
>> The patch will also need to:
>>
>> - change nfs and overlayfs calls to vfs_copy_file_range() so that they
>> use the new flag.
>>
>> - check flags in generic_copy_file_checks() to make sure only valid flags
>> are used (COPY_FILE_SPLICE at the moment).
>>
>> Also, where should this flag be defined? include/uapi/linux/fs.h?
>
> Grep for REMAP_FILE_
> Same header file, same Documentation rst file.
>
>>
>> Cheers,
>> --
>> Luis
>>
>> diff --git a/fs/read_write.c b/fs/read_write.c
>> index 75f764b43418..341d315d2a96 100644
>> --- a/fs/read_write.c
>> +++ b/fs/read_write.c
>> @@ -1383,6 +1383,13 @@ ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in,
>> struct file *file_out, loff_t pos_out,
>> size_t len, unsigned int flags)
>> {
>> + if (!(flags & COPY_FILE_SPLICE)) {
>> + if (!file_out->f_op->copy_file_range)
>> + return -EOPNOTSUPP;
>> + else if (file_out->f_op->copy_file_range !=
>> + file_in->f_op->copy_file_range)
>> + return -EXDEV;
>> + }
>
> That looks strange, because you are duplicating the logic in
> do_copy_file_range(). Maybe better:
>
> if (WARN_ON_ONCE(flags & ~COPY_FILE_SPLICE))
> return -EINVAL;
> if (flags & COPY_FILE_SPLICE)
> return do_splice_direct(file_in, &pos_in, file_out, &pos_out,
> len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);

My initial reasoning for duplicating the logic in do_copy_file_range() was
to allow the generic_copy_file_range() callers to be left unmodified and
allow the filesystems to default to this implementation.

With this change, I guess that the calls to generic_copy_file_range() from
the different filesystems can be dropped, as in my initial patch, as they
will always get -EINVAL. The other option would be to set the
COPY_FILE_SPLICE flag in those calls, but that would get us back to the
problem we're trying to solve.

> if (!file_out->f_op->copy_file_range)
> return -EOPNOTSUPP;
> return -EXDEV;
>
>> }
>> @@ -1474,9 +1481,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>> {
>> ssize_t ret;
>>
>> - if (flags != 0)
>> - return -EINVAL;
>> -
>
> This needs to move to the beginning of SYSCALL_DEFINE6(copy_file_range,...

Yep, I didn't included that change in my diff as I wasn't sure if you'd
like to have the flag visible in userspace.

Anyway, thanks for your patience!

Cheers,
--
Luis