Re: [PATCH] fs/read_write.c: Fix a broken signed integer overflow check.

From: Al Viro
Date: Mon Feb 07 2022 - 12:11:32 EST


On Mon, Feb 07, 2022 at 06:44:55PM +0200, Ari Sundholm wrote:
> Hello, Al,
>
> On 2/7/22 16:58, Al Viro wrote:
> > On Mon, Feb 07, 2022 at 02:07:11PM +0200, Ari Sundholm wrote:
> > > The function generic_copy_file_checks() checks that the ends of the
> > > input and output file ranges do not overflow. Unfortunately, there is
> > > an issue with the check itself.
> > >
> > > Due to the integer promotion rules in C, the expressions
> > > (pos_in + count) and (pos_out + count) have an unsigned type because
> > > the count variable has the type uint64_t. Thus, in many cases where we
> > > should detect signed integer overflow to have occurred (and thus one or
> > > more of the ranges being invalid), the expressions will instead be
> > > interpreted as large unsigned integers. This means the check is broken.
> >
> > I must be slow this morning, but... which values of pos_in and count are
> > caught by your check, but not by the original?
> >
>
> Thank you for your response and questions.
>
> Assuming an x86-64 target platform, please consider:
>
> loff_t pos_out = 0x7FFFFFFFFFFEFFFFLL;
> and
> uint64_t count = 65537;
>
> The type of the expression (pos_out + count) is a 64-bit unsigned type, by
> C's integer promotion rules. Its value is 0x8000000000000000ULL, that is,
> bit 63 is set.
>
> The comparison (pos_out + count) < pos_out, again due to C's integer
> promotion rules, is unsigned. Thus, the comparison, in this case, is
> equivalent to:
>
> 0x8000000000000000ULL < 0x7FFFFFFFFFFEFFFFULL,
>
> which is false. Please note that the LHS is not expressible as a positive
> integer of type loff_t. With larger values for count, the problem should
> become quite obvious, as some the offsets within the file would not be
> expressible as positive integers of type loff_t. But I digress. As we can
> see above, the overflow is missed.
>
> With the LHS explicitly cast to loff_t, the comparison is equivalent to:
>
> 0x8000000000000000LL < 0x7FFFFFFFFFFEFFFFLL,
>
> which is true, as the LHS is negative.
>
> This has also been verified in practice, and was detected when running tests
> on special cases of the copy_file_range syscall on different filesystems.

Er... I still don't see the problem here. If the destination filesystem
explicitly allows offsets in excess of 2^63, what's the point in that
-EOVERFLOW? And if it doesn't, you'll get count truncated by
generic_write_check_limits(), down to the amount remaining until the
fs limit...

Same on the input side - if your source file is at least 2^63, what's the
problem? And if not, you'll get count capped by file size - pos_in, right
under that check...

Which filesystems had been involved and what was the test?