Re: [PATCH] fs: Fix signed integer overflow for vfs_setpos

From: Al Viro
Date: Thu Dec 07 2017 - 10:27:06 EST


On Thu, Dec 07, 2017 at 09:19:10PM +0800, Ding Tianhong wrote:
> The undefined behaviour sanitizer detected a signed integer overflow like this:
>
> r0 = memfd_create(&(0x7f0000002000-0x12)="2e726571756573745f6b65795f6175746800",0x0)
> lseek(r0, 0x4040000000000000, 0x1)
> setsockopt$inet6_IPV6_FLOWLABEL_MGR(r0, 0x29, 0x20,
> &(0x7f000000b000-0xd)={@empty={[0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0]}, 0x9, 0x1, 0xff, 0x2, 0x6, 0x1,0xd27}, 0x20)
> mmap(&(0x7f000000e000/0x1000)=nil, 0x1000, 0x3, 0x32,0xffffffffffffffff, 0x0)
> ioctl$sock_SIOCGSKNS(r0, 0x894c, &(0x7f000000f000-0x4)=0x10000)
> ---------------------------------------------------------------------------------
> UBSAN: Undefined behaviour in fs/read_write.c:107:12
> signed integer overflow:
> 4629700416936869888 + 4629700416936869888 cannot be represented in type
> 'long long int'
> CPU: 0 PID: 11653 Comm: syz-executor0 Not tainted 4.x.xx+ #2
> Hardware name: linux,dummy-virt (DT)
> Call trace:
> [<ffffffc00008f4d0>] dump_backtrace+0x0/0x2a0
> [<ffffffc00008f790>] show_stack+0x20/0x30
> [<ffffffc000ec3b5c>] dump_stack+0x11c/0x16c
> [<ffffffc000ec3e80>] ubsan_epilogue+0x18/0x70
> [<ffffffc000ec4ca0>] handle_overflow+0x14c/0x188
> [<ffffffc000ec4d10>] __ubsan_handle_add_overflow+0x34/0x44
> [<ffffffc000327740>] generic_file_llseek_size+0x1f8/0x2a0
> [<ffffffc0002826fc>] shmem_file_llseek+0x7c/0x1f8
> [<ffffffc000327b88>] SyS_lseek+0xc0/0x118
> --------------------------------------------------------------------------------
>
> The problem happens because the signed addition overflows, so do the
> calculation with unsigned integers to avoid undefined behaviour when
> it does overflow.

TBH, I don't like that solution - there's too much of "make UBSAN STFU" in
it. Besides, there are very similar places elsewhere. Right next to this
one there's default_llseek(), with its
		case SEEK_CUR:
			if (offset == 0) {
				retval = file->f_pos;
				goto out;
			}
			offset += file->f_pos;
			break;
and offset is loff_t there. Exact same issue, IOW. Grepping around shows
tons of similar places. E.g. ceph_llseek() has
		if (offset == 0) {
			ret = file->f_pos;
			goto out;
		}
		offset += file->f_pos;
		break;
with offset being loff_t and ocfs2_file_llseek() is the same. memory_lseek()
does something very similar, except that it doesn't use vfs_setpos(),
ditto for xillybus_llseek(), wil_pmc_llseek(), hmcdrv_dev_seek(), etc.

That kind of whack-a-mole ("UBSAN has stepped on that one, let's plug it",
while the other places like that keep breeding) is, IMO, the wrong approach ;-/
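
For illustration only, one shape a shared fix could take: a single helper
that does the SEEK_CUR addition with an overflow check, so the llseek
implementations call it instead of open-coding offset += file->f_pos.
This is a userspace sketch, not a proposed patch; the name
seek_cur_target(), the choice of -EOVERFLOW and the use of the GCC/Clang
builtin __builtin_add_overflow() are all assumptions, and long long stands
in for loff_t.

```c
#include <errno.h>

/*
 * Hypothetical helper (not an existing kernel function): compute the
 * target position for SEEK_CUR. Returns pos + offset, or -EOVERFLOW if
 * the signed sum cannot be represented. Callers would still need their
 * usual range check on the result afterwards.
 */
static long long seek_cur_target(long long pos, long long offset)
{
	long long sum;

	/* GCC/Clang builtin: returns true if the addition overflowed. */
	if (__builtin_add_overflow(pos, offset, &sum))
		return -EOVERFLOW;
	return sum;
}
```

With the values from the report, seek_cur_target(0x4040000000000000,
0x4040000000000000) returns -EOVERFLOW instead of performing an undefined
signed addition.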

BTW, a fun unrelated bogosity:
static loff_t scom_llseek(struct file *file, loff_t offset, int whence)
{
	switch (whence) {
	case SEEK_CUR:
		break;
	case SEEK_SET:
		file->f_pos = offset;
		break;
	default:
		return -EINVAL;
	}

	return offset;
}
IOW, lseek(fd, n, SEEK_CUR) quietly returns n there. Separate issue, though...
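
A small userspace model of that function shows the behaviour. The control
flow is copied from the snippet above; struct file_model and the two
wrapper functions are hypothetical stand-ins added only so the sketch can
be compiled and checked outside the kernel.

```c
#include <errno.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR */

/* Stand-in for struct file; only f_pos matters here. */
struct file_model {
	long long f_pos;
};

/* Same control flow as the quoted scom_llseek(). */
static long long scom_llseek_model(struct file_model *file,
				   long long offset, int whence)
{
	switch (whence) {
	case SEEK_CUR:
		break;		/* f_pos is neither read nor updated */
	case SEEK_SET:
		file->f_pos = offset;
		break;
	default:
		return -EINVAL;
	}

	return offset;
}

/* Return value of a SEEK_CUR by n, starting from position start. */
static long long seek_cur_result(long long start, long long n)
{
	struct file_model f = { .f_pos = start };

	return scom_llseek_model(&f, n, SEEK_CUR);
}

/* File position left behind by the same call. */
static long long seek_cur_position(long long start, long long n)
{
	struct file_model f = { .f_pos = start };

	scom_llseek_model(&f, n, SEEK_CUR);
	return f.f_pos;
}
```

Starting at position 100, a SEEK_CUR by 5 returns 5 rather than 105, and
the stored position stays at 100.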