Re: WARNING in __kernel_read

From: Matthew Wilcox
Date: Wed Oct 06 2021 - 08:18:33 EST


On Wed, Oct 06, 2021 at 05:33:47PM +0800, Hao Sun wrote:
> C reproducer: https://drive.google.com/file/d/1RzAsyIZzw5X_m340nY9fu4KWjGdG98pv/view?usp=sharing

It's easier than this reproducer makes it look.

	r[0] = syscall(__NR_openat, -1, 0x20000080ul, 0x4c003ul, 0x10ul);
	syscall(__NR_finit_module, r[0], 0ul, 3ul);

should be enough. Basically, userspace opens an fd without FMODE_READ
and passes it to finit_module().
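
To see why the fd ends up without FMODE_READ: the low two bits of the
openat() flags above (0x4c003) are 3, which is neither O_RDONLY, O_WRONLY
nor O_RDWR. A minimal userspace sketch of the access-mode mapping follows,
assuming it mirrors the kernel's OPEN_FMODE() computation of
((flags + 1) & O_ACCMODE). The SKETCH_* names and both helpers are
hypothetical stand-ins, not kernel symbols.

```c
#include <fcntl.h>

/* Hypothetical userspace stand-ins for the kernel's FMODE_READ/FMODE_WRITE bits. */
#define SKETCH_FMODE_READ  0x1u
#define SKETCH_FMODE_WRITE 0x2u

/*
 * Assumed to mirror the access-mode part of the kernel's OPEN_FMODE():
 * O_RDONLY (0) -> READ, O_WRONLY (1) -> WRITE, O_RDWR (2) -> READ|WRITE,
 * and access mode 3 -> neither bit set.
 */
static unsigned int sketch_open_fmode(unsigned long flags)
{
	return (unsigned int)((flags + 1) & O_ACCMODE);
}

/* Returns 1 if an fd opened with these flags would carry the READ bit. */
static int sketch_can_read(unsigned long flags)
{
	return (sketch_open_fmode(flags) & SKETCH_FMODE_READ) != 0;
}
```

Under that assumption, sketch_can_read(0x4c003ul) is 0, and so is
sketch_can_read(O_WRONLY), so a plain write-only open should be enough to
get an fd without FMODE_READ.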

> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 28082 at fs/read_write.c:429
> __kernel_read+0x3bb/0x410 fs/read_write.c:429
> Modules linked in:
> CPU: 1 PID: 28082 Comm: syz-executor Not tainted 5.15.0-rc3+ #21
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> RIP: 0010:__kernel_read+0x3bb/0x410 fs/read_write.c:429
> Call Trace:
> kernel_read+0x47/0x60 fs/read_write.c:461
> kernel_read_file+0x20a/0x370 fs/kernel_read_file.c:93
> kernel_read_file_from_fd+0x55/0x90 fs/kernel_read_file.c:184
> __do_sys_finit_module+0x89/0x110 kernel/module.c:4180

finit_module() is not the only caller that hands kernel_read_file_from_fd()
an fd supplied by userspace; kexec_file_load(), for example, doesn't
validate the fd either. We could validate the fd in the individual
syscalls, validate it in kernel_read_file_from_fd(), or just do what
vfs_read() does and return -EBADF without warning.
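
For comparison, the vfs_read() behaviour is visible from userspace: a
read() on an fd that was opened write-only fails with EBADF and nothing is
logged. A small sketch, assuming a Linux system where /dev/null can be
opened for writing (the helper name is made up for illustration):

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open /dev/null write-only, attempt a one-byte read, and return the
 * errno reported by read(). Returns 0 if the read unexpectedly succeeds
 * and -1 if /dev/null could not be opened.
 */
static int read_errno_on_writeonly_fd(void)
{
	char byte;
	int err = 0;
	int fd = open("/dev/null", O_WRONLY);

	if (fd < 0)
		return -1;
	if (read(fd, &byte, 1) < 0)
		err = errno;	/* save errno before close() can change it */
	close(fd);
	return err;
}
```

The second patch below would make __kernel_read() report the same error
for the same condition.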

So, one of these two patches. Christoph, Al, what's your preference?

diff --git a/fs/kernel_read_file.c b/fs/kernel_read_file.c
index 87aac4c72c37..1f28b693d1db 100644
--- a/fs/kernel_read_file.c
+++ b/fs/kernel_read_file.c
@@ -178,7 +178,7 @@ int kernel_read_file_from_fd(int fd, loff_t offset, void **buf,
 	struct fd f = fdget(fd);
 	int ret = -EBADF;
 
-	if (!f.file)
+	if (!f.file || !(f.file->f_mode & FMODE_READ))
 		goto out;
 
 	ret = kernel_read_file(f.file, offset, buf, buf_size, file_size, id);

diff --git a/fs/read_write.c b/fs/read_write.c
index af057c57bdc6..bab43b8532d1 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -426,8 +426,8 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 	struct iov_iter iter;
 	ssize_t ret;
 
-	if (WARN_ON_ONCE(!(file->f_mode & FMODE_READ)))
-		return -EINVAL;
+	if (!(file->f_mode & FMODE_READ))
+		return -EBADF;
 	if (!(file->f_mode & FMODE_CAN_READ))
 		return -EINVAL;
 	/*