Re: [PATCH RFC 0/1] mount: universally disallow mounting over symlinks

From: Aleksa Sarai
Date: Mon Dec 30 2019 - 02:30:58 EST


On 2019-12-30, Aleksa Sarai <cyphar@xxxxxxxxxx> wrote:
> On 2019-12-30, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > On Mon, Dec 30, 2019 at 04:20:35PM +1100, Aleksa Sarai wrote:
> >
> > > A reasonably detailed explanation of the issues is provided in the patch
> > > itself, but the full traces produced by both the oopses and deadlocks is
> > > included below (it makes little sense to include them in the commit since we
> > > are disabling this feature, not directly fixing the bugs themselves).
> > >
> > > I've posted this as an RFC on whether this feature should be allowed at
> > > all (and if anyone knows of legitimate uses for it), or if we should
> > > work on fixing these other kernel bugs that it exposes.
> >
> > Umm... Are all of those traces
> > a) reproducible on mainline and
>
> This was on viro/for-next, I'll retry it on v5.5-rc4.

The NULL deref oops is reproducible on v5.5-rc4. Strangely it seems
harder to reproduce than on viro/for-next (I kept reproducing it there
by accident), but I'll double-check if that really is the case.

The simplest reproducer is (using the attached programs and .config):

ln -s . link
sudo ./umount_symlink link
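
For anyone reading without the attachments: both helper programs open the path with O_PATH|O_NOFOLLOW and then hand the kernel the matching /proc/self/fd/<n> path. Following that magic link lands on the symlink inode itself rather than on its target, which is how a symlink ends up being used as a mountpoint. A minimal sketch of that property, assuming /proc is mounted (the helper name fdpath_is_symlink is hypothetical and is not part of the attached programs):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if stat() on the /proc/self/fd/<n> path of an
 * O_PATH|O_NOFOLLOW fd resolves to a symlink inode, 0 if it resolves
 * to anything else, and -1 on error.
 */
int fdpath_is_symlink(const char *path)
{
	char fdpath[64];
	struct stat st;
	int fd, ret;

	assert(path != NULL);

	/* O_PATH|O_NOFOLLOW yields an fd that refers to the symlink itself. */
	fd = open(path, O_PATH | O_CLOEXEC | O_NOFOLLOW);
	if (fd < 0)
		return -1;

	snprintf(fdpath, sizeof(fdpath), "/proc/self/fd/%d", fd);

	/* stat() follows the magic link, but not the symlink it lands on. */
	ret = stat(fdpath, &st);
	close(fd);
	if (ret < 0)
		return -1;

	return S_ISLNK(st.st_mode) ? 1 : 0;
}
```

This only inspects the inode and never calls mount(2) or umount2(2), so it needs no privileges and should not be able to trigger the oops below.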

There are also a few other wacky behaviours where you get -ELOOP or
-EACCES in cases where you shouldn't -- which results in MNT_DETACH
failing and the mount being impossible to get rid of. A good example is:

sudo ./mount_to_symlink /proc/self/exe link
sudo ./umount_symlink link # -EACCES

Or

ln -s . link1
ln -s . link2
sudo ./mount_to_symlink link1 link2
sudo ./umount_symlink link1 # -ELOOP
sudo ./umount_symlink link2 # -ELOOP

I am still trying to find a reproducer for the "umount of a mount
triggering an Oops" issue.

On another note -- I guess this is considered a feature which should
"just work" and not a bug?

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 80000003c6fca067 P4D 80000003c6fca067 PUD 3c6f42067 PMD 0
Oops: 0010 [#1] SMP PTI
CPU: 4 PID: 4486 Comm: umount_symlink Tainted: G E 5.5.0-rc4-cyphar #126
Hardware name: LENOVO 20KHCTO1WW/20KHCTO1WW, BIOS N23ET55W (1.30 ) 08/31/2018
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffb70b82963cc0 EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff906d0cc3bb40 RCX: 0000000000000abc
RDX: 0000000000000089 RSI: ffff906d74623cc0 RDI: ffff906d74475df0
RBP: ffff906d74475df0 R08: ffffd70b7fb24c20 R09: ffff906d066a5000
R10: 0000000000000000 R11: 8080807fffffffff R12: ffff906d74623cc0
R13: 0000000000000089 R14: ffffb70b82963dc0 R15: 0000000000000080
FS: 00007fbc2a8f0540(0000) GS:ffff906dcf500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000003c68f8001 CR4: 00000000003606e0
Call Trace:
__lookup_slow+0x94/0x160
lookup_slow+0x36/0x50
path_mountpoint+0x1be/0x360
filename_mountpoint+0xa5/0x150
? __lookup_hash+0xa0/0xa0
ksys_umount+0x78/0x490
__x64_sys_umount+0x12/0x20
do_syscall_64+0x64/0x240
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fbc2a8274e7
Code: 09 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 69 09 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffd1da9b3f8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a6
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbc2a8274e7
RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000001300310
RBP: 00007ffd1da9b4c0 R08: 0000000000000000 R09: 000000000000000f
R10: 00007fbc2a92f800 R11: 0000000000000202 R12: 0000000000401090
R13: 00007ffd1da9b5a0 R14: 0000000000000000 R15: 0000000000000000
Modules linked in: [snip]
CR2: 0000000000000000
---[ end trace ae473813e34e641d ]---
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffb70b82963cc0 EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff906d0cc3bb40 RCX: 0000000000000abc
RDX: 0000000000000089 RSI: ffff906d74623cc0 RDI: ffff906d74475df0
RBP: ffff906d74475df0 R08: ffffd70b7fb24c20 R09: ffff906d066a5000
R10: 0000000000000000 R11: 8080807fffffffff R12: ffff906d74623cc0
R13: 0000000000000089 R14: ffffb70b82963dc0 R15: 0000000000000080
FS: 00007fbc2a8f0540(0000) GS:ffff906dcf500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000003c68f8001 CR4: 00000000003606e0
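
As a reading aid for the "#PF: error_code(0x0010)" line in the trace above, here is a minimal sketch that decodes the low bits of an x86 page-fault error code. The struct and function names are hypothetical; only the bit layout (bit 0 present, bit 1 write, bit 2 user, bit 3 reserved-bit violation, bit 4 instruction fetch) is taken from the architecture. For 0x0010 it gives a supervisor-mode instruction fetch from a not-present page, which together with "RIP: 0010:0x0" is consistent with a call through a NULL function pointer.

```c
#include <assert.h>

/* Hypothetical container for the decoded bits, for illustration only. */
struct pf_info {
	int present;	/* bit 0: fault on a present page (protection fault) */
	int write;	/* bit 1: access was a write */
	int user;	/* bit 2: access originated in user mode */
	int reserved;	/* bit 3: reserved bit set in a paging entry */
	int ifetch;	/* bit 4: access was an instruction fetch */
};

/* Decode the low five bits of an x86 page-fault error code. */
struct pf_info decode_pf_error(unsigned long code)
{
	struct pf_info info = {
		.present  = !!(code & 0x01),
		.write    = !!(code & 0x02),
		.user     = !!(code & 0x04),
		.reserved = !!(code & 0x08),
		.ifetch   = !!(code & 0x10),
	};
	return info;
}

/* Self-check against the value reported in the trace above. */
void check_trace_value(void)
{
	struct pf_info info = decode_pf_error(0x0010);

	/* supervisor instruction fetch, not-present page */
	assert(info.ifetch && !info.present && !info.user);
}
```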

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

Attachment: .config
Description: application/config

/* mount_to_symlink.c: bind-mount <src> onto <dst> without following symlinks. */
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>

#define bail(msg) \
	do { printf("mount_to_symlink: %s: %m\n", msg); exit(1); } while (0)

// returns non-zero if <path> itself (not its target) is a symlink
int is_symlink(const char *path)
{
	struct stat st = {};

	if (lstat(path, &st) < 0)
		bail("lstat(<path>)");
	return S_ISLNK(st.st_mode);
}

int main(int argc, char **argv)
{
	char *src, *dst, *src_fdpath, *dst_fdpath;
	int src_fd, dst_fd;

	if (argc != 3)
		bail("usage: mount_to_symlink <src> <dst>");

	src_fdpath = src = argv[1];
	dst_fdpath = dst = argv[2];

	if (is_symlink(src)) {
		// open source fd (refers to the symlink itself)
		src_fd = open(src, O_PATH | O_CLOEXEC | O_NOFOLLOW);
		if (src_fd < 0)
			bail("open(<src>, O_PATH|O_NOFOLLOW)");
		// construct fd path
		if (asprintf(&src_fdpath, "/proc/self/fd/%d", src_fd) < 0)
			bail("asprintf(<src_fdpath>)");
	}

	if (is_symlink(dst)) {
		// open target fd (refers to the symlink itself)
		dst_fd = open(dst, O_PATH | O_CLOEXEC | O_NOFOLLOW);
		if (dst_fd < 0)
			bail("open(<dst>, O_PATH|O_NOFOLLOW)");
		// construct fd path
		if (asprintf(&dst_fdpath, "/proc/self/fd/%d", dst_fd) < 0)
			bail("asprintf(<dst_fdpath>)");
	}

	// try to mount; clear errno on success so that %m is not stale
	if (mount(src_fdpath, dst_fdpath, "", MS_BIND, "") == 0)
		errno = 0;
	printf("mount(%s, %s, MS_BIND) = %m (%d)\n", src, dst, -errno);
	return 0;
}

/* umount_symlink.c: lazily unmount <mount> without following a trailing symlink. */
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>

#define bail(msg) \
	do { printf("umount_symlink: %s: %m\n", msg); exit(1); } while (0)

int main(int argc, char **argv)
{
	char *mnt, *mnt_fdpath;
	int mnt_fd;

	if (argc != 2)
		bail("need <mount> argument");

	mnt = argv[1];

	// open mountpoint fd (refers to the symlink itself if <mount> is one)
	mnt_fd = open(mnt, O_PATH | O_CLOEXEC | O_NOFOLLOW);
	if (mnt_fd < 0)
		bail("open(<mount>, O_PATH|O_NOFOLLOW)");

	// get fdpath
	if (asprintf(&mnt_fdpath, "/proc/self/fd/%d", mnt_fd) < 0)
		bail("asprintf(<mnt_fdpath>)");

	// try to umount; clear errno on success so that %m is not stale
	if (umount2(mnt_fdpath, MNT_DETACH) == 0)
		errno = 0;
	printf("umount2(%s, MNT_DETACH) = %m (%d)\n", mnt, -errno);
	return 0;
}

Attachment: signature.asc
Description: PGP signature