Re: [PATCH 03/34] teach move_mount(2) to work with OPEN_TREE_CLONE [ver #12]

From: David Howells
Date: Thu Oct 11 2018 - 05:17:36 EST


Alan Jenkins <alan.christopher.jenkins@xxxxxxxxx> wrote:

> # unshare --mount=private_mnt/child_ns --propagation=shared ls -l /proc/self/ns/mnt

I think the problem is that the mount of the nsfs object done by unshare here
pins the new mount namespace - but doesn't add the namespace's contents into
the mount tree, so the mount struct cycle-detection code is bypassed.

I think it's fine for all other namespaces, just not the mount namespace.

It looks like this bug might theoretically exist upstream also, though I don't
think there's any way to actually effect it given that mount() doesn't take a
dirfd argument.

The reason that you can do this with open_tree()/move_mount() is that it
allows you to create a mount tree (OPEN_TREE_CLONE) that has no namespace
assignment, pass it through the namespace switch and then attach it inside the
child namespace. The cross-namespace checks in do_move_mount() are bypassed
because the root of the newly-cloned mount tree doesn't have one.

Unfortunately, just searching the newly-cloned mount tree for a conflicting
nsfs mount doesn't help because the potential loop could be hidden several
levels deep.

I think the simplest solution is to either reject a request for
open_tree(OPEN_TREE_CLONE) if there are any nsfs objects in the source tree,
or to just not copy said objects.

David
---

Test script:

mount -t tmpfs none /a
mount --make-shared /a
cd /a
mkdir private_mnt
mount -t tmpfs xxx private_mnt
mount --make-private private_mnt
touch private_mnt/child_ns
unshare --mount=private_mnt/child_ns --propagation=shared \
ls -l /proc/self/ns/mnt
findmnt

~/open_tree 3</a/private_mnt 3 \
nsenter --mount=/a/private_mnt/child_ns \
sh -c '~/move_mount 4</mnt'

grep Shmem: /proc/meminfo
dd if=/dev/zero of=/a/private_mnt/bigfile bs=1M count=10

umount -l /a/private_mnt/
grep Shmem: /proc/meminfo