[PATCH 0/3] fuse: Allow mounts in containers

From: Seth Forshee
Date: Mon Jul 14 2014 - 15:19:20 EST


These patches allow unprivileged users to mount with fuse from within
containers. The first patch is really just a bug fix and related only
because the bug allows unprivileged users to crash the system. The
second patch translates the pid which is making a request into the
server's pid namespace, and the third adds user namespace support to
fuse. Support is limited to the "fuse" fs type. fuseblk could likely
be supported as well, but I haven't spent any time testing it, and I
haven't really given cuse much consideration at all (though cuse ioctls
look rather frightening).

The server's pid and user namespaces are both assumed to be those of the
process which calls mount. These don't necessarily have to be the same
as those of the server, especially since fuse mounts are routinely done
by a process other than the server. However, I didn't find any way to
ensure that we use those of the server with the information currently
available to fuse in the kernel. If the mount is done from a different
namespace it could result in reduced functionality, but it should
not result in any privileges not already available to the user.
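
To make the pid translation concrete, here is a small userspace model
of the idea, not the patch itself. All names (model_pid_ns, model_task,
model_pid_nr_ns) are hypothetical and exist only for illustration. The
model assumes that a task has one pid number per namespace level it is
visible in, and that a task which is not visible from the server's
namespace is reported as pid 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not kernel structures. */
#define MODEL_MAX_LEVELS 4

struct model_pid_ns {
	int level;                          /* 0 = initial namespace */
	const struct model_pid_ns *parent;  /* NULL for the initial namespace */
};

struct model_task {
	const struct model_pid_ns *ns;      /* innermost namespace of the task */
	int nr[MODEL_MAX_LEVELS];           /* pid number at each level, indexed by level */
};

/*
 * Return the pid of task t as seen from namespace viewer, or 0 if the
 * task is not visible there (assumed convention for "unknown pid").
 */
static int model_pid_nr_ns(const struct model_task *t,
			   const struct model_pid_ns *viewer)
{
	const struct model_pid_ns *ns = t->ns;

	/* Walk up from the task's namespace to the viewer's level. */
	while (ns && ns->level > viewer->level)
		ns = ns->parent;

	/* The task is visible only if viewer is on its ancestor chain. */
	if (ns != viewer)
		return 0;

	return t->nr[viewer->level];
}
```

In this model, the namespace passed as viewer would be the one recorded
at mount time. A request from a task outside that namespace yields 0,
which is the kind of reduced functionality mentioned above rather than
any extra privilege.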

In preparing these patches I spent some time considering the security
aspects of allowing fuse mounts from containers. fuse is already
sufficiently untrusting of input from userspace, and it has mechanisms
to prevent several types of attacks. However, some of these mechanisms
rely on having a trusted setuid root helper (fusermount) to enforce
policy, such as forcing certain mount options for unprivileged mounts. In
a container we can't rely on a userspace helper to enforce policy. Here
are details about how these issues work out:

* devices: fusermount forces nodev for unprivileged mounts. In these
patches I use the existing kernel support for forcing nodev for fuse
mounts from user namespaces.

* set[ug]id files: fusermount also forces nosuid for unprivileged
mounts. In a user namespace all file uids and gids are treated as
being mapped into the user ns, so it's not possible to setuid to
anything outside the server's namespace. This means setuid can't be
used to gain elevated privileges, and thus the kernel doesn't need to
force nosuid.

* mounting over files or directories: fusermount ensures that the
unprivileged user has write permissions to the mountpoint before
mounting. But since mounting is only allowed with CAP_SYS_ADMIN in the
user ns of the mount ns, a user cannot use a namespace to mount over
any files or directories unless the user already had the ability to do
so or does so in a different mount ns. Namespaces therefore
don't open the door to this type of attack, and kernel enforcement is
not needed.

* affecting behavior of other users' processes: A user could DoS other
users' processes if those processes accessed files or directories
within a fuse mount. For this reason the default behavior of fuse is
that only the mount owner can access the filesystem. This can be
overridden with the allow_other mount option, but fusermount forbids
this option unless allowed by system policy in /etc/fuse.conf.

To protect against this, these patches change the meaning of
allow_other slightly, from "any user can access this filesystem" to
"users in the mount owner's namespace or a child namespace can access
this filesystem." This protects more privileged contexts while
maintaining the existing behavior.

* {user,group}_id mount options: These are being mapped into the user
ns, which prevents specifying any user outside the ns. Any ids which
do not map to the user ns will cause the mount to fail.
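
Two of the checks in the list above, the allow_other namespace test and
the {user,group}_id mapping, can be illustrated with a userspace model.
This is only a sketch under simplifying assumptions and not the code in
the patches: every name is hypothetical, each namespace has a single id
extent shaped like one line of /proc/<pid>/uid_map, and the extent maps
straight to global ids instead of composing through parent namespaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_INVALID_ID UINT32_MAX  /* stands in for "id has no mapping" */

/* One extent, shaped like a line of /proc/<pid>/uid_map. */
struct model_id_extent {
	uint32_t first;        /* first id as seen inside the namespace */
	uint32_t lower_first;  /* first global id it maps to (simplification) */
	uint32_t count;        /* number of ids in the extent */
};

struct model_user_ns {
	const struct model_user_ns *parent;  /* NULL for the initial namespace */
	struct model_id_extent map;          /* single extent, for brevity */
};

/* Translate a namespace-local id to a global id, or MODEL_INVALID_ID. */
static uint32_t model_map_id(const struct model_user_ns *ns, uint32_t id)
{
	const struct model_id_extent *e = &ns->map;

	if (id < e->first || id - e->first >= e->count)
		return MODEL_INVALID_ID;
	return e->lower_first + (id - e->first);
}

/*
 * user_id= / group_id= handling: return 0 and store the global id, or
 * return -1 so the mount fails when the id does not map into mounter_ns.
 */
static int model_check_mount_id(const struct model_user_ns *mounter_ns,
				uint32_t id, uint32_t *out)
{
	uint32_t global = model_map_id(mounter_ns, id);

	if (global == MODEL_INVALID_ID)
		return -1;
	*out = global;
	return 0;
}

/* allow_other: true if ns is owner_ns itself or one of its descendants. */
static bool model_ns_in_or_below(const struct model_user_ns *ns,
				 const struct model_user_ns *owner_ns)
{
	for (; ns; ns = ns->parent)
		if (ns == owner_ns)
			return true;
	return false;
}
```

With a container namespace mapping ids 0..65535 to global ids starting
at 100000, user_id=1000 resolves to 101000 in this model, while
user_id=70000 has no mapping and the mount is refused. Likewise a
process in the initial namespace or in a sibling container fails the
allow_other test for a mount owned by that container.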

That represents everything I could think of that would be possible as a
consequence of allowing mounts from user namespaces. I also read through
the fuse kernel code (especially the parts handling input from
userspace) looking for additional vectors for attack or any other
weaknesses, but I didn't find anything. So I believe these should be all
the changes needed to make fuse mounts from user namespaces safe, but
please let me know if I missed anything.

Thanks,
Seth


Seth Forshee (3):
  fuse/dev: Fix unbalanced calls to kunmap_atomic() during splice I/O
  fuse: Translate pid making a request into the server's pid namespace
  fuse: Allow mounts from user namespaces

 fs/fuse/dev.c    | 19 +++++++++----------
 fs/fuse/dir.c    | 30 +++++++++++++++++++-----------
 fs/fuse/fuse_i.h |  8 ++++++++
 fs/fuse/inode.c  | 21 ++++++++++++++-------
 4 files changed, 50 insertions(+), 28 deletions(-)
