Quoting Stefan Berger (stefanb@xxxxxxxxxxxxxxxxxx):
> On 05/08/2017 02:11 PM, Serge E. Hallyn wrote:
> > Root in a non-initial user ns cannot be trusted to write a traditional
> > security.capability xattr. If it were allowed to do so, then any
> > unprivileged user on the host could map his own uid to root in a private
> > namespace, write the xattr, and execute the file with privilege on the
> > host.
> >
> > However supporting file capabilities in a user namespace is very
> > desirable. Not doing so means that any programs designed to run with
> > limited privilege must continue to support other methods of gaining and
> > dropping privilege. For instance a program installer must detect
> > whether file capabilities can be assigned, and assign them if so but set
> > setuid-root otherwise. The program in turn must know how to drop
> > partial capabilities, and do so only if setuid-root.
>
> Hi Serge,
>
> I have been looking at patch below primarily to learn how we could
> apply a similar technique to security.ima and security.evm for a
> namespaced IMA. From the paragraphs above I thought that you solved
> the problem of a shared filesystem where one now can write different
> security.capability xattrs by effectively supporting for example
> security.capability[uid=1000] and security.capability[uid=2000]
> written into the filesystem. Each would then become visible as
> security.capability if the userns mapping is set appropriately.
> However, this doesn't seem to be how it is implemented. There seems
> to be only a single such entry with uid appended to it and, if it
> was a shared filesystem, the first one to set this attribute blocks
> everyone else from writing the xattr. Is that how it works? Would

Yes, that's how this works here. I'd considered allowing multiple
entries, but I didn't feel that was needed for this case. In a previous
implementation (which is probably in the lkml archives somewhere) I
supported variable length xattr so that multiple containers could
each write a value tagged with their own userns.rootid. Instead,
in the final version, if root in any parent container writes an
xattr, it will take effect in child user namespaces. Which is
sensible - the parent presumably laid out the filesystem to create
the child container.
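
To illustrate the rule just described (an xattr written by root in a
parent container takes effect in child user namespaces), here is a
small Python model. It is only a sketch, not the kernel code: the
UserNs class, its root_kuid field and xattr_applies() are names made up
for this example, the uid values are arbitrary, and it assumes the
xattr carries a single rootid that is compared against the root uid of
the current namespace and each of its ancestors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UserNs:
    """Toy user namespace: tracks only which host uid is root inside it."""
    name: str
    root_kuid: int                      # host-side uid seen as uid 0 in this ns
    parent: Optional["UserNs"] = None   # None for the initial (host) namespace


def xattr_applies(rootid: int, ns: UserNs) -> bool:
    """Return True if an xattr tagged with `rootid` is honoured in `ns`.

    Assumption of this sketch: the xattr is honoured when `rootid` is the
    root uid of `ns` itself or of any ancestor namespace.
    """
    cur: Optional[UserNs] = ns
    while cur is not None:
        if cur.root_kuid == rootid:
            return True
        cur = cur.parent
    return False


# Hypothetical layout: host h1, container c1, and c2 nested inside c1.
h1 = UserNs("h1", root_kuid=0)
c1 = UserNs("c1", root_kuid=100000, parent=h1)
c2 = UserNs("c2", root_kuid=200000, parent=c1)
```

In this model an xattr written by c1's root (rootid 100000) is honoured
in c1 and c2 but not on the host, and one written by c2's root is not
honoured in c1.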

> that work differently with an overlay filesystem ? I think a similar

Certainly an overlay filesystem should be an easy case as the container
can have its own copy of the inode with its own xattr. Btrfs/zfs
would be nicer as the whole file wouldn't need to be copied.

> model could also work for IMA, but maybe you have some thoughts. The
> only thing I would be concerned about is blocking the parent
> container's root user from setting an xattr.

So if you have container c1 creating child container c2 on host h1,
then if c1 creates an xattr, can c2 not use that? And if h1 writes it,
can c1 and c2 use it?

If they can't, then I guess for IMA multiple xattrs would need to be
supported.
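
For concreteness, the "multiple xattrs" alternative might look roughly
like the Python sketch below, going by the security.capability[uid=1000]
and security.capability[uid=2000] example in the quoted mail. This is
purely hypothetical: the bracketed name format is taken from that
example only, tagged_name(), set_cap() and get_cap() are invented here,
and a plain dict stands in for the xattrs stored on one inode.

```python
from typing import Dict, Optional

BASE = "security.capability"


def tagged_name(rootid: int) -> str:
    """Build the per-namespace xattr name (hypothetical format)."""
    return f"{BASE}[uid={rootid}]"


def set_cap(xattrs: Dict[str, bytes], rootid: int, value: bytes) -> None:
    """Store a value under the writer's own rootid; other entries are untouched."""
    xattrs[tagged_name(rootid)] = value


def get_cap(xattrs: Dict[str, bytes], rootid: int) -> Optional[bytes]:
    """Return the value written for `rootid`, or None if there is none."""
    return xattrs.get(tagged_name(rootid))


# One shared inode, two unrelated containers whose root uids are 1000 and 2000.
inode_xattrs: Dict[str, bytes] = {}
set_cap(inode_xattrs, 1000, b"value-from-container-a")
set_cap(inode_xattrs, 2000, b"value-from-container-b")
```

With one entry per rootid, neither writer blocks the other, at the cost
of variable-length xattr state on every shared inode.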
-serge