Re: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability

From: Casey Schaufler
Date: Thu Dec 02 2021 - 10:58:22 EST


On 12/2/2021 5:01 AM, Christian Brauner wrote:
On Thu, Dec 02, 2021 at 01:59:55PM +0100, Christian Brauner wrote:
On Wed, Dec 01, 2021 at 02:29:09PM -0500, James Bottomley wrote:
On Wed, 2021-12-01 at 12:35 -0500, Stefan Berger wrote:
On 12/1/21 11:58, James Bottomley wrote:
On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
From: Denis Semakin <denis.semakin@xxxxxxxxxx>

Use integrity_admin_ns_capable() to check corresponding
capability to allow read/write IMA policy without CAP_SYS_ADMIN
but with CAP_INTEGRITY_ADMIN.

Signed-off-by: Denis Semakin <denis.semakin@xxxxxxxxxx>
---
security/integrity/ima/ima_fs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index fd2798f2d224..6766bb8262f2 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -393,7 +393,7 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
 #else
 		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
 			return -EACCES;
-		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
+		if (!integrity_admin_ns_capable(ns->user_ns))
so this one is basically replacing what you did in RFC 16/20, which
seems a little redundant.

The question I'd like to ask is: is there still a reason for
needing CAP_INTEGRITY_ADMIN? My thinking is that now IMA is pretty
much tied to requiring a user (and a mount, because of
securityfs_ns) namespace, there might not be a pressing need for an
admin capability separated from CAP_SYS_ADMIN because the owner of
the user namespace passes the ns_capable(..., CAP_SYS_ADMIN)
check. The rationale in
Casey suggested using CAP_MAC_ADMIN, which I think would also work.

CAP_MAC_ADMIN (since Linux 2.6.25)
       Allow MAC configuration or state changes. Implemented for
       the Smack Linux Security Module (LSM).


Down the road I think we should cover setting file extended
attributes with the same capability as well for when a user signs
files or installs packages with file signatures. A container runtime
could hold CAP_SYS_ADMIN while setting up a container and mounting
filesystems and drop it for the first process started there. Since we
are using the user namespace to spawn an IMA namespace, we would then
require CAP_SYS_ADMIN to be left available so that the user can do
IMA related stuff in the container (set or append to the policy,
write file signatures). I am not sure whether that should be the case
or rather give the user something finer grained, such as
CAP_MAC_ADMIN. So, it's about granularity...

The important rationale for capabilities is separation
of privilege from user id. Granularity has always been a
contentious issue. Whether you use CAP_SYS_ADMIN or CAP_MAC_ADMIN
you are using privilege, and need to be diligent.

It's possible ... any orchestration system that doesn't enter a user
namespace has to strictly regulate capabilities. I'm probably biased
because I always use a user_ns so I never really had to mess with
capabilities.

https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations

Is effectively "because CAP_SYS_ADMIN is too powerful" but that's
no longer true of the user namespace owner. It only passes the
ns_capable() check not the capable() one, so while it does get
CAP_SYS_ADMIN, it can only use it in a few situations which
represent quite a power reduction already.
At least docker containers drop CAP_SYS_ADMIN.
Well docker doesn't use the user_ns. But even given that,
CAP_SYS_ADMIN is always dropped for most container systems. What
happens when you enter a user namespace is the ns_capable( ...,
CAP_SYS_ADMIN) check returns true if you're the owner of the user_ns,
in the same way it would for root. So effectively entering a user
namespace without CAP_SYS_ADMIN but mapping the owner id to 0 (what
unshare -r --user does) gives you back a form of CAP_SYS_ADMIN that
responds only in the places in the kernel that have a ns_capable()
check instead of a capable() one (most of the places you list below).
This is the principle of how unprivileged containers actually work ...
and the source of some of our security problems if you get back an
ability to do something you shouldn't be allowed to do as an
unprivileged user.
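
The ns_capable() versus capable() distinction described above can be sketched as a toy user-space model. This is only a simplified illustration, loosely following the namespace walk in the kernel's cap_capable(); the struct layouts, the model_* names and the single capability bit are assumptions made for the example, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name below is hypothetical, not a kernel symbol. */
#define MODEL_CAP_SYS_ADMIN (1u << 0)

struct model_user_ns {
	const struct model_user_ns *parent; /* NULL for the initial namespace */
	unsigned int owner_uid;             /* uid, in the parent, that created it */
	int level;                          /* 0 for the initial namespace */
};

struct model_cred {
	const struct model_user_ns *user_ns; /* namespace the task lives in */
	unsigned int euid;
	unsigned int cap_effective;          /* capabilities held in user_ns */
};

/* Does the task hold 'cap' with respect to the 'target' namespace? */
static bool model_ns_capable(const struct model_cred *c,
			     const struct model_user_ns *target,
			     unsigned int cap)
{
	const struct model_user_ns *ns = target;

	for (;;) {
		/* Same namespace: the effective set decides. */
		if (ns == c->user_ns)
			return (c->cap_effective & cap) != 0;
		/* Target is not a descendant of the task's namespace. */
		if (ns->level <= c->user_ns->level)
			return false;
		/* The owner of a child namespace holds all caps over it. */
		if (ns->parent == c->user_ns && ns->owner_uid == c->euid)
			return true;
		ns = ns->parent;
	}
}

/* capable() is modelled as the same check against the initial namespace. */
static bool model_capable(const struct model_cred *c,
			  const struct model_user_ns *init_ns,
			  unsigned int cap)
{
	return model_ns_capable(c, init_ns, cap);
}

/* Self-check for the "unshare -r --user" style scenario from the thread. */
static void model_selfcheck(void)
{
	struct model_user_ns init_ns = { NULL, 0, 0 };
	struct model_user_ns child = { &init_ns, 1000, 1 };
	/* Unprivileged uid 1000 in the initial namespace, owner of 'child'. */
	struct model_cred owner = { &init_ns, 1000, 0 };
	/* The same user after entering 'child', mapped to root there. */
	struct model_cred inside = { &child, 0, MODEL_CAP_SYS_ADMIN };

	assert(model_ns_capable(&owner, &child, MODEL_CAP_SYS_ADMIN));
	assert(model_ns_capable(&inside, &child, MODEL_CAP_SYS_ADMIN));
	assert(!model_capable(&owner, &init_ns, MODEL_CAP_SYS_ADMIN));
	assert(!model_capable(&inside, &init_ns, MODEL_CAP_SYS_ADMIN));
}
```

In this model both the owner of the child namespace and the task running inside it pass the namespaced check, while neither passes the check against the initial namespace, which is the reduced form of CAP_SYS_ADMIN discussed here.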

I am not sure what the decision was based on but probably they don't
want to give the user what is not absolutely necessary, but usage of
user namespaces (with IMA namespaces) would kind of force it to be
available then to do IMA-related stuff ...

Following this man page here
https://man7.org/linux/man-pages/man7/user_namespaces.7.html

CAP_SYS_ADMIN in a user namespace is about

- bind-mounting filesystems

- mounting /proc filesystems

- creating nested user namespaces

- configuring UTS namespace

- configuring whether setgroups() can be used

- usage of setns()


Do we want to add '- only way of *setting up* IMA related stuff' to
this list?
I don't see why not, but other container people should weigh in
because, as I said, I mostly use the user namespace and unprivileged
containers and don't bother with capabilities.
There are very few scenarios where dropping capabilities in an
unprivileged container makes sense. In a lot of other scenarios it is
just a misunderstanding of the meaning of capabilities and their
relationship to user namespaces. Usually, granting a full set of
capabilities to the payload of an unprivileged container is the right
thing to do. All things that are properly namespaced will check
capabilities in the relevant user namespace. Those that aren't will
check them against the initial user namespace.

But I do think the question of whether or not ima should go into
cap_sys_admin is more a question of capability semantics than of
how exactly ima is namespaced. We have agreed before that overloading
cap_sys_admin further isn't ideal. Often we end up rectifying that
mistake later. For example, how we moved stuff like criu, bpf, and perf
to their own capability. Now we're left with stuff like:

static inline bool perfmon_capable(void)
{
	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
}

static inline bool bpf_capable(void)
{
	return capable(CAP_BPF) || capable(CAP_SYS_ADMIN);
}

static inline bool checkpoint_restore_ns_capable(struct user_namespace *ns)
{
	return ns_capable(ns, CAP_CHECKPOINT_RESTORE) ||
		ns_capable(ns, CAP_SYS_ADMIN);
}

for the sake of adhering to legacy behavior. I think we can skip over
that mistake and introduce cap_sys_integrity.
(Or under CAP_MAC_ADMIN as suggested elsewhere in the thread as I saw
just now.)
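
To make the two options concrete, here is a minimal user-space sketch that contrasts a helper with the legacy CAP_SYS_ADMIN fallback (the pattern of the helpers quoted above) with a helper that checks only a dedicated capability. Everything here is an assumption for illustration: the stub struct user_namespace, the stub ns_capable(), the value chosen for CAP_INTEGRITY_ADMIN, and the *_fallback/*_strict names are hypothetical and do not reproduce the RFC's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins so the sketch builds outside the kernel (all hypothetical). */
struct user_namespace {
	unsigned long long caps; /* caps the caller holds in this namespace */
};

#define CAP_SYS_ADMIN       21 /* value from linux/capability.h */
#define CAP_INTEGRITY_ADMIN 41 /* assumed value; proposed in this RFC only */

/* Stub: the real ns_capable() consults the current task's credentials. */
static bool ns_capable(struct user_namespace *ns, int cap)
{
	return (ns->caps & (1ULL << cap)) != 0;
}

/* Variant with the legacy fallback, as in checkpoint_restore_ns_capable(). */
static inline bool integrity_admin_ns_capable_fallback(struct user_namespace *ns)
{
	return ns_capable(ns, CAP_INTEGRITY_ADMIN) ||
		ns_capable(ns, CAP_SYS_ADMIN);
}

/* Variant with no fallback: only the dedicated capability is accepted. */
static inline bool integrity_admin_ns_capable_strict(struct user_namespace *ns)
{
	return ns_capable(ns, CAP_INTEGRITY_ADMIN);
}

/* Self-check showing where the two variants differ. */
static void integrity_selfcheck(void)
{
	struct user_namespace only_sys_admin = { 1ULL << CAP_SYS_ADMIN };
	struct user_namespace only_integrity = { 1ULL << CAP_INTEGRITY_ADMIN };

	assert(integrity_admin_ns_capable_fallback(&only_sys_admin));
	assert(!integrity_admin_ns_capable_strict(&only_sys_admin));
	assert(integrity_admin_ns_capable_fallback(&only_integrity));
	assert(integrity_admin_ns_capable_strict(&only_integrity));
}
```

The two variants differ only for a caller that holds CAP_SYS_ADMIN but not the dedicated capability. One reading of "skip over that mistake" is that starting with a dedicated capability avoids ever needing the fallback form.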