Re: [PATCH v2] capabilities: new kernel.ns_modules_allowed sysctl

From: Vegard Nossum
Date: Thu Oct 06 2022 - 05:17:02 EST



On 8/15/22 17:50, Serge E. Hallyn wrote:
> On Mon, Aug 15, 2022 at 10:27:53AM +0200, Vegard Nossum wrote:
>> Creating a new user namespace grants you the ability to reach a lot of code
>> (including loading certain kernel modules) that would otherwise be out of
>> reach of an attacker. We can reduce the attack surface and block exploits
>> by ensuring that user namespaces cannot trigger module (auto-)loading.
>
> [...]
>
>> +	/*
>> +	 * Disallow if we're in a user namespace and we don't have
>> +	 * CAP_SYS_MODULE in the init namespace.
>> +	 */
>> +	if (current_user_ns() != &init_user_ns &&
>> +	    !capable(CAP_SYS_MODULE) &&
>
> It's Monday, so maybe I'm thinking wrongly - but I don't believe that you can
> possibly pass capable(CAP_SYS_MODULE) if current_user_ns() != &init_user_ns.
> So I think you can drop the second check.

Hm, I think I see what you're saying -- cap_capable() will not even
search for caps outside the current_cred() namespace and will just
return -EPERM?

	/*
	 * If we're already at a lower level than we're looking for,
	 * we're done searching.
	 */
	if (ns->level <= cred->user_ns->level)
		return -EPERM;
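
To make the reasoning concrete, here is a rough userspace model of that
walk. It is only a sketch: the model_* structs and functions are made up
for illustration, a single bool stands in for the capability set, and the
real logic lives in cap_capable() in security/commoncap.c. It assumes that
capable() amounts to checking against the initial user namespace.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are NOT the kernel's structures. */
struct model_user_ns {
	struct model_user_ns *parent;	/* NULL for the initial namespace */
	int level;			/* 0 for the initial namespace */
	unsigned int owner;		/* uid that created this namespace */
};

struct model_cred {
	struct model_user_ns *user_ns;	/* namespace the task lives in */
	unsigned int euid;
	bool cap_effective;		/* one capability only, for brevity */
};

/* An initial namespace and one child created by uid 1000. */
static struct model_user_ns model_init_ns = { NULL, 0, 0 };
static struct model_user_ns model_child_ns = { &model_init_ns, 1, 1000 };

/* Walk from the target namespace up through its parents. */
static int model_cap_capable(const struct model_cred *cred,
			     struct model_user_ns *targ_ns)
{
	struct model_user_ns *ns = targ_ns;

	for (;;) {
		assert(ns != NULL);

		/* Same namespace: the effective set decides. */
		if (ns == cred->user_ns)
			return cred->cap_effective ? 0 : -EPERM;

		/* Target is not below the task's namespace: stop searching. */
		if (ns->level <= cred->user_ns->level)
			return -EPERM;

		/* The creator of a namespace, in its parent, has all caps in it. */
		if (ns->parent == cred->user_ns && ns->owner == cred->euid)
			return 0;

		ns = ns->parent;
	}
}

/* Assumed equivalent of capable(): check against the initial namespace. */
static int model_capable(const struct model_cred *cred)
{
	return model_cap_capable(cred, &model_init_ns);
}
```

In this model, a task whose user_ns is model_child_ns gets -EPERM from
model_capable() even with cap_effective set, because the initial namespace
(level 0) is never below the task's own namespace (level 1). That is the
case Serge describes.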

I'll submit a v3 -- this sysctl is still useful even with the security
hook for userns creation that just got merged.

Thanks,


Vegard