Re: [RFC PATCH v3 3/3] devguard: added device guard for mknod in non-initial userns

From: Christian Brauner
Date: Fri Dec 15 2023 - 11:36:41 EST


On Fri, Dec 15, 2023 at 03:15:33PM +0100, Christian Brauner wrote:
> On Fri, Dec 15, 2023 at 02:26:53PM +0100, Michael Weiß wrote:
> > On 15.12.23 13:31, Christian Brauner wrote:
> > > On Wed, Dec 13, 2023 at 03:38:13PM +0100, Michael Weiß wrote:
> > >> devguard is a simple LSM to allow CAP_MKNOD in non-initial user
> > >> namespace in cooperation of an attached cgroup device program. We
> > >> just need to implement the security_inode_mknod() hook for this.
> > >> In the hook, we check if the current task is guarded by a device
> > >> cgroup using the lately introduced cgroup_bpf_current_enabled()
> > >> helper. If so, we strip out SB_I_NODEV from the super block.
> > >>
> > >> Access decisions to those device nodes are then guarded by existing
> > >> device cgroups mechanism.
> > >>
> > >> Signed-off-by: Michael Weiß <michael.weiss@xxxxxxxxxxxxxxxxxxx>
> > >> ---
> > >
> > > I think you misunderstood me... My point was that I believe you don't
> > > need an additional LSM at all and no additional LSM hook. But I might be
> > > wrong. Only a POC would show.
> >
> > Yeah sorry, I got your point now.
>
> I think I might have had a misconception about how this works.
> A bpf LSM program can't easily alter a kernel object such as struct
> super_block I've been told.

Which is why you need that new hook in there. I get it now. In any case,
I think we can do this slightly nicer (for some definition of nice)...

So the thing below moves the capability check for mknod into the
security_inode_mknod() hook (This should be a separate patch.).

It moves raising SB_I_NODEV into security_sb_device_access() and the old
semantics are retained if no LSM claims device management. If an LSM
claims device management we raise the new flag and don't even raise
SB_I_NODEV. The capability check is namespace aware if device management
is claimed by an LSM. That's backward compatible. And we don't need any
sysctl.

What's missing is that all devcgroup_*() calls should be moved into a
new, unified security_device_access() hook that's called consistently in
all places where that matters such as blkdev_get_by_dev() and so on. Let
the bpf lsm implement that new hook.

Then write a sample BPF LSM as POC that this works. This would also
all other LSMs to do device management if they wanted to.

Thoughts?