Re: device namespaces

From: Greg Kroah-Hartman
Date: Tue Jun 08 2021 - 08:42:00 EST


On Tue, Jun 08, 2021 at 02:30:50PM +0200, Christian Brauner wrote:
> On Tue, Jun 08, 2021 at 11:38:16AM +0200, Enrico Weigelt, metux IT consult wrote:
> > Hello folks,
> >
> >
> > I'm going to implement device namespaces, where containers can get an
> > entirely different view of the devices in the machine (usually just a
> > specific subset, but possibly additional virtual devices).
> >
> > For a start, I'd like to add a simple mapping of dev maj/min (leaving aside
> > sysfs, udev, etc). An important requirement for me is that the parent ns
> > can choose to delegate devices from those it has full access to (child
> > namespaces can do the same to their children), and the assignment can
> > change (for simplicity ignoring the case of removing devices that are
> > already opened by some process - haven't decided yet whether they should
> > be forcefully closed or whether keeping them open is a valid use case).
> >
> > The big question for me now is how exactly to do the table maintenance
> > from userland. We already have entries in /proc/<pid>/ns/*. I'm thinking
> > about using them as a command channel, like this:
> >
> > * new child namespaces are created with an empty mapping
> > * mapping manipulation is done by just writing commands to the ns file
> > * access is only granted if the writing process itself is in the
> > parent's device ns and has CAP_SYS_ADMIN (or maybe there could be some
> > admin user for the ns ? or the 'root' of the corresponding user_ns ?)
> > * if the caller has some restrictions on some particular device, these
> > are automatically added (eg. if you're restricted to readonly, you
> > can't give rw to the child ns).
> >
> > Is this a good way to go ? Or what would be a better one ?
>
> Ccing Greg. Without addressing specific problems, I should warn you that
> this idea is not new and the plan is unlikely to go anywhere. Especially
> not without support from Greg.

Hah, yeah, this is a non-starter.

Enrico, what real problem are you trying to solve by doing this? And
have you tried anything with this yet? We almost never talk about
"proposals" without seeing real code as it's pointless to discuss things
when you haven't even proven that it can work.

So let's see code before even talking about this...

And as Christian points out, you can do this today without any kernel
changes, so to think you need to modify the kernel means that you
haven't even tried this at all?

greg k-h