Re: [RFC PATCH 5/8] KEYS: exec request-key within the requesting task's init namespace

From: Ian Kent
Date: Fri Feb 20 2015 - 04:33:49 EST


On Wed, 2015-02-18 at 20:31 -0500, J. Bruce Fields wrote:
> On Thu, Feb 19, 2015 at 08:39:01AM +0800, Ian Kent wrote:
> > On Wed, 2015-02-18 at 15:59 -0500, J. Bruce Fields wrote:
> > > On Wed, Feb 18, 2015 at 12:31:32PM -0500, J. Bruce Fields wrote:
> > > > On Wed, Feb 18, 2015 at 12:06:20PM -0500, J. Bruce Fields wrote:
> > > > > On Fri, Feb 06, 2015 at 09:47:25AM +0800, Ian Kent wrote:
> > > > > > On Thu, 2015-02-05 at 15:14 +0000, David Howells wrote:
> > > > > > >
> > > > > > > > + /* If running within a container use the container namespace */
> > > > > > > > + if (current->nsproxy->net_ns != &init_net)
> > > > > > >
> > > > > > > Is that a viable check? Is it possible to have a container that shares
> > > > > > > networking details?
> > > > > >
> > > > > > That's up for discussion.
> > > > > >
> > > > > > I thought about it and concluded that the check is probably not
> > > > > > sufficient for any of the cases.
> > > > > >
> > > > > > I left it like that because I'm not sure exactly what the use cases are,
> > > > > > hoping it promote discussion and here we are.
> > > > > >
> > > > > > I also think the current container environments don't share net
> > > > > > namespace with the root init net namspace, necessarily, because thy are
> > > > > > containers, ;)
> > > > > >
> > > > > > TBH I haven't looked at the user space container creation code but I
> > > > > > expect it could be done that way if it was needed, so the answer is yes
> > > > > > and no, ;)
> > > > > >
> > > > > > The questions then are do we need to check anything else, and what
> > > > > > environment should the callback use in the different cases, and what
> > > > > > other cases might break if we change it?
> > > > > >
> > > > > > For example, should the fs namespace also be checked for all of these
> > > > > > cases, since we're executing a callback, or is whatever that's set to in
> > > > > > the container always what's required for locating the executable.
> > > > >
> > > > > What would be the disadvantage of setting UMH_USE_NS unconditionally
> > > > > here?
> > > >
> > > > In the nfs idmapping case, the mapping is per-nfs_client.
> > > >
> > > > Can nfs_idmap_new be the one that calls umh_get_init_task, with the
> > > > corresponding put done in nfs_idmap_delete, or is there some reason that
> > > > doesn't work?
> > >
> > > It's confusing sorting out possible use cases, but I think both of these
> > > are reasonable:
> > >
> > > - mount an nfs filesystem from the host, then spawn containers
> > > that all use that filesystem.
> > > - mount an nfs filesystem from within a container.
> > >
> > > Your approach might work for the second, but in the first case we'll end
> > > up with idmappers from multiple containers all trying to do the
> > > idmapping for the shared filesystem.
> >
> > These patches are examples for context.
> >
> > Working out whether to run in a namespace or not was always going to be
> > difficult, specifically for the case you point out. Maybe we can make
> > use of some other information, namespace information in the super block
> > perhaps, or something else, or perhaps we will need to add some
> > information for this, not sure yet. We'll need to work together on that.
> >
> > TBH, I'm not that focused on the use cases because the base
> > implementation is still undergoing significant change although I believe
> > the use of a flag to request namespace execution is a good approach.
> > That probably won't change.
>
> The flag requests that we use the container of the currently executing
> task. In neither the nfs idmapper nor the nfsd state-recovery case is
> that the correct choice.

There's a bit more to do but I've done the changes arising from Oleg's
comments so I've turned my attention to the recent discussion here.

Please understand that I added the example usage patches largely to get
this sort of comment because I was pretty sure there would be more to it
than what was in the patches.

I still don't fully understand the requirements so lets talk about them
and to start with the former case above.

I think your right that something needs to be saved in order to be able
to make a decision at access time in a container. That might be as
simple as working out whether the nn->umh_flags needs UMH_USE_NS at a
different time, say at mount time. Taking a reference to something like
the net namespace probably isn't necessary.

But I don't understand the relationship between the net namespace (eg.
is there a dependency on rpc requests and what the helper needs to do),
the user in a container, and the idmapper operation. Can you help with
that please.

The case of nfsd state-recovery might be similar but you'll need to help
me out a bit with that too.

Ian


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/