Re: [COMMERCIAL] Re: [PATCH 0/3] kobject: support namespace aware udev

From: Michael J Coss
Date: Wed Sep 09 2015 - 16:55:36 EST


On 9/9/2015 4:28 PM, Greg KH wrote:
> On Wed, Sep 09, 2015 at 04:16:49PM -0400, Michael J Coss wrote:
>> On 9/9/2015 4:09 PM, Greg KH wrote:
>>> On Wed, Sep 09, 2015 at 03:05:29PM -0400, Michael J Coss wrote:
>>>> On 9/8/2015 11:54 PM, Greg KH wrote:
>>>>> On Tue, Sep 08, 2015 at 10:10:27PM -0400, Michael J. Coss wrote:
>>>>>> Currently when a uevent occurs, the event is replicated and sent to every
>>>>>> listener on the kernel netlink socket, ignoring network namespace boundaries,
>>>>>> forwarding events to every listener in every network namespace.
>>>>>>
>>>>>> With the expanded use of containers, it would be useful to be able to
>>>>>> regulate this flow of events to specific containers. By restricting
>>>>>> the events to only the host network namespace, it allows for a userspace
>>>>>> program to provide a system wide policy on which events are routed where.
>>>>> Interesting, but why do you need a container to get a uevent at all?
>>>>> What uevents does a container care about?
>>>>>
>>>>> thanks,
>>>>>
>>>>> greg k-h
>>>>>
>>>> In our use case, we run a full desktop inside the container, including
>>>> X.
>>> Ugh, I was worried you were going to say that :(
>>>
>>>> We run the Xserver in headless mode, and forward a uevent to the
>>>> container to allow binding/unbinding of remote keyboard, mice, and
>>>> displays. So I want the add/del keyboard events, add/del mouse events,
>>>> and add/del display events. This is just one use case, I could imagine
>>>> others. The bottom line is that the current behavior is to broadcast to
>>>> everyone all uevents, and I don't see that as correct as it crosses the
>>>> network namespace boundaries. It seems to me that you would want to
>>>> provide controls as to where you want to forward those uevents, and
>>>> that is not a policy that I believe should be in the kernel but rather
>>>> in user space.
>>> devices are not in namespaces, which is why we don't partition them off
>>> at all. And that's why I really don't want to add this type of
>>> filtering either. It's up to the "master" container/process/whatever to
>>> send uevents to child containers if it really wants to. If we were to
>>> ever have devices bound only to namespaces, then it would make sense to
>>> only send the uevents for those devices to that namespace.
>>>
>>> But as that's never going to happen, I don't want to give people a false
>>> sense of "separation" here that isn't really there at all.
>>>
>>> sorry,
>>>
>>> greg k-h
>>>
>> Agreed that devices are not in namespaces, but the events are, or at
>> least could be.
> No, there's no way to tell which event for which device goes to which
> namespace, as devices are not in a namespace.
Why? The host certainly can have a policy for what devices go to which
container. And as such knows which events go to which container. The
container *is* a set of namespaces and control groups. So a user
program reads the events on the master, looks in a database and forwards
it to that container. The uevents represent the device add/del so it
seems natural that it should be the mechanism by which that
communication happens. I just want to see it controlled by a policy on
the host.
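
A minimal sketch of the kind of host-side forwarder described above,
assuming the kernel's usual uevent payload layout (an "action@devpath"
header followed by NUL-separated KEY=VALUE fields). The POLICY table,
the container name "desktop-1", and the function names are hypothetical
and only for illustration; the actual delivery into the container is
left out.

```python
# Hypothetical host policy: which containers receive events for a
# given SUBSYSTEM. A real deployment would load this from a database.
POLICY = {
    "input": ["desktop-1"],
    "drm": ["desktop-1"],
}


def parse_uevent(payload: bytes) -> dict:
    """Split a raw uevent payload into its KEY=VALUE fields.

    The leading "action@devpath" header contains no '=' and is skipped.
    """
    fields = {}
    for part in payload.split(b"\0"):
        if b"=" in part:
            key, _, value = part.partition(b"=")
            fields[key.decode()] = value.decode()
    return fields


def route(payload: bytes, policy: dict = POLICY) -> list:
    """Return the containers that should be sent this event."""
    fields = parse_uevent(payload)
    return policy.get(fields.get("SUBSYSTEM", ""), [])
```

The point is only that the decision is a lookup made on the host, in
user space, before anything is forwarded.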
>> That master is the host, and to do that I want to
>> forward events that the host receives to those individual containers.
>> But since the kernel is broadcasting them, I can't have that policy on
>> the host, and would have to filter on each container. Or I can do as
>> you say and have the master forward events. I don't see this as putting
>> the devices into a namespace, but rather managing devices from the
>> outside and notifying the container of the event. Just like plugging in
>> a monitor to the container.
> But you can't "plug a monitor into a container". Nor can you "add a
> keyboard to a container". Or a tty device. Or anything else (except
> for network devices). Don't try to fake things out as that's not what
> is happening here. The kernel shouldn't be allowing things to be sent
> only to specific namespaces, as that's a lie, the devices are "global"
> and not in a namespace at all.
Again why? Why are network devices *different*? They are a resource
that is bound to the container, not to a namespace per se, but the
container is a construct: a collection of namespaces and cgroups.
Again, I don't see why you can't add a keyboard to the container.

> sorry,
>
> greg k-h
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>


--
---Michael J Coss
