Re: [PATCH] sysctl: add support for poll()

From: Eric W. Biederman
Date: Sun Jun 12 2011 - 11:35:01 EST


Kay Sievers <kay.sievers@xxxxxxxx> writes:

> On Thu, Jun 2, 2011 at 15:56, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>> Lucas De Marchi <lucas.demarchi@xxxxxxxxxxxxxx> writes:
>
>>> With this patch in, if anyone wants to manage a file under /proc/sys
>>> there's really a small amount of code to write. He only has to define
>>> the new poll struct for that file.
>>
>> The support currently appears cumbersome to add, and it adds what appear
>> to be unnecessary wake ups (say when the hostname in another uts
>> namespace changes).
>
> Yeah, *possibly* waking up once a day compared to *unconditionally*
> waking up every second in every namespace and check if something has
> changed. If that *possibly* should be optimized, which I think isn't
> necessary at all, I guess the logic could be moved down to the
> namespaced data.

Unless you happen to be on a system, where someone has decided that it
has a daemon that likes creating a new set of namespaces per connection
to reduce the effect of code bugs. At which point you could be talking
much more than a wake up per second.

>> There is no explanation at all of why you care about the nis domainname.
>
> Just the same reason as the hostname, we need to know when the
> kernel's internal state changes. regardless who did it and why, system
> services need to follow it.

So the problem is that you have a system where userspace can't get it's
act together?

>> Since there does not appear to be a specific problem that this problem
>> is being aimed at, since the code just looks like extra maintenance and
>> since the code needed to support this appears to be unnecessarily
>> cumbersome I am going to nack the patch for now.
>>
>> Nacked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
>
> Please provide solid technical reasons, why we can't to the same we do
> everywhere else since years, or provide working alternatives. Until
> then:
>
> Acked-By: Kay Sievers <kay.sievers@xxxxxxxx>

Looking a little closer the patch is broken.

It changes the return of poll for every file under /proc/sys reporting
no file under /proc/sys is writable unless it implements the new poll
method.

Also with having the wait_queue owned by the calling code either I am
mistake or this breaks hotplug type situations. How do you get things
off of your wait queue when you remove a file from /proc/sys. This
infrastructure as written looks like a setup for use after free
problems.

> I think poll() is the natural interface if you care about the data in
> a file. It's the same an single fd you care, and not some iniotify
> watch, fd, pathname to register, and filter for whatever comes out
> there.

Sort of. poll is really designed for socket and pipe data.

The problem with poll is there is no POLLUPDATED.

> We already use exactly the same semantics for quite some years for
> /proc/mounts, /proc/swaps, /proc/mdstat, ... and all over /sys. The
> patch just provides the missing pieces that /proc provides, but
> /proc/sys is missing.

Good point.

There is still the problem that the infrastructure code for the proc
sysctl files are in much worse shape than the sysfs files.

> I don't disagree, it might be nice to have generic inotify support on
> /proc for this or other problems. But unless you want to work on it
> now, and it would solve these problems, I don't see how we can get the
> functionality we need, and that seems to solved with this patch.
> Besides the fact that I think poll() is the simple and better
> solution.

Which is a question that has never been answered. What functionality
do you need? To me this all looks like a bad science experiment.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/