Re: [rfc] epoll interface change and glibc bits ...

From: Davide Libenzi (davidel@xmailserver.org)
Date: Tue Nov 19 2002 - 22:46:44 EST


On Wed, 20 Nov 2002, Jamie Lokier wrote:

> I have a question about:
>
> > struct epoll_fd {
> > int fd;
> > unsigned short events;
> > unsigned short revents;
> > __uint64_t obj;
> > };
>
> What value does the `fd' field have when a file descriptor being
> polled has been renumbered (by dup/close or dup2/close or
> fcntl(F_DUPFD)/close or passing through a unix domain socket)?
>
> If we are honest, the `obj' field is absolutely essential as its the
> only value which uniquely identifies the file descriptor if you have
> done anything unusual with the fds.
>
> The `fd' field, on the other hand, is not guaranteed to correspond
> with the correct file descriptor number. So.... perhaps the structure
> should contain an `obj' field and _no_ `fd' field?
>
> This doesn't affect applications. Those which use `obj' for something
> interesting (i.e. a pointer) will have the `fd' value stored in the
> pointed-to data structure, while simple applications can just store
> the original `fd' value in `obj' in the first place.

Even if I agree with you here, this will make the API asymmetrical. We
will have :

struct epoll_fd {
        unsigned short events;
        unsigned short revents;
        __uint64_t obj;
};

int epoll_ctl(int epfd, int op, int fd, struct epoll_fd *pfd);

Where the "fd" is used only for EPOLL_CTL_ADD, and "obj" for EPOLL_CTL_DEL
and EPOLL_CTL_MOD.

> > It'll be possible to add epfd1 inside epfd2, not epfd1 inside epfd1.
>
> Beware of overflowing the kernel stack. If epfd4 becomes readable,
> and wakes up epfd3, which wakes up epfd2, which wakes up epfd1... If
> that is implemented recursively than I can write malicious code which
> will crash the kernel. Note that this isn't a cycle. It's possible
> to code the wakeups so this cannot happen but still have the expected
> behaviour.
>
> A circular arrangement should be fine, if silly. The semantics are
> quite logical and don't require special cases: epfd2 becoming readable
> will trigger epfd1 to become readable if it isn't already. If you
> make a cycle, that's silly but still behaves as you'd expect. If
> epfd1 becomes readable, it wakes up epfd1... which is already
> readable so nothing further happens. Similarly with larger cycles.
> Assuming you've avoided stack overflow for acyclic graphs, there won't
> be any problem with cyclic ones.

This is a problem with the new callback'd wake_up(). I'd be tempted to not
permit epoll fd inclusion inside other epoll fds.

- Davide

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Nov 23 2002 - 22:00:30 EST