Re: [RFC] EPOLL_KILLME: New flag to epoll_wait() that subscribes process to death row (new syscall)

From: Colin Walters
Date: Wed Nov 01 2017 - 11:16:43 EST




On Wed, Nov 1, 2017, at 01:32 AM, Shawn Landden wrote:
> It is common for services to be stateless around their main event loop.
> If a process passes the EPOLL_KILLME flag to epoll_wait5() then it
> signals to the kernel that epoll_wait5() may not complete, and the kernel
> may send SIGKILL if resources get tight.
>

I've thought about something like this in the past too and would love
to see it land. Bigger picture, this also comes up in (server) container
environments, see e.g.:

https://docs.openshift.com/container-platform/3.3/admin_guide/idling_applications.html

There's going to be a long slog getting apps to actually make use
of this, but I suspect that if it gets wrapped up nicely in some "framework"
libraries for C/C++, and gets bound in language ecosystems like golang,
we could see a fair amount of adoption within a year or two.

However, while I understand why it feels natural to tie this to epoll,
as the maintainer of glib2, which is used by a *lot* of things, I'm not
sure we're going to port to epoll anytime soon.

Why not just make this a prctl()? It's not like it's really any less racy to do:

prctl(PR_SET_IDLE)
epoll_wait()

and this also allows:

prctl(PR_SET_IDLE)
poll()

And since this is most often just going to be an optional hint, it's easy
to e.g. just ignore EINVAL from the prctl() on kernels that don't know it.
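
To make the idea concrete, here is a rough sketch of how a poll()-based
main loop might wrap the hint. PR_SET_IDLE is purely hypothetical: it is
not in any kernel, the numeric value below is a placeholder I made up, and
the on/off argument and the helper names hint_idle() and poll_idle() are
assumptions for illustration only.

```c
#include <errno.h>
#include <poll.h>
#include <sys/prctl.h>

/* HYPOTHETICAL: no kernel defines PR_SET_IDLE. The value is a placeholder
 * for this sketch; current kernels are expected to reject it with EINVAL. */
#ifndef PR_SET_IDLE
#define PR_SET_IDLE 0x49444c45
#endif

/* Set (idle=1) or clear (idle=0) the hypothetical idle hint.
 * Returns 1 if the kernel accepted it, 0 if the kernel does not know the
 * option (EINVAL, silently ignored), -1 on any other error. */
static int hint_idle(int idle)
{
    if (prctl(PR_SET_IDLE, (unsigned long)idle, 0UL, 0UL, 0UL) == 0)
        return 1;
    return errno == EINVAL ? 0 : -1;
}

/* Drop-in replacement for poll(): mark the process idle while it blocks.
 * The hint is best effort, so failures never change poll()'s result. */
static int poll_idle(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    int hinted = hint_idle(1);
    int ret = poll(fds, nfds, timeout_ms);
    int saved_errno = errno;   /* clearing the hint must not clobber errno */

    if (hinted == 1)
        hint_idle(0);
    errno = saved_errno;
    return ret;
}
```

A main loop would then call poll_idle(fds, nfds, -1) wherever it calls
poll() today. On a kernel without the option this behaves exactly like
plain poll(). The window between the prctl() and the poll() is still
there, same as in the epoll variant.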