Re: [RFC PATCH v2] Minimal non-child process exit notification support

From: Aleksa Sarai
Date: Thu Nov 01 2018 - 03:00:48 EST


On 2018-10-29, Daniel Colascione <dancol@xxxxxxxxxx> wrote:
> This patch adds a new file under /proc/pid, /proc/pid/exithand.
> Attempting to read from an exithand file will block until the
> corresponding process exits, at which point the read will successfully
> complete with EOF. The file descriptor supports both blocking
> operations and poll(2). It's intended to be a minimal interface for
> allowing a program to wait for the exit of a process that is not one
> of its children.
>
> Why might we want this interface? Android's lmkd kills processes in
> order to free memory in response to various memory pressure
> signals. It's desirable to wait until a killed process actually exits
> before moving on (if needed) to killing the next process. Since the
> processes that lmkd kills are not lmkd's children, lmkd currently
> lacks a way to wait for a process to actually die after being sent
> SIGKILL; today, lmkd resorts to polling the proc filesystem pid
> entry. This interface allows lmkd to give up polling and instead block
> and wait for process death.

I agree with the need for this interface (with a few caveats), but there
are a few points I'd like to make:

 * I don't think that making a new procfile is necessary. When you open
   /proc/$pid you already have a handle for the underlying process, and
   you can already poll to check whether the process has died (fstatat(2)
   fails, for instance). What if we just used an inotify event to tell
   userspace that the process has died -- to avoid userspace doing a
   poll loop?

 * There is a fairly old interface called the proc_connector which gives
   you global fork+exec+exit events (similar to kevents from FreeBSD
   though much less full-featured). I was working on some patches to
   extend proc_connector so that it could be used inside containers as
   well as by unprivileged users. This would be another way we could
   implement this.
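
To make the first point concrete, here is a rough sketch of the poll loop
that is already possible today with a /proc/$pid handle. It is only an
illustration, not code from the patch: the helper names
(open_proc_handle, process_is_gone, wait_for_exit) and the 50ms interval
are made up for this example, and it assumes a Linux procfs mounted at
/proc. Note that a zombie still counts as present here until its parent
reaps it.

```python
import errno
import os
import time


def open_proc_handle(pid):
    """Open /proc/<pid> as a directory fd (hypothetical helper name).

    The fd stays tied to this particular process, so a recycled pid
    cannot be mistaken for the original one.
    """
    return os.open(f"/proc/{pid}", os.O_RDONLY | os.O_DIRECTORY)


def process_is_gone(proc_fd):
    """Return True once lookups through the handle start failing."""
    try:
        # Equivalent to fstatat(proc_fd, "stat", ...).
        os.stat("stat", dir_fd=proc_fd)
    except OSError as exc:
        # Assumption: a dead process shows up as ENOENT or ESRCH here.
        if exc.errno in (errno.ENOENT, errno.ESRCH):
            return True
        raise
    return False


def wait_for_exit(proc_fd, interval=0.05, timeout=None):
    """Poll the handle until the process is gone.

    Returns True if the process disappeared, False on timeout.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not process_is_gone(proc_fd):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

A caller would do something like fd = open_proc_handle(pid), send
SIGKILL, and then call wait_for_exit(fd, timeout=5). The sleep in that
loop is exactly the part an inotify-style event on the handle would
replace.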

I'm really not a huge fan of the "blocking read" semantic (though if we
have to have it, can we at least provide as much information as you get
from proc_connector -- such as the exit status?). Also maybe we should
integrate this into the exit machinery instead of this loop...

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
