Re: [PATCH v2 0/5] pid: add pidfd_open()

From: Christian Brauner
Date: Sun Mar 31 2019 - 17:10:49 EST


On Sun, Mar 31, 2019 at 02:03:25PM -0700, Linus Torvalds wrote:
> On Sun, Mar 31, 2019 at 1:38 PM Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> >
> > openat(fd to pidfd's proc directory, "status", ...);
> >
> > And we want a non-utterly-crappy way to do this. The ioctl is certainly ugly, but it *works*.
>
> It's beyond clunky. It's a disgrace.
>
> If people really want equivalency between open("/proc/%d") and some
> new pidfd_open(), then just *make* the two equivalent.

I don't think that we want to, or can, make them equivalent, since that
would mean we depend on procfs. If userspace really wants to turn a pidfd
into an fd for /proc/<pid>, then it can be burdened with doing so by
parsing out the pid relative to its procfs pid namespace from the pidfd's
fdinfo:

int pidfd = pidfd_open(pid, 0);
int pid = parse_fdinfo("/proc/self/fdinfo/<pidfd>");
int procpidfd = open("/proc/<pid>", ...);

/* Test if process still exists by sending signal 0 through our pidfd. */
int ret = pidfd_send_signal(pidfd, 0, NULL, PIDFD_SIGNAL_THREAD);
if (ret < 0 && errno == ESRCH) {
	/* pid has been recycled and procpidfd refers to another process */
}

It's race-free and no ioctl() is needed.
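
To make the parse_fdinfo() step a bit more concrete, here is a rough
sketch of what such a helper might look like. It assumes the pidfd's
fdinfo carries a line of the form "Pid:\t<n>", which is an assumption
about the fdinfo layout rather than something this series guarantees.
The names parse_fdinfo_pid() and read_fdinfo_pid() are made up for
illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: pull the pid out of fdinfo text.
 * Assumes a line starting with "Pid:" followed by a decimal number.
 * Returns the pid, or -1 if no such line exists or the value is not
 * a positive number.
 */
pid_t parse_fdinfo_pid(const char *text)
{
	const char *line = text;

	while (line && *line) {
		if (strncmp(line, "Pid:", 4) == 0) {
			char *end;
			long val = strtol(line + 4, &end, 10);

			/* No digits after "Pid:", or not a usable pid. */
			if (end == line + 4 || val <= 0)
				return -1;
			return (pid_t)val;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Hypothetical wrapper: read /proc/self/fdinfo/<fd> and parse it.
 * Returns -1 if the file cannot be read or has no usable Pid: line.
 */
pid_t read_fdinfo_pid(int fd)
{
	char path[64], buf[4096];
	size_t n;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_fdinfo_pid(buf);
}
```

The caller would then open /proc/<pid> with the returned value and do
the signal-0 check through the pidfd as above before trusting the
resulting fd.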

>
> No effing crappy ioctl idiocy to create one from the other. Just make
> the damn things be the exact same thing (and then if we extend clone()
> to return one, make that return the same exact thing too).
>
> Btw, we have several clone bits left:
>
> - if we don't have CLONE_PARENT set, the low 8 bits are still available
>
> - ignoring that, we have bit #12 free: It used to be CLONE_IDLETASK
> long long ago, but it was always kernel-only so it's never been
> exposed as a user space bit.

That's good to know.
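
For reference, a quick sketch of the bit arithmetic behind the "bit #12"
remark. The flag values are copied by hand from the uapi linux/sched.h
as I understand it, under made-up EX_ names so they don't clash with the
real header; treat the table as an assumption and check it against your
tree.

```c
#include <stdint.h>

/* Hand-copied clone flag values (assumed, see uapi linux/sched.h). */
#define EX_CSIGNAL              0x000000ffu /* exit signal mask */
#define EX_CLONE_VM             0x00000100u
#define EX_CLONE_FS             0x00000200u
#define EX_CLONE_FILES          0x00000400u
#define EX_CLONE_SIGHAND        0x00000800u
/* 0x00001000 (bit 12) is the gap, formerly CLONE_IDLETASK */
#define EX_CLONE_PTRACE         0x00002000u
#define EX_CLONE_VFORK          0x00004000u
#define EX_CLONE_PARENT         0x00008000u
#define EX_CLONE_THREAD         0x00010000u
#define EX_CLONE_NEWNS          0x00020000u
#define EX_CLONE_SYSVSEM        0x00040000u
#define EX_CLONE_SETTLS         0x00080000u
#define EX_CLONE_PARENT_SETTID  0x00100000u
#define EX_CLONE_CHILD_CLEARTID 0x00200000u
#define EX_CLONE_DETACHED       0x00400000u
#define EX_CLONE_UNTRACED       0x00800000u
#define EX_CLONE_CHILD_SETTID   0x01000000u
#define EX_CLONE_NEWCGROUP      0x02000000u
#define EX_CLONE_NEWUTS         0x04000000u
#define EX_CLONE_NEWIPC         0x08000000u
#define EX_CLONE_NEWUSER        0x10000000u
#define EX_CLONE_NEWPID         0x20000000u
#define EX_CLONE_NEWNET         0x40000000u
#define EX_CLONE_IO             0x80000000u

/* Return the bits of a 32-bit clone flags word not claimed above. */
uint32_t free_clone_bits(void)
{
	uint32_t used = EX_CSIGNAL |
		EX_CLONE_VM | EX_CLONE_FS | EX_CLONE_FILES |
		EX_CLONE_SIGHAND | EX_CLONE_PTRACE | EX_CLONE_VFORK |
		EX_CLONE_PARENT | EX_CLONE_THREAD | EX_CLONE_NEWNS |
		EX_CLONE_SYSVSEM | EX_CLONE_SETTLS |
		EX_CLONE_PARENT_SETTID | EX_CLONE_CHILD_CLEARTID |
		EX_CLONE_DETACHED | EX_CLONE_UNTRACED |
		EX_CLONE_CHILD_SETTID | EX_CLONE_NEWCGROUP |
		EX_CLONE_NEWUTS | EX_CLONE_NEWIPC | EX_CLONE_NEWUSER |
		EX_CLONE_NEWPID | EX_CLONE_NEWNET | EX_CLONE_IO;

	return ~used;
}
```

With that table, the only unclaimed bit in the 32-bit flags word is
1 << 12, which matches the free bit mentioned above. The low 8 bits
counted here as used are the exit signal mask.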