Re: pidfd design

From: Christian Brauner
Date: Mon Mar 25 2019 - 19:45:55 EST


On Mon, Mar 25, 2019 at 04:42:14PM -0700, Andy Lutomirski wrote:
> On Mon, Mar 25, 2019 at 1:23 PM Daniel Colascione <dancol@xxxxxxxxxx> wrote:
> >
> > On Mon, Mar 25, 2019 at 1:14 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Mar 25, 2019 at 8:44 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>
> > > One ioctl on procfs roots to translate pidfds into that procfs,
> > > subject to both the normal lookup permission checks and only working
> > > if the pidfd has a translation into the procfs:
> > >
> > > int proc_root_fd = open("/proc", O_RDONLY);
> > > int proc_dir_fd = ioctl(proc_root_fd, PROC_PIDFD_TO_PROCFSFD, pidfd);
> > >
> > > And one ioctl on procfs directories to translate from PGIDs and PIDs to pidfds:
> > >
> > > int proc_pgid_fd = open("/proc/self", O_RDONLY);
> > > int self_pg_pidfd = ioctl(proc_pgid_fd, PROC_PROCFSFD_TO_PIDFD, 0);
> > > int proc_pid_fd = open("/proc/thread-self", O_RDONLY);
> > > int self_p_pidfd = ioctl(proc_pid_fd, PROC_PROCFSFD_TO_PIDFD, 0);
> > >
>
> This sounds okay to me. Or we could make it so that a procfs
> directory fd also works as a pidfd, but that seems more likely to be
> problematic than just allowing two-way translation like this.
>
> > >
> > > And then, as you proposed, the new sys_clone() can just return a
> > > pidfd, and you can convert it into a procfs fd yourself if you want.
> >
> > I think that's the consensus we reached on the other thread. The
> > O_DIRECTORY open on /proc/self/fd/mypidfd seems like it'd work well
> > enough.
>
> I must have missed this particular email.
>
> IMO, if /proc/self/fd/mypidfd allows O_DIRECTORY open to work, then it
> really ought to function just like /proc/self/fd/mypidfd/. and
> /proc/self/fd/mypidfd/status should work. And these latter two
> options seem nutty.
>
> Also, this O_DIRECTORY thing is missing the entire point of the ioctl
> interface -- it doesn't require procfs access.

The other option was to encode the pid in the caller's pid namespace into
the pidfd's fdinfo so that you can parse it out and open /proc/<pid>.
You'd just need an event on the pidfd to tell you when the process has
died. Jonathan and I just discussed this.
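
To make the fdinfo variant concrete, here is a rough userspace sketch of
what a caller would do. It assumes the pid shows up in
/proc/self/fdinfo/<pidfd> as a line of the form "Pid:\t<n>"; that field
name and format are an assumption for illustration, and the helper names
parse_pid_from_fdinfo() and open_procdir_for_pidfd() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Find a "Pid:" line in fdinfo text and return its value.
 * The "Pid:" field name is an assumption, not an established format.
 * Returns -1 if the line is missing, malformed, or not a positive pid.
 */
pid_t parse_pid_from_fdinfo(const char *text)
{
    assert(text != NULL);

    const char *line = text;
    while (*line) {
        if (strncmp(line, "Pid:", 4) == 0) {
            const char *p = line + 4;
            /* Skip the separator, but never run onto the next line. */
            while (*p == ' ' || *p == '\t')
                p++;
            if (*p < '0' || *p > '9')
                return -1;
            long val = strtol(p, NULL, 10);
            return val > 0 ? (pid_t)val : -1;
        }
        const char *nl = strchr(line, '\n');
        if (!nl)
            break;
        line = nl + 1;
    }
    return -1;
}

/*
 * Read the fdinfo of pidfd, parse the pid out of it, and open
 * /proc/<pid> as a directory. Returns a directory fd or -1.
 */
int open_procdir_for_pidfd(int pidfd)
{
    char path[64];
    char buf[4096];

    snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    close(fd);
    if (n < 0)
        return -1;
    buf[n] = '\0';

    pid_t pid = parse_pid_from_fdinfo(buf);
    if (pid < 0)
        return -1;

    snprintf(path, sizeof(path), "/proc/%ld", (long)pid);
    return open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
}
```

Note that nothing in this sketch protects against the pid being recycled
between parsing fdinfo and opening /proc/<pid>, which is why the caller
would still need the death event on the pidfd to validate the result.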