Re: [PATCH 0/4] pid: add pidctl()

From: Jann Horn
Date: Mon Mar 25 2019 - 17:15:52 EST


On Mon, Mar 25, 2019 at 9:40 PM Jonathan Kowalski <bl0pbl33p@xxxxxxxxx> wrote:
> On Mon, Mar 25, 2019 at 8:34 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
> >
> > [...SNIP...]
> >
> > Please don't do that. /proc/$pid/fd refers to the set of file
> > descriptors the process has open, and semantically doesn't have much
> > to do with the identity of the process. If you want to have a procfs
> > directory entry for getting a pidfd, please add a new entry. (Although
> > I don't see the point in adding a new procfs entry for this when you
> > could instead have an ioctl or syscall operating on the procfs
> > directory fd.)
>
> There is no new entry. What I was saying (and I should have been
> clearer) is that the existing entry for the fd when open'd with
> O_DIRECTORY makes the kernel resolve the symlink to /proc/<PID> of the
> process it maps to, so it would become:
>
> int dirfd = open("/proc/self/fd/3", O_DIRECTORY|O_CLOEXEC);

That still seems really weird. This magically overloads O_DIRECTORY,
which means "fail if the thing is not a directory", to suddenly have
an entirely different meaning for one magical special type of file. On
top of that, unlike an ioctl or a new syscall, it doesn't convey
explicit intent and increases the risk of confused deputy issues.
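
As a minimal sketch of what O_DIRECTORY means today (the helper names
are illustrative only and are not part of the patch series): opening a
non-directory with O_DIRECTORY fails with ENOTDIR, and that also holds
when you go through the /proc/self/fd/<n> magic link of an ordinary
file descriptor. This assumes procfs is mounted at /proc.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Illustrative helper: returns 0 if open(path, O_DIRECTORY) succeeds,
 * otherwise the errno value reported by open().
 */
static int try_open_directory(const char *path)
{
	int fd = open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);

	if (fd < 0)
		return errno;
	close(fd);
	return 0;
}

/*
 * Illustrative helper: the same check, but via the /proc/self/fd/<fd>
 * magic link of an already-open file descriptor.
 */
static int try_open_directory_via_procfd(int fd)
{
	char path[64];

	snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
	return try_open_directory(path);
}
```

With an fd that refers to a regular file or device node, both helpers
report ENOTDIR. The proposal would make the second one succeed for
exactly one kind of fd and return a different object.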

> This also means you cannot cross the filesystem boundary, the said
> process needs to have a visible entry (which would mean hidepid= and
> gid= based access controls are honored), and you can only open the
> dirfd of a process in the current ns (as the PID will not map to an
> existent process if the pidfd maps to a process not in the same or
> children pid ns, in fdinfo it lists -1 in the pid field (we might not
> even need fdinfo anymore)).

AFAICS that doesn't have anything to do with whether you do this as a
syscall, as an ioctl, or as a jumped symlink. The kernel would have to
do the same security checks in any of those cases - only a classic,
non-jumped symlink would implicitly go through the existing permission
checks. And if you implement this with a non-jumped symlink, you get
races.
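
To illustrate the race, here is a rough sketch of what following a
classic symlink amounts to, under the assumption that its target would
be the text "/proc/<pid>". The function name is hypothetical and this is
userspace code for illustration, not kernel code.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: resolve a process directory purely by PID
 * number, the way a textual "/proc/<pid>" link target would be.
 */
static int open_proc_dir_by_number(pid_t pid)
{
	char path[64];

	/* Step 1: the link target is only text containing a PID number. */
	snprintf(path, sizeof(path), "/proc/%d", (int)pid);

	/*
	 * Race window: the process may exit here and the number may be
	 * reused, so the path can name an unrelated process by the time
	 * it is opened.
	 */

	/* Step 2: ordinary path walk with the usual procfs checks. */
	return open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
}
```

The ordinary path walk in step 2 is what gives you the existing
permission checks for free, and the gap between the two steps is where
the race comes from.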