Re: [RFC 1/3] pidfd: allow pidfd_open() on non-thread-group leaders

From: Tycho Andersen
Date: Fri Dec 08 2023 - 15:04:22 EST


On Thu, Dec 07, 2023 at 10:25:09PM +0100, Christian Brauner wrote:
> > If these concerns are correct
>
> So, ok. I misremembered this. The scenario I had been thinking of is
> basically the following.
>
> We have a thread-group with thread-group leader 1234 and a thread with
> 4567 in that thread-group. Assume current thread-group leader is tsk1
> and the non-thread-group leader is tsk2. tsk1 uses struct pid *tg_pid
> and tsk2 uses struct pid *t_pid. The struct pids look like this after
> creation of both thread-group leader tsk1 and thread tsk2:
>
>     TGID 1234                           TID 4567
>     tg_pid[PIDTYPE_PID]  = tsk1         t_pid[PIDTYPE_PID]  = tsk2
>     tg_pid[PIDTYPE_TGID] = tsk1         t_pid[PIDTYPE_TGID] = NULL
>
> IOW, tsk2's struct pid has never been used as a thread-group leader and
> thus PIDTYPE_TGID is NULL. Now assume someone does create pidfds for
> tsk1 and for tsk2:
>
>     tg_pidfd = pidfd_open(tsk1)             t_pidfd = pidfd_open(tsk2)
>     -> tg_pidfd->private_data = tg_pid      -> t_pidfd->private_data = t_pid
>
> So we stash away struct pid *tg_pid for a pidfd_open() on tsk1 and we
> stash away struct pid *t_pid for a pidfd_open() on tsk2.
>
> If we wait on that task via P_PIDFD we get:
>
>     /* waiting through pidfd */
>     waitid(P_PIDFD, tg_pidfd)           waitid(P_PIDFD, t_pidfd)
>     tg_pid[PIDTYPE_TGID] == tsk1        t_pid[PIDTYPE_TGID] == NULL
>     => succeeds                         => fails
>
> Because struct pid *tg_pid is used as a thread-group leader struct pid
> we can wait on tsk1. But we can't via the non-thread-group leader
> pidfd because the struct pid *t_pid has never been used as a
> thread-group leader.
>
> Now assume tsk2 (the thread using t_pid) exec's and the struct pids
> are transferred. IIRC, we get:
>
>     tg_pid[PIDTYPE_PID]  = tsk2         t_pid[PIDTYPE_PID]  = tsk1
>     tg_pid[PIDTYPE_TGID] = tsk2         t_pid[PIDTYPE_TGID] = NULL
>
> If we wait on that task via P_PIDFD we get:
>
>     /* waiting through pidfd */
>     waitid(P_PIDFD, tg_pidfd)           waitid(P_PIDFD, t_pidfd)
>     tg_pid[PIDTYPE_TGID] == tsk2        t_pid[PIDTYPE_TGID] == NULL
>     => succeeds                         => fails
>
> Which is what we want. So effectively this should all work and I
> misremembered the struct pid linkage. So afaict we don't even have a
> problem here which is great.
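
To make sure I'm reading the tables above the same way, here is a toy
userspace model of the linkage. It is only a sketch, not kernel code:
struct pid_model, model_can_wait(), model_exec_from_thread() and
run_scenario() are names I made up for illustration, and the rule "a
P_PIDFD wait succeeds iff the PIDTYPE_TGID slot is non-NULL" is my
assumption, taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; these are NOT the kernel's definitions. */
enum pid_type { PIDTYPE_PID, PIDTYPE_TGID, PIDTYPE_MAX };

struct task {
	const char *comm;
};

struct pid_model {
	int nr;
	struct task *tasks[PIDTYPE_MAX];
};

/* Assumed rule: waiting via a pidfd needs a PIDTYPE_TGID link. */
static int model_can_wait(const struct pid_model *pid)
{
	return pid->tasks[PIDTYPE_TGID] != NULL;
}

/*
 * Hypothetical helper: the non-leader thread exec's. The two tasks swap
 * their PIDTYPE_PID links and the TGID link on tg_pid moves to the new
 * leader, matching the "after exec" table in the mail.
 */
static void model_exec_from_thread(struct pid_model *tg_pid,
				   struct pid_model *t_pid)
{
	struct task *old_leader = tg_pid->tasks[PIDTYPE_PID];
	struct task *new_leader = t_pid->tasks[PIDTYPE_PID];

	/* The thread's struct pid was never a thread-group leader pid. */
	assert(t_pid->tasks[PIDTYPE_TGID] == NULL);

	tg_pid->tasks[PIDTYPE_PID] = new_leader;
	t_pid->tasks[PIDTYPE_PID] = old_leader;
	tg_pid->tasks[PIDTYPE_TGID] = new_leader;
	/* t_pid->tasks[PIDTYPE_TGID] is left untouched and stays NULL. */
}

/* Walks the scenario from the mail; returns 0 if every step matches. */
int run_scenario(void)
{
	struct task tsk1 = { "leader" }, tsk2 = { "thread" };
	struct pid_model tg_pid = { 1234, { &tsk1, &tsk1 } };
	struct pid_model t_pid = { 4567, { &tsk2, NULL } };

	/* Before exec: only the leader's struct pid can be waited on. */
	if (!model_can_wait(&tg_pid) || model_can_wait(&t_pid))
		return -1;

	model_exec_from_thread(&tg_pid, &t_pid);

	/* After exec: tsk2 sits behind tg_pid, tsk1 behind t_pid. */
	if (tg_pid.tasks[PIDTYPE_PID] != &tsk2 ||
	    tg_pid.tasks[PIDTYPE_TGID] != &tsk2 ||
	    t_pid.tasks[PIDTYPE_PID] != &tsk1)
		return -2;

	/* The wait results are unchanged by the exec. */
	if (!model_can_wait(&tg_pid) || model_can_wait(&t_pid))
		return -3;

	return 0;
}
```

Calling run_scenario() from a main() and checking for 0 is enough to
exercise it. The point of the model is just that the wait result for
each pidfd is the same before and after the exec, even though the task
behind each struct pid has changed.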

It sounds like we need some tests for waitpid() directly though, to
ensure the semantics stay stable. I can add those and send a v3,
assuming the location of do_notify_pidfd() looks ok to you in v2:

https://lore.kernel.org/all/20231207170946.130823-1-tycho@tycho.pizza/

Tycho