Re: [PATCH v3 1/3] pidfd: allow pidfd_open() on non-thread-group leaders

From: Tycho Andersen
Date: Fri Jan 26 2024 - 16:51:13 EST


Hi Oleg,

On Thu, Jan 25, 2024 at 03:08:31PM +0100, Oleg Nesterov wrote:
> What do you think?

Thank you, it passes all my tests.

> + /* unnecessary if do_notify_parent() was already called,
> + we can do better */
> + do_notify_pidfd(tsk);

"do better" here could be something like,

diff --git a/kernel/exit.c b/kernel/exit.c
index efe8f1d3a6af..7e545393f2f5 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -742,6 +742,7 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
 	bool autoreap;
 	struct task_struct *p, *n;
 	LIST_HEAD(dead);
+	bool needs_notify = true;

 	write_lock_irq(&tasklist_lock);
 	forget_original_parent(tsk, &dead);
@@ -756,16 +757,21 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
 				!ptrace_reparented(tsk) ?
 			tsk->exit_signal : SIGCHLD;
 		autoreap = do_notify_parent(tsk, sig);
+		needs_notify = false;
 	} else if (thread_group_leader(tsk)) {
-		autoreap = thread_group_empty(tsk) &&
-			do_notify_parent(tsk, tsk->exit_signal);
+		autoreap = false;
+		if (thread_group_empty(tsk)) {
+			autoreap = do_notify_parent(tsk, tsk->exit_signal);
+			needs_notify = false;
+		}
 	} else {
 		autoreap = true;
 	}

 	/* unnecessary if do_notify_parent() was already called,
 	   we can do better */
-	do_notify_pidfd(tsk);
+	if (needs_notify)
+		do_notify_pidfd(tsk);

 	if (autoreap) {
 		tsk->exit_state = EXIT_DEAD;


but even with that, there are other calls in the tree to
do_notify_parent() that might double notify.

This brings up another interesting behavior that I noticed while
testing this: if you do a poll() on a pidfd, followed quickly by a
pidfd_getfd() on the same thread you just got an event on, you can
sometimes get an EBADF from __pidfd_fget() instead of the more
expected ESRCH higher up the stack.
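
A rough userspace sketch of that sequence, for anyone who wants to poke
at it (this is not the test from this series). It assumes a plain forked
child rather than a non-leader thread, so it also runs on kernels without
these patches; the helper name getfd_errno_after_exit() and the fallback
syscall numbers are illustrative assumptions. Which errno comes back
depends on the kernel version and on timing.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <poll.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallbacks for older headers; these are the numbers used on most
 * architectures (assumption, check your arch's unistd.h). */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_pidfd_getfd
#define SYS_pidfd_getfd 438
#endif

/*
 * Hypothetical helper: fork a child that exits immediately, wait for the
 * exit event on its pidfd, then call pidfd_getfd() before reaping it.
 *
 * Returns the errno reported by pidfd_getfd(), 0 if the call
 * unexpectedly succeeded, or -1 if the setup (fork, pidfd_open, poll)
 * failed or timed out.
 */
static int getfd_errno_after_exit(void)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(0);

	int result = -1;
	int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);

	if (pidfd >= 0) {
		struct pollfd pfd = { .fd = pidfd, .events = POLLIN };

		/* The pidfd becomes readable once the task has exited. */
		if (poll(&pfd, 1, 5000) == 1) {
			/* Ask for the child's fd 0 right after the event. */
			int fd = (int)syscall(SYS_pidfd_getfd, pidfd, 0, 0);

			if (fd >= 0) {
				result = 0;
				close(fd);
			} else {
				result = errno;
			}
		}
		close(pidfd);
	}

	/* Reap the child only after probing it. */
	waitpid(pid, NULL, 0);
	return result;
}
```

Calling getfd_errno_after_exit() from main() and printing strerror() of
the result should be enough to see whether a given kernel reports EBADF
or ESRCH here.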

I wonder if it makes sense to abuse ->f_flags to add a PIDFD_NOTIFIED
flag? Then we can refuse further pidfd syscall operations in a sane way,
and also "do better" above by checking this flag from do_notify_pidfd()
before doing it again?

Tycho