Re: [PATCH v3 1/3] pidfd: allow pidfd_open() on non-thread-group leaders

From: Oleg Nesterov
Date: Fri Jan 26 2024 - 09:35:12 EST


On 01/26, Christian Brauner wrote:
>
> No, it doesn't. I'm trying to understand what you are suggesting though.
> Are you saying !task || task->exit_state is enough

If PIDFD_THREAD is set, then I think it is enough. Otherwise we still need
!task || (exit_state && thread_group_empty)

> and we shouldn't use
> the helper that was added in commit 38fd525a4c61 ("exit: Factor
> thread_group_exited out of pidfd_poll"). If so what does that buy us
> open-coding the check instead of using that helper? Is there an actual
> bug here?

The patch adds the new xxx_exited(task, excl) helper which checks

!task || (exit_state && (excl || thread_group_empty))

yes, the naming is not good.
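
To spell the check out, here is a rough userspace model of it. This is
only a sketch, not the code from the patch: struct mock_task and its two
fields are made-up stand-ins for what the kernel reads from task_struct,
and xxx_exited is just the placeholder name used above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the task state the check depends on. */
struct mock_task {
	int exit_state;          /* non-zero once the task is a zombie or dead */
	bool thread_group_empty; /* true if no other thread is left in the group */
};

/*
 * excl == true models a PIDFD_THREAD pidfd: only this task matters.
 * excl == false models a process pidfd: the whole group must be gone.
 */
static bool xxx_exited(const struct mock_task *task, bool excl)
{
	return !task ||
	       (task->exit_state && (excl || task->thread_group_empty));
}

/* A few cases that show where the two modes differ. */
static void xxx_exited_examples(void)
{
	/* exit_state value 1 is arbitrary here, any non-zero value works */
	struct mock_task running      = { .exit_state = 0, .thread_group_empty = true };
	struct mock_task zombie_alone = { .exit_state = 1, .thread_group_empty = true };
	struct mock_task zombie_group = { .exit_state = 1, .thread_group_empty = false };

	assert(xxx_exited(NULL, false));           /* already reaped */
	assert(!xxx_exited(&running, true));
	assert(xxx_exited(&zombie_alone, false));
	assert(!xxx_exited(&zombie_group, false)); /* other threads still alive */
	assert(xxx_exited(&zombie_group, true));   /* thread pidfd does not care */
}
```

The only case where the two modes disagree is an exited task whose
thread group still has other threads in it.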

> > Well, I didn't say this is a problem. I simply do not know how/why people
> > use pidfd_poll().
>
> Sorry, I just have a hard time understanding what you wanted then. :)
>
> "I guess it is too late to change this behavior." made it sound like a)
> there's a problem and b) that you would prefer to change behavior. Thus,
> it seems that wait(WNOHANG) hanging when a traced leader of an empty
> thread-group has exited is a problem in your eyes.

Again, I mostly tried to argue against do_notify_pidfd() being called by
release_task().

I think that with PIDFD_THREAD set pidfd_poll() should succeed right
after the exiting thread becomes a zombie (passes exit_notify), whether
it is a leader or not.
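
For reference, this is roughly what the existing process-level behaviour
looks like from userspace. It is a minimal sketch that uses flags == 0
only, since PIDFD_THREAD is what this series proposes and is not assumed
to exist here. The helper name wait_exit_via_pidfd is made up, and
SYS_pidfd_open needs Linux 5.3+ and reasonably recent libc headers.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits at once, then poll() its pidfd.
 * Returns 1 if poll() reported POLLIN, 0 if it did not,
 * -1 if fork() or pidfd_open() failed.
 */
static int wait_exit_via_pidfd(void)
{
	pid_t pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(0);	/* child: single-threaded leader, exits immediately */

	int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (pidfd < 0) {
		waitpid(pid, NULL, 0);
		return -1;
	}

	struct pollfd pfd = { .fd = pidfd, .events = POLLIN };
	int n = poll(&pfd, 1, 5000);	/* 5 second timeout */

	close(pidfd);
	waitpid(pid, NULL, 0);		/* reap the child only after polling */

	return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

Note that the child is reaped only after poll() returns, so POLLIN here
is reported for a zombie, not for a task that was already released.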

Let me quote part of my reply to Tycho's patch:

> +	/*
> +	 * If we're not the leader, notify any waiters on our pidfds. Note that
> +	 * we don't want to notify the leader until /everyone/ in the thread
> +	 * group is dead, viz. the condition below.
> +	 *
> +	 * We have to do this here, since __exit_signal() will
> +	 * __unhash_processes(), and break do_notify_pidfd()'s lookup.
> +	 */
> +	if (!thread_group_leader(p))
> +		do_notify_pidfd(p);

This doesn't look consistent.

If the task is a group leader do_notify_pidfd() is called by exit_notify()
when it becomes a zombie (if no other threads), before it is reaped by its
parent (unless autoreap).

If it is a sub-thread, it is called by release_task() above. Note that a
sub-thread can become a zombie too if it is traced.

Not to mention that this is racy.

I would not mind if we simply move do_notify_pidfd() from exit_notify() to
release_task() and do it regardless of thread_group_leader(). And in some
sense this looks more logical to me.

But as I said:

- I do not know how/why people actually use poll(pidfd)

- it is too late to change the current behaviour

Sorry for the confusion.

> I'm not sure whether you remember that but when we originally did the
> pidfd work you and I discussed thread support and already decided back
> then that having a flag like PIDFD_THREAD would likely be the way to go.

All I can recall is that, yes, we had some discussions about pidfd in
the past ;)

> The PIDFD_THREAD flag would be interesting because we could
> make pidfd_send_signal() support this flag

Agreed,

Oleg.