Re: sched/autogroup: race if !sysctl_sched_autogroup_enabled ?

From: Oleg Nesterov
Date: Fri Nov 11 2016 - 11:58:41 EST


On 11/10, Oleg Nesterov wrote:
>
> And the 3rd case which I didn't think about yesterday. And now I really hope
> it can explain the vmcore we have.
>
> If sysctl_sched_autogroup_enabled was enabled and then disabled, it is
> possible that the "autogrouped" process runs with ag->kref.refcount == 1,
> and if it does setsid() it frees its active task_group.

And yet another problem ;)

The exiting thread must call sched_move_task() somewhere before exit_notify(),
or it can run with the freed task_group() after that. And this means that the
no-longer-needed PF_EXITING check in task_wants_autogroup() will be needed
again. Simple, but it needs comments and a changelog...
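
A rough user-space model of the ordering problem, not kernel code. Everything
prefixed with model_ or MODEL_ is hypothetical, and the model assumes that
nothing else moves the thread once it is past the exit_notify() point, and
that the PF_EXITING check makes an exiting task fall back to the root group.
The "free" is recorded in a flag so the sketch never touches freed memory.

```c
#include <stdbool.h>

#define MODEL_PF_EXITING 0x4	/* flag value is arbitrary in this model */

struct model_tg { bool freed; };	/* stands in for task_group */

struct model_task {
	unsigned int flags;
	struct model_tg *autogroup_tg;	/* group owned by the autogroup */
	struct model_tg *active_tg;	/* group the scheduler actually uses */
};

static struct model_tg model_root_tg;	/* stands in for root_task_group */

/* Models task_wants_autogroup(): an exiting task must not pick the autogroup. */
static bool model_wants_autogroup(const struct model_task *p)
{
	return !(p->flags & MODEL_PF_EXITING);
}

/* Models sched_move_task(): re-evaluate which group the task runs in. */
static void model_sched_move_task(struct model_task *p)
{
	p->active_tg = model_wants_autogroup(p) ? p->autogroup_tg
						: &model_root_tg;
}

/* Returns true if the exiting task is left on a group that was freed. */
bool model_exit_uses_freed_tg(bool move_before_exit_notify)
{
	struct model_tg ag_tg = { .freed = false };
	struct model_task p = {
		.flags = 0,
		.autogroup_tg = &ag_tg,
		.active_tg = &ag_tg,
	};

	p.flags |= MODEL_PF_EXITING;
	if (move_before_exit_notify)
		model_sched_move_task(&p);	/* leave the autogroup first */

	/* Assumed: past this point nobody moves the thread any more,
	 * and the last autogroup reference goes away. */
	ag_tg.freed = true;

	return p.active_tg->freed;
}
```

In this model, model_exit_uses_freed_tg(false) leaves the task on the freed
group, while model_exit_uses_freed_tg(true) ends with the task on the root
group. Without the exiting-flag check the early move would just pick the
autogroup again, which is why the two changes go together here.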

> So I am going to send the patch which simply moves the sysctl check from
> autogroup_move_group() to sched_autogroup_create_attach(), but perhaps I
> should split this change?
>
> I mean, the first patch for -stable could just remove the current check,
> the 2nd one will add it into sched_autogroup_create_attach().

No, this is not enough, see above.
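
For reference, the enabled-then-disabled race quoted at the top can be modeled
in user space roughly as follows. This is a sketch, not kernel code: all
model_ names are hypothetical, the only behaviour taken from the thread is
that the sysctl check sits inside the move function, and the "free" is
recorded in a flag instead of calling free().

```c
#include <assert.h>
#include <stdbool.h>

struct model_tg { bool freed; };	/* stands in for task_group */

struct model_ag {			/* stands in for struct autogroup */
	int refcount;
	struct model_tg tg;
};

struct model_task {
	struct model_ag *signal_autogroup;	/* models signal->autogroup */
	struct model_tg *active_tg;		/* group the scheduler uses */
};

static bool model_sysctl_enabled = true;

/* Drop one reference; the group counts as freed when it hits zero. */
static void model_ag_put(struct model_ag *ag)
{
	assert(ag->refcount > 0);
	if (--ag->refcount == 0)
		ag->tg.freed = true;
}

/* Models the move with the sysctl check placed inside it. */
static void model_move_group(struct model_task *p, struct model_ag *ag)
{
	struct model_ag *prev = p->signal_autogroup;

	ag->refcount++;
	p->signal_autogroup = ag;
	if (model_sysctl_enabled)
		p->active_tg = &ag->tg;	/* the sched_move_task() step */
	model_ag_put(prev);		/* may free the group still in use */
}

/* Returns true if the task ends up running on a freed group. */
bool model_setsid_after_disable(void)
{
	struct model_ag old_ag = { .refcount = 1 };
	struct model_ag new_ag = { .refcount = 0 };
	struct model_task p = {
		.signal_autogroup = &old_ag,
		.active_tg = &old_ag.tg,
	};

	model_sysctl_enabled = false;	/* autogroup disabled after the fact */
	model_move_group(&p, &new_ag);	/* the setsid() path */

	return p.active_tg->freed;
}
```

With the check inside the move, the old reference is dropped but the task is
never switched to another group, so model_setsid_after_disable() reports that
the active group is the freed one.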

I am starting to think that we should just move ->autogroup from signal_struct
to task_struct. This will simplify the code and fix all these problems. But
I need a simple fix for backporting anyway.

Oleg.