Re: finish_task_switch && prev_state (Was: sched, timers: use after free in __lock_task_sighand when exiting a process)

From: Peter Zijlstra
Date: Tue Jul 15 2014 - 09:24:22 EST


On Tue, Jul 15, 2014 at 03:12:40PM +0200, Oleg Nesterov wrote:
> Ah, I am stupid, please ignore.
>
> Of course a TASK_DEAD task can not schedule, but we can race with RUNNING ->
> DEAD transition. So we should only do put_task_struct() if "prev" was already
> TASK_DEAD before we drop the rq locks.

Not so stupid; that is, when I read your question yesterday I didn't
have a ready answer and queued the email to look at later.

This does mean the comment isn't clear enough at the very least.

Does this make it better?

---
kernel/sched/core.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7bc599dc4aa4..aa67f1cfa58e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2211,13 +2211,15 @@ static void finish_task_switch(struct rq *rq, struct task_struct *prev)

/*
* A task struct has one reference for the use as "current".
+ *
* If a task dies, then it sets TASK_DEAD in tsk->state and calls
- * schedule one last time. The schedule call will never return, and
- * the scheduled task must drop that reference.
- * The test for TASK_DEAD must occur while the runqueue locks are
- * still held, otherwise prev could be scheduled on another cpu, die
- * there before we look at prev->state, and then the reference would
- * be dropped twice.
+ * schedule one last time. The schedule call will never return, and the
+ * scheduled task must drop that reference.
+ *
+ * The test for TASK_DEAD must occur while the runqueue locks are still
+ * held, otherwise we can race with RUNNING -> DEAD transitions, and
+ * then the reference would be dropped twice.
+ *
* Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
*/
prev_state = prev->state;

Attachment: pgpKw_VxA50hK.pgp
Description: PGP signature