Re: [PATCH 6/6] sched: Simplify set_affinity_pending refcounts

From: Valentin Schneider
Date: Wed Feb 24 2021 - 13:00:00 EST


On 24/02/21 16:34, Peter Zijlstra wrote:
> Elsewhere Valentin argued something like the below ought to be possible.
> I've not drawn diagrams yet, but if I understood his argument right it
> should be possible.
>
> ---
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 1c56ac4df2c9..3ffbd1b76f3e 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2204,9 +2204,10 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
> * then complete now.
> */
> pending = p->migration_pending;
> - if (pending && !pending->stop_pending) {
> + if (pending) {
> p->migration_pending = NULL;
> - complete = true;
> + if (!pending->stop_pending)
> + complete = true;
> }
>
> task_rq_unlock(rq, p, rf);
> @@ -2286,10 +2287,9 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
> if (task_on_rq_queued(p))
> rq = move_queued_task(rq, rf, p, dest_cpu);
>
> - if (!pending->stop_pending) {
> - p->migration_pending = NULL;
> + p->migration_pending = NULL;
> + if (!pending->stop_pending)
> complete = true;
> - }
> }
> task_rq_unlock(rq, p, rf);
>

I was thinking of the "other way around", i.e. modifying migration_cpu_stop()
into:

---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9492f8eb242a..9546f0263970 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1926,6 +1926,11 @@ static int migration_cpu_stop(void *data)
raw_spin_lock(&p->pi_lock);
rq_lock(rq, &rf);

+ /*
+ * If we were passed a pending, then ->stop_pending was set, thus
+ * p->migration_pending must have remained stable.
+ */
+ WARN_ON_ONCE(pending && pending != p->migration_pending);
/*
* If task_rq(p) != rq, it cannot be migrated here, because we're
* holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
@@ -1936,8 +1941,7 @@ static int migration_cpu_stop(void *data)
goto out;

if (pending) {
- if (p->migration_pending == pending)
- p->migration_pending = NULL;
+ p->migration_pending = NULL;
complete = true;
}

@@ -1976,8 +1980,7 @@ static int migration_cpu_stop(void *data)
* somewhere allowed, we're done.
*/
if (cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) {
- if (p->migration_pending == pending)
- p->migration_pending = NULL;
+ p->migration_pending = NULL;
complete = true;
goto out;
}
---
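
To make the reasoning behind that WARN_ON_ONCE() concrete, here's a purely
illustrative, standalone userspace model (not kernel code; the names
pending_model, task_model, sca_arm() etc. are invented, and a sequential call
order stands in for one particular interleaving under rq->lock): once the SCA
side has set ->stop_pending, any later observer backs off and leaves
p->migration_pending alone, so the stopper still finds the pending it was
armed with.

---
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for struct set_affinity_pending and the relevant task fields. */
struct pending_model {
	bool stop_pending;
};

struct task_model {
	struct pending_model *migration_pending;
};

/* First SCA: install a pending and arm the stopper under "rq->lock". */
static void sca_arm(struct task_model *p, struct pending_model *pending)
{
	p->migration_pending = pending;
	pending->stop_pending = true;
}

/*
 * Any later observer (another SCA, or the "complete now" path): with
 * ->stop_pending set it must neither clear nor replace p->migration_pending.
 */
static void sca_later(struct task_model *p)
{
	struct pending_model *pending = p->migration_pending;

	if (pending && pending->stop_pending)
		return;			/* back off, piggy-back on the pending */
	/* (would install / clear its own pending here otherwise) */
}

/* Model of migration_cpu_stop() with the hunks above applied. */
static void stopper(struct task_model *p, struct pending_model *arg_pending)
{
	if (arg_pending) {
		/* userspace stand-in for the added WARN_ON_ONCE() */
		assert(arg_pending == p->migration_pending);
		p->migration_pending = NULL;
		arg_pending->stop_pending = false;
	}
}

int main(void)
{
	struct pending_model pending = { .stop_pending = false };
	struct task_model p = { .migration_pending = NULL };

	sca_arm(&p, &pending);		/* first SCA arms the stopper     */
	sca_later(&p);			/* later observers back off       */
	stopper(&p, &pending);		/* stopper still owns the pending */

	printf("migration_pending after stopper: %p\n",
	       (void *)p.migration_pending);
	return 0;
}
---

The assert() there is just the userspace stand-in for the WARN_ON_ONCE() in
the first hunk.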

Your change reinstates the "triple SCA" pattern, where a stopper can run
with arg->pending && arg->pending != p->migration_pending, which I was
kinda happy to see go away...
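
For completeness, the sort of interleaving I mean, again as a standalone
userspace sketch (same caveats as above: invented names, sequential calls
standing in for a schedule): with the "complete now" path clearing
p->migration_pending even when ->stop_pending is set, a third SCA can install
a fresh pending before the first stopper runs, and that stopper then sees
arg->pending != p->migration_pending.

---
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct pending_model {
	bool stop_pending;
};

struct task_model {
	struct pending_model *migration_pending;
};

/* SCA #1: task must move, install a pending and arm the stopper. */
static struct pending_model *sca1_arm(struct task_model *p,
				      struct pending_model *pending)
{
	p->migration_pending = pending;
	pending->stop_pending = true;
	return pending;		/* this pointer ends up in the stopper's arg */
}

/*
 * SCA #2, with the first hunk of your change: the "task already runs
 * somewhere allowed" path clears p->migration_pending even though
 * ->stop_pending is set (it merely skips the complete).
 */
static void sca2_complete_now(struct task_model *p)
{
	struct pending_model *pending = p->migration_pending;

	if (pending)
		p->migration_pending = NULL;
}

/* SCA #3: finds no pending and installs a fresh one. */
static void sca3_install(struct task_model *p, struct pending_model *pending)
{
	if (!p->migration_pending)
		p->migration_pending = pending;
}

int main(void)
{
	struct task_model p = { .migration_pending = NULL };
	struct pending_model pending1 = { false }, pending3 = { false };
	struct pending_model *stopper_arg;

	stopper_arg = sca1_arm(&p, &pending1);
	sca2_complete_now(&p);
	sca3_install(&p, &pending3);

	/* The stopper queued by SCA #1 only now gets to run: */
	assert(stopper_arg && stopper_arg != p.migration_pending);
	printf("arg->pending=%p vs p->migration_pending=%p\n",
	       (void *)stopper_arg, (void *)p.migration_pending);
	return 0;
}
---

Keeping the !pending->stop_pending check on the clearing (as today) makes
SCA #2 leave pending1 in place, and the stopper finds the very pending it was
queued with.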