[PATCH 4/4] sched: Fix CPU hotplug / tighten is_per_cpu_kthread()

From: Peter Zijlstra
Date: Tue Jan 12 2021 - 09:52:37 EST


Prior to commit 1cf12e08bc4d ("sched/hotplug: Consolidate task
migration on CPU unplug") we'd leave any task on the dying CPU and
break affinity and force them off at the very end.

This scheme had to change in order to enable migrate_disable(). One
cannot wait for migrate_disable() to complete while stuck in
stop_machine(). Furthermore, since we need at the very least: idle,
hotplug and stop threads at any point before stop_machine, we can't
break affinity and/or push those away.

Under the assumption that all per-cpu kthreads are sanely handled by
CPU hotplug, the new code no longer breaks affinity or migrates any of
them (which then includes the critical ones above).

However, there's an important difference between per-cpu kthreads and
kthreads that merely happen to have a single CPU affinity, and that
distinction is lost. The latter class very much relies on the forced
affinity breaking and migration semantics previously provided.

Use the new kthread_is_per_cpu() infrastructure to tighten
is_per_cpu_kthread() and fix the hot-unplug problems stemming from the
change.

Fixes: 1cf12e08bc4d ("sched/hotplug: Consolidate task migration on CPU unplug")
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Tested-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
---
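Not part of the patch: a minimal userspace sketch of the predicate
change, for reviewers who want to poke at the logic in isolation. It
assumes a reduced stand-in for struct task_struct (the hypothetical
struct model_task), stand-in flag values, and a plain bool in place of
whatever kthread_is_per_cpu() reports. None of these names exist in
the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bits for this model only; the values are illustrative. */
#define PF_KTHREAD		0x00200000u
#define PF_NO_SETAFFINITY	0x04000000u

/* Hypothetical, reduced stand-in for struct task_struct. */
struct model_task {
	unsigned int flags;
	int nr_cpus_allowed;
	bool per_cpu;	/* models the answer kthread_is_per_cpu() would give */
};

/* Old check: any kernel thread restricted to exactly one CPU. */
bool old_is_per_cpu_kthread(const struct model_task *p)
{
	if (!(p->flags & PF_KTHREAD))
		return false;

	if (p->nr_cpus_allowed != 1)
		return false;

	return true;
}

/* New check: affinity must be locked and the kthread marked per-cpu. */
bool new_is_per_cpu_kthread(const struct model_task *p)
{
	if (!(p->flags & PF_KTHREAD))
		return false;

	if (!(p->flags & PF_NO_SETAFFINITY))
		return false;

	return p->per_cpu;
}

/* A kthread that merely happens to be affine to a single CPU. */
const struct model_task single_affinity_kthread = {
	.flags = PF_KTHREAD,
	.nr_cpus_allowed = 1,
	.per_cpu = false,
};

/* A genuine per-cpu kthread, bound to its CPU for correctness. */
const struct model_task genuine_per_cpu_kthread = {
	.flags = PF_KTHREAD | PF_NO_SETAFFINITY,
	.nr_cpus_allowed = 1,
	.per_cpu = true,
};

/* Self-check: only the new predicate tells the two classes apart. */
void model_self_check(void)
{
	assert(old_is_per_cpu_kthread(&single_affinity_kthread));
	assert(!new_is_per_cpu_kthread(&single_affinity_kthread));
	assert(old_is_per_cpu_kthread(&genuine_per_cpu_kthread));
	assert(new_is_per_cpu_kthread(&genuine_per_cpu_kthread));
}
```

In this model the single-affinity kthread no longer matches, so on
unplug it would again get its affinity broken and be pushed away,
while the genuine per-cpu kthread is still left alone.
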
kernel/sched/core.c | 8 +++++++-
kernel/sched/sched.h | 12 ++++++++++--
2 files changed, 17 insertions(+), 3 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7276,8 +7276,14 @@ static void balance_push(struct rq *rq)
 	/*
 	 * Both the cpu-hotplug and stop task are in this case and are
 	 * required to complete the hotplug process.
+	 *
+	 * XXX: the idle task does not match is_per_cpu_kthread() due to
+	 * histerical raisins.
 	 */
-	if (is_per_cpu_kthread(push_task) || is_migration_disabled(push_task)) {
+	if (rq->idle == push_task ||
+	    is_per_cpu_kthread(push_task) ||
+	    is_migration_disabled(push_task)) {
+
 		/*
 		 * If this is the idle task on the outgoing CPU try to wake
 		 * up the hotplug control thread which might wait for the
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2692,15 +2692,23 @@ static inline void membarrier_switch_mm(
 #endif
 
 #ifdef CONFIG_SMP
+/*
+ * Match genuine per-cpu kthreads; threads that are bound to a single CPU for
+ * correctness, not kernel threads that happen to have a single CPU affinity.
+ *
+ * Such threads will have PF_NO_SETAFFINITY to ensure userspace cannot
+ * accidentally place them elsewhere -- this also filters out 'early' kthreads
+ * that have PF_KTHREAD set but do not have a struct kthread.
+ */
 static inline bool is_per_cpu_kthread(struct task_struct *p)
 {
 	if (!(p->flags & PF_KTHREAD))
 		return false;
 
-	if (p->nr_cpus_allowed != 1)
+	if (!(p->flags & PF_NO_SETAFFINITY))
 		return false;
 
-	return true;
+	return kthread_is_per_cpu(p);
 }
 #endif