[tip: sched/core] sched/rt: Fix bad task migration for rt tasks

From: tip-bot2 for Schspa Shi
Date: Sat Apr 22 2023 - 03:44:16 EST


The following commit has been merged into the sched/core branch of tip:

Commit-ID: feffe5bb274dd3442080ef0e4053746091878799
Gitweb: https://git.kernel.org/tip/feffe5bb274dd3442080ef0e4053746091878799
Author: Schspa Shi <schspa@xxxxxxxxx>
AuthorDate: Mon, 29 Aug 2022 01:03:02 +08:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Fri, 21 Apr 2023 13:24:21 +02:00

sched/rt: Fix bad task migration for rt tasks

Commit 95158a89dd50 ("sched,rt: Use the full cpumask for balancing")
allows find_lock_lowest_rq() to pick a task with migration disabled.
The purpose of that commit is to push the currently running task away
from the CPU that has the migrate_disable() task.

However, there is a race which allows a migrate_disable() task to be
migrated. Consider:

CPU0                                    CPU1
push_rt_task
  check is_migration_disabled(next_task)

                                        task not running and
                                        migration_disabled == 0

  find_lock_lowest_rq(next_task, rq);
    _double_lock_balance(this_rq, busiest);
      raw_spin_rq_unlock(this_rq);
      double_rq_lock(this_rq, busiest);
        <<wait for busiest rq>>
                                            <wakeup>
                                        task become running
                                        migrate_disable();
                                          <context out>
  deactivate_task(rq, next_task, 0);
  set_task_cpu(next_task, lowest_rq->cpu);
    WARN_ON_ONCE(is_migration_disabled(p));
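
The window in the sequence above can be modelled outside the kernel. The
following is a minimal single-threaded sketch, not kernel code: struct
model_task, the model_* functions and the window callback are hypothetical
names introduced only for illustration, and the callback stands in for
whatever the other CPU does while the runqueue lock is dropped. It assumes
that re-testing the migration-disable count after the lock is retaken is
what closes the race.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a task (not struct task_struct). */
struct model_task {
	int cpu;                /* CPU whose runqueue currently holds the task */
	bool queued;            /* task is still on a runqueue */
	bool on_cpu;            /* task is executing right now */
	int migration_disabled; /* nesting count, in the spirit of migrate_disable() */
};

/* Models what another CPU may do while the lock is temporarily dropped. */
typedef void (*unlocked_window_fn)(struct model_task *t);

/* The interleaving from the diagram: run, disable migration, get preempted. */
static void run_disable_preempt(struct model_task *t)
{
	t->on_cpu = true;
	t->migration_disabled++;
	t->on_cpu = false; /* preempted: queued again, but not running */
}

/* Nothing happens while the lock is dropped. */
static void no_interference(struct model_task *t)
{
	(void)t;
}

/*
 * Returns the destination CPU, or -1 if the push has to be abandoned.
 * recheck_migration selects between the pre-fix (false) and post-fix (true)
 * revalidation performed after the lock is retaken.
 */
static int model_find_lock_lowest(struct model_task *t, int this_cpu,
				  int lowest_cpu, unlocked_window_fn window,
				  bool recheck_migration)
{
	assert(t != NULL && window != NULL);

	/* Lock dropped and retaken: the task state may change in between. */
	window(t);

	if (t->cpu != this_cpu || t->on_cpu || !t->queued)
		return -1;
	if (recheck_migration && t->migration_disabled)
		return -1;
	return lowest_cpu;
}

/* Returns true if the task was moved to lowest_cpu. */
static bool model_push(struct model_task *t, int this_cpu, int lowest_cpu,
		       unlocked_window_fn window, bool recheck_migration)
{
	int dst;

	/* Initial check; the real push_rt_task() takes a different path here. */
	if (t->migration_disabled)
		return false;

	dst = model_find_lock_lowest(t, this_cpu, lowest_cpu, window,
				     recheck_migration);
	if (dst < 0)
		return false;

	t->cpu = dst; /* models set_task_cpu() */
	return true;
}
```

Calling model_push() with run_disable_preempt and recheck_migration set to
false moves a task whose migration_disabled count is non-zero, which is the
condition the WARN_ON_ONCE() reports. With recheck_migration set to true the
push is abandoned and the task stays on its original CPU.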

Fixes: 95158a89dd50 ("sched,rt: Use the full cpumask for balancing")
Signed-off-by: Schspa Shi <schspa@xxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Reviewed-by: Valentin Schneider <vschneid@xxxxxxxxxx>
Tested-by: Dwaine Gonyier <dgonyier@xxxxxxxxxx>
---
 kernel/sched/deadline.c | 1 +
 kernel/sched/rt.c       | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 4cc7e1c..5a9a4b8 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2246,6 +2246,7 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
 				     !cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) ||
 				     task_on_cpu(rq, task) ||
 				     !dl_task(task) ||
+				     is_migration_disabled(task) ||
 				     !task_on_rq_queued(task))) {
 				double_unlock_balance(rq, later_rq);
 				later_rq = NULL;
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 9d67dfb..00e0e50 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2000,11 +2000,15 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
 			 * the mean time, task could have
 			 * migrated already or had its affinity changed.
 			 * Also make sure that it wasn't scheduled on its rq.
+			 * It is possible the task was scheduled, set
+			 * "migrate_disabled" and then got preempted, so we must
+			 * check the task migration disable flag here too.
 			 */
 			if (unlikely(task_rq(task) != rq ||
 				     !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) ||
 				     task_on_cpu(rq, task) ||
 				     !rt_task(task) ||
+				     is_migration_disabled(task) ||
 				     !task_on_rq_queued(task))) {
 
 				double_unlock_balance(rq, lowest_rq);