Re: [tip: sched/core] sched/rt: Make rt_rq->pushable_tasks updates drive rto_mask

From: Valentin Schneider
Date: Mon Sep 25 2023 - 08:22:54 EST


On 25/09/23 12:11, Peter Zijlstra wrote:
> On Mon, Sep 25, 2023 at 08:55:10AM -0000, tip-bot2 for Valentin Schneider wrote:
>> The following commit has been merged into the sched/core branch of tip:
>>
>> Commit-ID: 612f769edd06a6e42f7cd72425488e68ddaeef0a
>> Gitweb: https://git.kernel.org/tip/612f769edd06a6e42f7cd72425488e68ddaeef0a
>> Author: Valentin Schneider <vschneid@xxxxxxxxxx>
>> AuthorDate: Fri, 11 Aug 2023 12:20:44 +01:00
>> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
>> CommitterDate: Mon, 25 Sep 2023 10:25:29 +02:00
>>
>> sched/rt: Make rt_rq->pushable_tasks updates drive rto_mask
>>
>> Sebastian noted that the rto_push_work IRQ work can be queued for a CPU
>> that has an empty pushable_tasks list, which means nothing useful will be
>> done in the IPI other than queue the work for the next CPU on the rto_mask.
>>
>> rto_push_irq_work_func() only operates on tasks in the pushable_tasks list,
>> but the conditions for that irq_work to be queued (and for a CPU to be
>> added to the rto_mask) rely on rq_rt->nr_migratory instead.
>>
>> nr_migratory is increased whenever an RT task entity is enqueued and it has
>> nr_cpus_allowed > 1. Unlike the pushable_tasks list, nr_migratory includes a
>> rt_rq's current task. This means a rt_rq can have a migratible current, N
>> non-migratible queued tasks, and be flagged as overloaded / have its CPU
>> set in the rto_mask, despite having an empty pushable_tasks list.
>>
>> Make an rt_rq's overload logic be driven by {enqueue,dequeue}_pushable_task().
>> Since rt_rq->{rt_nr_migratory,rt_nr_total} become unused, remove them.
>>
>> Note that the case where the current task is pushed away to make way for a
>> migration-disabled task remains unchanged: the migration-disabled task has
>> to be in the pushable_tasks list in the first place, which means it has
>> nr_cpus_allowed > 1.
>>
>> Reported-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
>> Signed-off-by: Valentin Schneider <vschneid@xxxxxxxxxx>
>> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
>> Tested-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
>> Link: https://lore.kernel.org/r/20230811112044.3302588-1-vschneid@xxxxxxxxxx
>> ---
>> kernel/sched/debug.c | 3 +--
>> kernel/sched/rt.c | 70 ++++++-------------------------------------
>> kernel/sched/sched.h | 2 +-
>> 3 files changed, 10 insertions(+), 65 deletions(-)
>>
>
>> @@ -358,53 +357,6 @@ static inline void rt_clear_overload(struct rq *rq)
>> cpumask_clear_cpu(rq->cpu, rq->rd->rto_mask);
>> }
>>
>> -static void update_rt_migration(struct rt_rq *rt_rq)
>> -{
>> - if (rt_rq->rt_nr_migratory && rt_rq->rt_nr_total > 1) {
>> - if (!rt_rq->overloaded) {
>> - rt_set_overload(rq_of_rt_rq(rt_rq));
>> - rt_rq->overloaded = 1;
>> - }
>> - } else if (rt_rq->overloaded) {
>> - rt_clear_overload(rq_of_rt_rq(rt_rq));
>> - rt_rq->overloaded = 0;
>> - }
>> -}
>> -
>> -static void inc_rt_migration(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq)
>> -{
>> - struct task_struct *p;
>> -
>> - if (!rt_entity_is_task(rt_se))
>> - return;
>> -
>> - p = rt_task_of(rt_se);
>> - rt_rq = &rq_of_rt_rq(rt_rq)->rt;
>> -
>> - rt_rq->rt_nr_total++;
>> - if (p->nr_cpus_allowed > 1)
>> - rt_rq->rt_nr_migratory++;
>> -
>> - update_rt_migration(rt_rq);
>> -}
>> -
>> -static void dec_rt_migration(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq)
>> -{
>> - struct task_struct *p;
>> -
>> - if (!rt_entity_is_task(rt_se))
>> - return;
>> -
>> - p = rt_task_of(rt_se);
>> - rt_rq = &rq_of_rt_rq(rt_rq)->rt;
>> -
>> - rt_rq->rt_nr_total--;
>> - if (p->nr_cpus_allowed > 1)
>> - rt_rq->rt_nr_migratory--;
>> -
>> - update_rt_migration(rt_rq);
>> -}
>
> sched/deadline.c has something very similar, does that need updating
> too?

Hm I think so yes:
- push_dl_task() is an obvious noop if the pushable tree is empty
- pull_dl_task() can be kicked if !rq->dl.overloaded, which similarly to rt
is driven by nr_migratory but could be boiled down to having pushable
tasks (due to the nr_cpus_allowed > 1 constraint).

Lemme poke at it.