Re: [PATCH] kernel/sched: Update schedstats when migrating threads

From: Carlos Bilbao
Date: Wed Feb 23 2022 - 10:34:08 EST


On 2/23/2022 9:28 AM, Phil Auld wrote:
> On Wed, Feb 23, 2022 at 09:14:45AM -0600 Carlos Bilbao wrote:
>> On 2/23/2022 4:19 AM, Peter Zijlstra wrote:
>>> On Wed, Jan 26, 2022 at 09:22:23AM -0600, Carlos Bilbao wrote:
>>>> The kernel manages per-task scheduler statistics or schedstats. Such
>>>> counters should be reinitialized when the thread is migrated to a
>>>> different core rq, except for the values recording number of migrations.
>>>
>>> I'm confused, why should we reset schedstats on migrate? I'm thinking
>>> this breaks per-task, since tasks tend to bounce around quite a lot.
>>>
>>
>> Thanks for your comments, Peter.
>>
>> Looking at the documentation of schedstats I see that most values are
>> actually linked to the particular CPU: time spent on the cpu, timeslices
>> run on this cpu, number of times _something_ was called when the cpu was
>> idle, and so forth. Those values lose their meaning after migration and we
>> should reinitialize their counters. However, reviewing sched_statistics I
>> identify two fields that we should definitely keep increasing even after
>> migration (nr_migrations_cold, nr_forced_migrations).
>>
>
> The documentation is a little off. I think it should say "any cpu" instead
> of "this cpu". If you reset these per task counters (time on cpu, number
> of timeslices etc) on every migration then they become meaningless (and
> useless).
>
>
> Cheers,
> Phil
>

Well that clarifies it! Then, let me ask the opposite question... What
fields of schedstats should we clear when migrating? If there isn't any,
I will just increase the number of migrations.

>> So this patch will have to be upgraded if there's some other value(s) in
>> schedstats that we do not want to reinitialize either.
>>
>>>> Signed-off-by: Carlos Bilbao <carlos.bilbao@xxxxxxx>
>>>> ---
>>>> kernel/sched/core.c | 10 +++++++++-
>>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>>> index fe53e510e711..d64c2a290176 100644
>>>> --- a/kernel/sched/core.c
>>>> +++ b/kernel/sched/core.c
>>>> @@ -8757,6 +8757,7 @@ bool sched_smp_initialized __read_mostly;
>>>> int migrate_task_to(struct task_struct *p, int target_cpu)
>>>> {
>>>> struct migration_arg arg = { p, target_cpu };
>>>> + uint64_t forced_migrations, migrations_cold;
>>>> int curr_cpu = task_cpu(p);
>>>>
>>>> if (curr_cpu == target_cpu)
>>>> @@ -8765,7 +8766,14 @@ int migrate_task_to(struct task_struct *p, int target_cpu)
>>>> if (!cpumask_test_cpu(target_cpu, p->cpus_ptr))
>>>> return -EINVAL;
>>>>
>>>> - /* TODO: This is not properly updating schedstats */
>>>> + if (schedstat_enabled()) {
>>>> + forced_migrations = schedstat_val(p->stats.nr_forced_migrations);
>>>> + migrations_cold = schedstat_val(p->stats.nr_migrations_cold);
>>>> + memset(&p->stats, 0, sizeof(p->stats));
>>>> + schedstat_set(p->stats.nr_forced_migrations, forced_migrations);
>>>> + schedstat_set(p->stats.nr_migrations_cold, migrations_cold);
>>>> + schedstat_inc(p->stats.nr_migrations_cold);
>>>> + }
>>>>
>>>> trace_sched_move_numa(p, curr_cpu, target_cpu);
>>>> return stop_one_cpu(curr_cpu, migration_cpu_stop, &arg);
>>>> --
>>>> 2.27.0
>>>>
>>
>> Thanks,
>> Carlos
>>
>

Thanks,
Carlos