Re: [PATCH v2 8/8] sched/fair: use utilization to select misfit task

From: Valentin Schneider
Date: Fri Aug 02 2019 - 06:49:11 EST


On 02/08/2019 09:29, Vincent Guittot wrote:
>> However what would be *even* better IMO would be:
>>
>> -----8<-----
>> @@ -8853,6 +8853,7 @@ voluntary_active_balance(struct lb_env *env)
>> return 1;
>> }
>>
>> + /* XXX: make sure current is still a misfit task? */
>> if (env->balance_type == migrate_misfit)
>> return 1;
>>
>> @@ -8966,6 +8967,20 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>> env.src_rq = busiest;
>>
>> ld_moved = 0;
>> +
>> + /*
>> + * Misfit tasks are only misfit if they are currently running, see
>> + * update_misfit_status().
>> + *
>> + * - If they're not running, we'll get an opportunity at wakeup to
>> + * migrate them to a bigger CPU.
>> + * - If they're running, we need to active balance them to a bigger CPU.
>> + *
>> + * Do the smart thing and go straight for active balance.
>> + */
>> + if (env->balance_type == migrate_misfit)
>> + goto active_balance;
>> +
>
> This looks ugly and adds a new bypass, which this patchset tries to remove.
> This doesn't work if your misfit task has been preempted by another
> one during the load balance and is waiting on the runqueue.

I won't comment on aesthetics, but when it comes to preempted misfit
tasks, do consider that a task *has* to have a utilization >= 80% of its
CPU's capacity to be flagged as misfit.
If it's a busy loop, it can spend at most ~20% of its time preempted and
still be flagged as a misfit task, so going straight for active balance
will be the right thing to do in the majority of cases.
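
For reference, the fit check that gates the misfit flag is just a
utilization vs capacity comparison with a ~1.25x margin (IIRC
capacity_margin = 1280 in fair.c, used by task_fits_capacity() and
update_misfit_status()). Here's a standalone userspace toy of that
arithmetic, only to make the 80% figure concrete - the capacity value is
made up, not taken from any real platform:

#include <stdio.h>

/*
 * Toy version of the fit check: a task "fits" a CPU iff
 *   util * 1280 < capacity * 1024
 * i.e. its utilization is below ~80% of the CPU's capacity. Once that
 * stops holding, the task gets flagged as misfit.
 */
#define TOY_CAPACITY_MARGIN	1280

static int toy_fits_capacity(unsigned long util, unsigned long capacity)
{
	return capacity * 1024 > util * TOY_CAPACITY_MARGIN;
}

int main(void)
{
	unsigned long little_cap = 446;	/* made-up LITTLE CPU capacity */

	/* 80% of 446 is ~357, so that's where the misfit flag kicks in */
	printf("util=300 fits: %d\n", toy_fits_capacity(300, little_cap)); /* 1 */
	printf("util=357 fits: %d\n", toy_fits_capacity(357, little_cap)); /* 0 -> misfit */
	return 0;
}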

What we gain from doing that is that we save ourselves from (potentially
needlessly) iterating over the rq's tasks. That's less work and less rq
lock contention.
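
To spell that out (purely illustrative userspace toy below, none of these
helpers exist in the kernel): the detach_tasks() path has to walk the
busiest rq's task list and run a migratability check on every entry while
holding the rq lock, whereas the bypass only ever looks at the running
task before kicking active balance:

#include <stdio.h>

/* Made-up task/rq layout, only to illustrate the amount of work saved. */
struct toy_task {
	unsigned long util;
	struct toy_task *next;
};

struct toy_rq {
	struct toy_task *tasks;		/* all runnable tasks on this rq */
	struct toy_task *curr;		/* the one currently running */
};

/* detach_tasks()-like path: every queued task gets examined under the lock */
static int toy_detach_path(struct toy_rq *rq)
{
	int visited = 0;

	for (struct toy_task *p = rq->tasks; p; p = p->next)
		visited++;	/* stand-in for a can_migrate_task()-style check */

	return visited;
}

/* misfit bypass: look at rq->curr only, then go for active balance */
static int toy_bypass_path(struct toy_rq *rq)
{
	return rq->curr ? 1 : 0;
}

int main(void)
{
	struct toy_task t[3] = { {900, &t[1]}, {50, &t[2]}, {30, NULL} };
	struct toy_rq rq = { .tasks = &t[0], .curr = &t[0] };

	printf("tasks examined, detach path: %d\n", toy_detach_path(&rq));
	printf("tasks examined, bypass:      %d\n", toy_bypass_path(&rq));
	return 0;
}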

To put a bit of contrast on this, Qais did some profiling of typical
mobile workloads on a regular 4+4 big.LITTLE smartphone and IIRC the rq
depth very rarely rose above 5, although the tail did reach ~100 tasks.
So most of the time it would be fine to go through the detach_tasks()
path.

This all deserves some actual benchmarking, I'll give it a shot.

>> if (busiest->nr_running > 1) {
>> /*
>> * Attempt to move tasks. If find_busiest_group has found
>> @@ -9074,7 +9089,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>> goto out_all_pinned;
>> }
>> }
>> -
>> +active_balance:
>> if (!ld_moved) {
>> schedstat_inc(sd->lb_failed[idle]);
>> /*
>> ----->8-----