Re: [PATCH v3 04/10] sched/fair: rework load_balance

From: Parth Shah
Date: Fri Oct 18 2019 - 01:34:21 EST

On 10/16/19 5:26 PM, Vincent Guittot wrote:
> On Wed, 16 Oct 2019 at 09:21, Parth Shah <parth@xxxxxxxxxxxxx> wrote:
>>
>>
>>
>> On 9/19/19 1:03 PM, Vincent Guittot wrote:
>>
>> [...]
>>
>>> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
>>> ---
>>> kernel/sched/fair.c | 585 ++++++++++++++++++++++++++++++++++------------------
>>> 1 file changed, 380 insertions(+), 205 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 017aad0..d33379c 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -7078,11 +7078,26 @@ static unsigned long __read_mostly max_load_balance_interval = HZ/10;
>>>
>>> enum fbq_type { regular, remote, all };
>>>
>>> +/*
>>> + * group_type describes the group of CPUs at the moment of the load balance.
>>> + * The enum is ordered by pulling priority, with the group with lowest priority
>>> + * first so the group_type can be simply compared when selecting the busiest
>>> + * group. See update_sd_pick_busiest().
>>> + */
>>> enum group_type {
>>> - group_other = 0,
>>> + group_has_spare = 0,
>>> + group_fully_busy,
>>> group_misfit_task,
>>> + group_asym_packing,
>>> group_imbalanced,
>>> - group_overloaded,
>>> + group_overloaded
>>> +};
>>> +
>>> +enum migration_type {
>>> + migrate_load = 0,
>>> + migrate_util,
>>> + migrate_task,
>>> + migrate_misfit
>>> };
>>>
>>> #define LBF_ALL_PINNED 0x01
>>> @@ -7115,7 +7130,7 @@ struct lb_env {
>>> unsigned int loop_max;
>>>
>>> enum fbq_type fbq_type;
>>> - enum group_type src_grp_type;
>>> + enum migration_type balance_type;
>>> struct list_head tasks;
>>> };
>>>
>>> @@ -7347,7 +7362,7 @@ static int detach_tasks(struct lb_env *env)
>>> {
>>> struct list_head *tasks = &env->src_rq->cfs_tasks;
>>> struct task_struct *p;
>>> - unsigned long load;
>>> + unsigned long util, load;
>>> int detached = 0;
>>>
>>> lockdep_assert_held(&env->src_rq->lock);
>>> @@ -7380,19 +7395,53 @@ static int detach_tasks(struct lb_env *env)
>>> if (!can_migrate_task(p, env))
>>> goto next;
>>>
>>> - load = task_h_load(p);
>>> + switch (env->balance_type) {
>>> + case migrate_load:
>>> + load = task_h_load(p);
>>>
>>> - if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
>>> - goto next;
>>> + if (sched_feat(LB_MIN) &&
>>> + load < 16 && !env->sd->nr_balance_failed)
>>> + goto next;
>>>
>>> - if ((load / 2) > env->imbalance)
>>> - goto next;
>>> + if ((load / 2) > env->imbalance)
>>> + goto next;
>>> +
>>> + env->imbalance -= load;
>>> + break;
>>> +
>>> + case migrate_util:
>>> + util = task_util_est(p);
>>> +
>>> + if (util > env->imbalance)
>>
>> Can you please explain what would happen with
>> `if (util/2 > env->imbalance)` instead?
>> Just as with load, shouldn't a task be skipped for migration when
>> env->imbalance is merely close to its utilization?
>
> I have chosen util and not util/2 to be conservative, because
> migrate_util is used to fill spare capacity.
> With `if (util/2 > env->imbalance)`, we can more easily overload the
> local group or pick too much utilization from the overloaded group.
>

Fair enough. I missed the point that, unlike with migrate_load, with
migrate_util env->imbalance is just the spare capacity of the local group.
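
To make the distinction concrete, here is a small standalone sketch of the
two checks (hypothetical and simplified, not the kernel code; the helper
names and the numbers are made up for illustration):

#include <stdbool.h>
#include <stdio.h>

/* stand-in for the migrate_load check in detach_tasks() */
static bool can_detach_load(unsigned long task_load, unsigned long imbalance)
{
	/*
	 * migrate_load: imbalance is an amount of load to move, so an
	 * overshoot of up to 2x is tolerated, letting one big task
	 * resolve the imbalance.
	 */
	return (task_load / 2) <= imbalance;
}

/* stand-in for the migrate_util check in detach_tasks() */
static bool can_detach_util(unsigned long task_util, unsigned long spare)
{
	/*
	 * migrate_util: the "imbalance" is the local group's spare
	 * capacity, so pulling more utilization than fits would
	 * overload the local group; no /2 relaxation here.
	 */
	return task_util <= spare;
}

int main(void)
{
	/* a task slightly bigger than the 512 units available */
	printf("load check: %d\n", can_detach_load(600, 512)); /* 1: pulled */
	printf("util check: %d\n", can_detach_util(600, 512)); /* 0: skipped */
	return 0;
}

A 600-unit task is still pulled against a load imbalance of 512
(600/2 <= 512), but skipped when only 512 units of spare capacity are
available, which is exactly the conservative behavior described above.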

Thanks,
Parth

>>
>>> + goto next;
>>> +
>>> + env->imbalance -= util;
>>> + break;
>>> +[ ... ]
>>
>> Thanks,
>> Parth
>>
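
As background to the group_type comment quoted at the top of the patch:
because the enum is ordered by pulling priority, picking the busier of two
groups reduces to a single integer comparison. A minimal sketch of that
idea (simplified; the real update_sd_pick_busiest() in the patch applies
per-type tie-breakers after this comparison):

#include <stdbool.h>
#include <stdio.h>

/* values mirror the patch: ordered by pulling priority, lowest first */
enum group_type {
	group_has_spare = 0,
	group_fully_busy,
	group_misfit_task,
	group_asym_packing,
	group_imbalanced,
	group_overloaded
};

/* a higher enum value means a higher pulling priority */
static bool sketch_pick_busiest(enum group_type candidate,
				enum group_type busiest)
{
	return candidate > busiest;
}

int main(void)
{
	/* an overloaded group beats a misfit group ... */
	printf("%d\n", sketch_pick_busiest(group_overloaded, group_misfit_task));
	/* ... and a group with spare capacity never wins */
	printf("%d\n", sketch_pick_busiest(group_has_spare, group_fully_busy));
	return 0;
}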