Re: [PATCH 3/4] sched/deadline: Make DL capacity-aware

From: Juri Lelli
Date: Wed Apr 15 2020 - 09:21:53 EST


On 15/04/20 11:39, Dietmar Eggemann wrote:
> On 10.04.20 14:52, Juri Lelli wrote:
> > Hi,
> >
> > On 08/04/20 11:50, Dietmar Eggemann wrote:
> >> From: Luca Abeni <luca.abeni@xxxxxxxxxxxxxxx>
>
> [...]
>
> >> @@ -1623,10 +1624,19 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
> >>  		 * other hand, if it has a shorter deadline, we
> >>  		 * try to make it stay here, it might be important.
> >>  		 */
> >> -		if (unlikely(dl_task(curr)) &&
> >> -		    (curr->nr_cpus_allowed < 2 ||
> >> -		     !dl_entity_preempt(&p->dl, &curr->dl)) &&
> >> -		    (p->nr_cpus_allowed > 1)) {
> >> +		select_rq = unlikely(dl_task(curr)) &&
> >> +			    (curr->nr_cpus_allowed < 2 ||
> >> +			     !dl_entity_preempt(&p->dl, &curr->dl)) &&
> >> +			    p->nr_cpus_allowed > 1;
> >> +
> >> +		/*
> >> +		 * We take into account the capacity of the CPU to
> >> +		 * ensure it fits the requirement of the task.
> >> +		 */
> >> +		if (static_branch_unlikely(&sched_asym_cpucapacity))
> >> +			select_rq |= !dl_task_fits_capacity(p, cpu);
> >
> > I'm thinking that, while dl_task_fits_capacity() works well when
> > selecting idle cpus, in this case we should consider the fact that curr
> > might be deadline as well and already consuming some of the rq capacity.
> >
> > Do you think we should try to take that into account, maybe using
> > dl_rq->this_bw ?
>
> So you're saying that cpudl_find(..., later_mask) could return 1 (w/
> best_cpu (cp->elements[0].cpu) in later_mask).
>
> And that this best_cpu could be a non-fitting CPU for p.
>
> This could happen if cp->free_cpus is empty (no idle CPUs) so we take
> cpudl_find()'s else path and in case p's deadline < cp->elements[0]
> deadline.
>
> We could condition the 'return 1' on best_cpu fitting p.
>
> But should we do this for cpudl_find(..., NULL) calls from
> check_preempt_equal_dl() as well or will this break GEDF?

So, by not returning best_cpu when it doesn't fit p's bw requirement
(as suggested above), I think we would be breaking GEDF, which, however,
doesn't take asymmetric capacities into account. OTOH, if we let p
migrate to a cpu that can't accommodate it, it will still miss its
deadlines (plus it would cause deadline misses for the task that was
running on best_cpu).
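
To make the this_bw idea a bit more concrete, here is a rough userspace
sketch, not kernel code and not tested against the actual patch. It
assumes the fitness test compares the task's runtime against its
capacity-scaled deadline, and that this_bw is a fixed-point fraction
with BW_SHIFT = 20. struct dl_params and fits_residual_capacity() are
hypothetical names used only for illustration, and comparing the task's
density (runtime/deadline) against a residual computed from this_bw is
a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT	10	/* capacity 1024 == biggest CPU */
#define BW_SHIFT		20	/* assumed fixed-point shift for bandwidths */
#define BW_UNIT			(1ULL << BW_SHIFT)

/* Hypothetical stand-in for the relevant sched_dl_entity parameters. */
struct dl_params {
	uint64_t runtime;	/* ns */
	uint64_t deadline;	/* relative deadline, ns */
};

/* runtime / period expressed as a fraction of BW_UNIT. */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	assert(period > 0);
	return (runtime << BW_SHIFT) / period;
}

/* Capacity-only test: the capacity-scaled deadline must cover the runtime. */
static bool fits_capacity(const struct dl_params *p, unsigned long cap)
{
	return ((p->deadline * cap) >> SCHED_CAPACITY_SHIFT) >= p->runtime;
}

/*
 * Illustrative variant: subtract the bandwidth already admitted on the
 * rq (this_bw, assumed to be in BW_UNIT fixed point) from the CPU
 * capacity before checking whether p still fits.
 */
static bool fits_residual_capacity(const struct dl_params *p,
				   unsigned long cap, uint64_t this_bw)
{
	/* Convert capacity (0..1024) to the same fixed point as this_bw. */
	uint64_t cap_bw = ((uint64_t)cap << BW_SHIFT) >> SCHED_CAPACITY_SHIFT;

	if (this_bw >= cap_bw)
		return false;

	return to_ratio(p->deadline, p->runtime) <= cap_bw - this_bw;
}
```

With runtime = 4ms and deadline = 10ms, a CPU of capacity 512 passes
fits_capacity(), but fits_residual_capacity() rejects it once a quarter
of BW_UNIT is already in this_bw, which is the case I'm worried about
when curr is a deadline task too.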

check_preempt_equal_dl() worries me less, as it is there to service
corner cases (hopefully not so frequent).