Re: [RFC v3 1/6] Track the active utilisation

From: Juri Lelli
Date: Tue Nov 08 2016 - 12:56:22 EST


On 01/11/16 22:10, Luca Abeni wrote:
> Hi Juri,
>
> On Tue, 1 Nov 2016 16:45:43 +0000
> Juri Lelli <juri.lelli@xxxxxxx> wrote:
>
> > Hi,
> >
> > a few nitpicks on subject and changelog and a couple of questions below.
> >
> > Subject should be changed to something like
> >
> > sched/deadline: track the active utilisation
> Ok; that's easy :)
> I guess a similar change should be applied to the subjects of all the
> other patches, right?
>

Yep. Subjects usually have the form:

<modified_file(s)>: <short title>

>
> >
> > On 24/10/16 16:06, Luca Abeni wrote:
> > > The active utilisation here is defined as the total utilisation of the
> >
> > s/The active/Active/
> > s/here//
> > s/of the active/of active/
> Ok; I'll do this in the next revision of the patchset.
>

Thanks.

>
> > > active (TASK_RUNNING) tasks queued on a runqueue. Hence, it is increased
> > > when a task wakes up and is decreased when a task blocks.
> > >
> > > When a task is migrated from CPUi to CPUj, immediately subtract the task's
> > > utilisation from CPUi and add it to CPUj. This mechanism is implemented by
> > > modifying the pull and push functions.
> > > Note: this is not fully correct from the theoretical point of view
> > > (the utilisation should be removed from CPUi only at the 0 lag time),
> >
> > a more theoretically sound solution will follow.
> Notice that even the next patch (introducing the "inactive timer") ends up
> migrating the utilisation immediately (on tasks' migration), without waiting
> for the 0-lag time.
> This is because of the reason explained in the following paragraph:
>

OK, but it is still _more_ theoretically sound. :)

> > > but doing the right thing would be _MUCH_ more complex (leaving the
> > > timer armed when the task is on a different CPU... Inactive timers should
> > > be moved from per-task timers to per-runqueue lists of timers! Bah...)
> >
> > I'd remove this paragraph above.
> Ok. Re-reading the changelog, I suspect this is not the correct place for this
> comment.
>
>
> > > The utilisation tracking mechanism implemented in this commit can be
> > > fixed / improved by decreasing the active utilisation at the so-called
> > > "0-lag time" instead of when the task blocks.
> >
> > And maybe this as well, or put it as more information about the "more
> > theoretically sound" solution?
> Ok... I can remove the paragraph, or point to the next commit (which
> implements the more theoretically sound solution). Is such a "forward
> reference" in changelogs ok?
>

I'd just say that a better solution will follow. The details about why
it's better can then go in the changelog and in code comments of the
next patch.

> [...]
> > > @@ -947,14 +965,19 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
> > > return;
> > > }
> > >
> > > + if (p->on_rq == TASK_ON_RQ_MIGRATING)
> > > + add_running_bw(&p->dl, &rq->dl);
> > > +
> > > /*
> > > * If p is throttled, we do nothing. In fact, if it exhausted
> > > * its budget it needs a replenishment and, since it now is on
> > > * its rq, the bandwidth timer callback (which clearly has not
> > > * run yet) will take care of this.
> > > */
> > > - if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH))
> > > + if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH)) {
> > > + add_running_bw(&p->dl, &rq->dl);
> >
> > Don't remember if we discussed this already, but do we need to add the bw here
> > even if the task is not actually enqueued until after the replenishment timer
> > fires?
> I think yes... The active utilization does not depend on whether the task
> is on the runqueue, but on the task's state (in GRUB parlance,
> "inactive" vs "active contending"). In other words, even when a task is throttled
> its utilization must be counted in the active utilization.
>

OK. Could you add a comment about this point please (so that I don't
forget again :)?

>
> [...]
> > > /*
> > > * Since this might be the only -deadline task on the rq,
> > > * this is the right place to try to pull some other one
> > > @@ -1712,6 +1748,7 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
> > > */
> > > static void switched_to_dl(struct rq *rq, struct task_struct *p)
> > > {
> > > + add_running_bw(&p->dl, &rq->dl);
> > >
> > > /* If p is not queued we will update its parameters at next wakeup. */
> > > if (!task_on_rq_queued(p))
> >
> > Don't we also need to remove bw in task_dead_dl()?
> I think task_dead_dl() is invoked after invoking dequeue_task_dl(), which takes care
> of this... Or am I wrong? (I think I explicitly tested this, and modifications to
> task_dead_dl() turned out to be unneeded)
>

Mmm. You explicitly check for TASK_ON_RQ_MIGRATING or DEQUEUE_SLEEP
(which, btw, could actually be combined into a single OR condition), and I
don't think either of those turns out to be true when the task dies.
Also, AFAIU, do_exit() works on current and the TASK_DEAD case is
handled in finish_task_switch(), so I don't think we are taking care of
the "task is dying" condition.

Peter, does what I'm saying make any sense? :)

I still have to set up things here to test these patches (sorry, I was
travelling), but could you try creating some tasks and then killing them
from another shell, to see whether the accounting deviates or not? Or did
you already do this test?

Thanks,

- Juri