Re: [RFC v2 3/7] Improve the tracking of active utilisation

From: Peter Zijlstra
Date: Tue Apr 05 2016 - 11:00:47 EST


On Fri, Apr 01, 2016 at 05:12:29PM +0200, Luca Abeni wrote:
> +static void task_go_inactive(struct task_struct *p)
> +{
> +	struct sched_dl_entity *dl_se = &p->dl;
> +	struct hrtimer *timer = &dl_se->inactive_timer;
> +	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> +	struct rq *rq = rq_of_dl_rq(dl_rq);
> +	ktime_t now, act;
> +	s64 delta;
> +	u64 zerolag_time;
> +
> +	WARN_ON(dl_se->dl_runtime == 0);
> +
> +	/* If the inactive timer is already armed, return immediately */
> +	if (hrtimer_active(&dl_se->inactive_timer))
> +		return;

So while we start the timer on the local cpu, we don't migrate the timer
when we migrate the task, so the callback can happen on a remote cpu,
right?

Therefore the timer function might still be running, having just done
task_rq_unlock(), which allows our cpu to acquire rq->lock and get
here.

Then the hrtimer_active() check above is true (it also reports a timer
whose callback is still executing), so we return early, but effectively
the inactive timer will not run 'again'.
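
A toy userspace model of that interleaving, as a sketch only. It assumes
hrtimer_active() reports true both while the timer is enqueued and while
its callback is still executing; the enum, the struct and every model_*
name are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-state stand-in for an hrtimer. */
enum model_timer_state {
	MODEL_TIMER_IDLE,
	MODEL_TIMER_ENQUEUED,
	MODEL_TIMER_CALLBACK_RUNNING,
};

struct model_task {
	enum model_timer_state timer;
	bool decrement_pending;	/* will anything still subtract the bw? */
};

/* Assumed hrtimer_active() semantics: enqueued OR callback running. */
static bool model_timer_active(const struct model_task *t)
{
	return t->timer != MODEL_TIMER_IDLE;
}

/* Mirrors the early return in the quoted task_go_inactive(). */
static void model_go_inactive(struct model_task *t)
{
	if (model_timer_active(t))
		return;		/* relies on the "armed" timer to do the work */

	t->timer = MODEL_TIMER_ENQUEUED;
	t->decrement_pending = true;
}

/* Replays the interleaving described above and returns the final state. */
static struct model_task model_replay_race(void)
{
	struct model_task t = { MODEL_TIMER_ENQUEUED, true };

	/* Remote cpu: callback runs, does its accounting, drops the rq lock. */
	t.timer = MODEL_TIMER_CALLBACK_RUNNING;
	t.decrement_pending = false;

	/* Local cpu: takes the rq lock, the task blocks again. */
	model_go_inactive(&t);

	/* Remote cpu: callback returns without restarting the timer. */
	t.timer = MODEL_TIMER_IDLE;

	return t;
}
```

In this model the replay ends with the timer idle and nothing pending,
i.e. the second task_go_inactive() neither armed a timer nor subtracted
anything itself.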

> +
> +
> +	/*
> +	 * We want the timer to fire at the "0 lag time", but considering
> +	 * that it is actually coming from rq->clock and not from
> +	 * hrtimer's time base reading.
> +	 */
> +	zerolag_time = dl_se->deadline -
> +		div64_long((dl_se->runtime * dl_se->dl_period),
> +			   dl_se->dl_runtime);
> +
> +	act = ns_to_ktime(zerolag_time);
> +	now = hrtimer_cb_get_time(timer);
> +	delta = ktime_to_ns(now) - rq_clock(rq);
> +	act = ktime_add_ns(act, delta);
> +
> +	/*
> +	 * If the "0-lag time" already passed, decrease the active
> +	 * utilization now, instead of starting a timer
> +	 */
> +	if (ktime_us_delta(act, now) < 0) {
> +		sub_running_bw(dl_se, dl_rq);
> +		if (!dl_task(p))
> +			__dl_clear_params(p);
> +
> +		return;
> +	}
> +
> +	get_task_struct(p);
> +	hrtimer_start(timer, act, HRTIMER_MODE_ABS);
> +}
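
For reference, the arithmetic in the quoted hunk can be checked in
userspace. This is only a sketch: the struct is a stand-in that borrows
the patch's field names, the model_* helpers are hypothetical, and the
numbers are made up. With 2 ms of runtime left from a 10 ms / 100 ms
reservation and a deadline at 100 ms, the zero-lag time comes out at
100 - 2 * 100 / 10 = 80 ms.

```c
#include <assert.h>
#include <stdint.h>

/* Field names follow the quoted patch; the struct itself is a stand-in. */
struct model_dl_se {
	uint64_t deadline;	/* absolute deadline, ns, rq clock */
	int64_t runtime;	/* remaining runtime, ns */
	uint64_t dl_runtime;	/* reserved runtime per period, ns */
	uint64_t dl_period;	/* reservation period, ns */
};

/* deadline - runtime * dl_period / dl_runtime, in rq-clock nanoseconds. */
static uint64_t model_zerolag_time(const struct model_dl_se *se)
{
	assert(se->dl_runtime != 0);	/* the patch uses WARN_ON() here */

	return se->deadline -
	       (uint64_t)(se->runtime * (int64_t)se->dl_period /
			  (int64_t)se->dl_runtime);
}

/* Shift an rq-clock instant into the hrtimer time base. */
static int64_t model_to_hrtimer_base(uint64_t rq_time, int64_t hrtimer_now,
				     uint64_t rq_clock_now)
{
	return (int64_t)rq_time + (hrtimer_now - (int64_t)rq_clock_now);
}
```

The second helper corresponds to the delta computed from
hrtimer_cb_get_time() and rq_clock() in the patch: the expiry is
computed on the rq clock and then offset by the difference between the
two clocks.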


> @@ -1071,6 +1164,23 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
>  	}
>  	rcu_read_unlock();
>
> +	if (rq != cpu_rq(cpu)) {

I don't think this is right, you want:

	if (task_cpu(p) != cpu) {

because @cpu is not necessarily task_cpu(p).

> +		int migrate_active;
> +
> +		raw_spin_lock(&rq->lock);

Which then also means @rq is 'wrong', so you'll have to add:

	rq = task_rq(p);

before this.

> +		migrate_active = hrtimer_active(&p->dl.inactive_timer);
> +		if (migrate_active)
> +			sub_running_bw(&p->dl, &rq->dl);
> +		raw_spin_unlock(&rq->lock);

At this point task_rq() is still the above rq, so if the inactive timer
hits here it will lock this rq and subtract the running bw here _again_,
right?

> +		if (migrate_active) {
> +			rq = cpu_rq(cpu);
> +			raw_spin_lock(&rq->lock);
> +			add_running_bw(&p->dl, &rq->dl);
> +			raw_spin_unlock(&rq->lock);
> +		}
> +	}
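
The window between the two locked sections can be replayed with a toy
userspace model. This is a sketch under assumptions: the timer callback
is taken to subtract the task's bandwidth from whatever task_rq(p)
currently is, task_rq(p) is taken to stay on the old rq until after the
quoted code, and all model_* names and numbers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct model_rq {
	int64_t running_bw;
};

struct model_migration {
	struct model_rq *task_rq;	/* what task_rq(p) would return */
	int64_t task_bw;		/* bandwidth of the migrating task */
};

/* Stand-in for the inactive timer callback: it acts on task_rq(p). */
static void model_inactive_timer(struct model_migration *m)
{
	m->task_rq->running_bw -= m->task_bw;
}

/*
 * Replays the quoted sequence. If timer_in_window is non-zero, the timer
 * fires between the two locked sections.
 */
static void model_migrate(struct model_migration *m, struct model_rq *dst,
			  int timer_in_window)
{
	m->task_rq->running_bw -= m->task_bw;	/* first locked section */

	if (timer_in_window)
		model_inactive_timer(m);	/* task_rq(p) is still the old rq */

	dst->running_bw += m->task_bw;		/* second locked section */
}
```

Starting from 30 on the old rq (10 of which belong to the task) and 0 on
the new one, a timer in the window leaves 10 and 10: the old rq lost
bandwidth that belongs to other tasks, and the new rq keeps 10 that no
pending timer will remove.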