Re: [RFC][PATCH v2] sched/rt: Use IPI to trigger RT task push migration instead of pulling

From: Hillf Danton
Date: Wed Feb 25 2015 - 02:57:15 EST


> +static void try_to_push_tasks(void *arg)
> +{
> + struct rt_rq *rt_rq = arg;
> + struct rq *rq, *next_rq;
> + int next_cpu = -1;
> + int next_prio = MAX_PRIO + 1;
> + int this_prio;
> + int src_prio;
> + int prio;
> + int this_cpu;
> + int success;
> + int cpu;
> +
> + /* Make sure we can see csd_cpu */
> + smp_rmb();
> +
> + this_cpu = rt_rq->push_csd_cpu;
> +
> + /* Paranoid check */
> + BUG_ON(this_cpu != smp_processor_id());
> +
> + rq = cpu_rq(this_cpu);
> +
> + /*
> + * If there's nothing to push here, then see if another queue
> + * can push instead.
> + */
> + if (!has_pushable_tasks(rq))
> + goto pass_the_ipi;
> +
> + raw_spin_lock(&rq->lock);
> + success = push_rt_task(rq);
> + raw_spin_unlock(&rq->lock);
> +
> + if (success)
> + goto done;

Does the latency (150us over a 20 hour run) go up if we goto done directly?

Hillf
> +
> + /* Nothing was pushed, try another queue */
> +pass_the_ipi:
> +
> + /*
> + * We use the priority that determined to send to this CPU
> + * even if the priority for this CPU changed. This is used
> + * to determine what other CPUs to send to, to keep from
> + * doing a ping pong from each CPU.
> + */
> + this_prio = rt_rq->push_csd_prio;
> + src_prio = rt_rq->highest_prio.curr;
> +
> + for_each_cpu(cpu, rq->rd->rto_mask) {
> + if (this_cpu == cpu)
> + continue;
> +
> + /*
> + * This function was called because some rq lowered its
> + * priority. It then searched for the highest priority
> + * rq that had overloaded tasks and sent an smp function
> + * call to that cpu to call this function to push its
> + * tasks. But when it got here, the task was either
> + * already pushed, or due to affinity, could not move
> + * the overloaded task.
> + *
> + * Now we need to see if there's another overloaded rq that
> + * has an RT task that can migrate to that CPU.
> + *
> + * We need to be careful, we do not want to cause a ping
> + * pong between this CPU and another CPU that has an RT task
> + * that can migrate, but not to the CPU that lowered its
> + * priority. Since the lowering priority CPU finds the highest
> + * priority rq to send to, we will ignore any rq that is of higher
> + * priority than this current one. That is, if a rq scheduled a
> + * task of higher priority, the schedule itself would do the
> + * push or pull then. We can safely ignore higher priority rqs.
> + * And if there's one that is the same priority, since the CPUS
> + * are searched in order we will ignore CPUS of the same priority
> + * unless the CPU number is greater than this CPU's number.
> + */
> + next_rq = cpu_rq(cpu);
> +
> + /* Use a single read for the next prio for decision making */
> + prio = READ_ONCE(next_rq->rt.highest_prio.next);
> +
> + /* Looking for highest priority */
> + if (prio >= next_prio)
> + continue;
> +
> + /* Make sure that the rq can push to the source rq */
> + if (prio >= src_prio)
> + continue;
> +
> + /* If the prio is higher than the current prio, ignore it */
> + if (prio < this_prio)
> + continue;
> +
> + /*
> + * If the prio is equal to the current prio, only use it
> + * if the cpu number is greater than the current cpu.
> + * This prevents a ping pong effect.
> + */
> + if (prio == this_prio && cpu < this_cpu)
> + continue;
> +
> + next_prio = prio;
> + next_cpu = cpu;
> + }
> +
> + /* Nothing found, do nothing */
> + if (next_cpu < 0)
> + goto done;
> +
> + /*
> + * Now we can not send another smp async function due to locking,
> + * use irq_work instead.
> + */
> +
> + rt_rq->push_csd_cpu = next_cpu;
> + rt_rq->push_csd_prio = next_prio;
> +
> + /* Make sure the next cpu is seen on remote CPU */
> + smp_mb();
> +
> + irq_work_queue_on(&rt_rq->push_csd_work, next_cpu);
> +
> + return;
> +
> +done:
> + rt_rq->push_csd_pending = 0;
> +
> + /* Now make sure the src CPU can see this update */
> + smp_wmb();
> +}
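
For anyone following the selection rules in the loop above, here is a minimal
userspace sketch of how the next CPU is picked, as I read the quoted patch. It
is a model only, not kernel code: struct cpu_state, pick_next_cpu() and the
"overloaded" flag are hypothetical stand-ins for rto_mask membership and
rq->rt.highest_prio.next, and MAX_PRIO is assumed to be 140. As in the kernel,
a lower number means a higher priority.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PRIO 140 /* assumed value; lower number = higher priority */

/* Hypothetical stand-in for the per-CPU state the loop reads. */
struct cpu_state {
	bool overloaded; /* models membership in rd->rto_mask */
	int next_prio;   /* models rq->rt.highest_prio.next */
};

/*
 * Model of the pass_the_ipi loop: return the CPU the IPI would be
 * forwarded to, or -1 if no CPU qualifies.
 *
 * this_prio models rt_rq->push_csd_prio (the priority that caused the
 * IPI to be sent to this CPU), src_prio models the source rq's
 * highest_prio.curr.
 */
static int pick_next_cpu(const struct cpu_state *cpus, int ncpus,
			 int this_cpu, int this_prio, int src_prio)
{
	int next_cpu = -1;
	int next_prio = MAX_PRIO + 1;
	int cpu;

	assert(cpus && this_cpu >= 0 && this_cpu < ncpus);

	for (cpu = 0; cpu < ncpus; cpu++) {
		int prio;

		if (cpu == this_cpu || !cpus[cpu].overloaded)
			continue;

		prio = cpus[cpu].next_prio;

		/* Looking for the highest priority candidate */
		if (prio >= next_prio)
			continue;

		/* The candidate must be able to push to the source rq */
		if (prio >= src_prio)
			continue;

		/* Ignore rqs of higher priority than the current one */
		if (prio < this_prio)
			continue;

		/* Same priority: only move forward in CPU order */
		if (prio == this_prio && cpu < this_cpu)
			continue;

		next_prio = prio;
		next_cpu = cpu;
	}

	return next_cpu;
}
```

With this_cpu = 1 and this_prio = 15, a CPU 0 whose next prio is also 15 is
never picked, which is the rule that keeps the IPI from bouncing back and
forth between two CPUs of equal priority.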
