Re: SCHED_RR vs push-pull

From: Steven Rostedt
Date: Fri Dec 11 2015 - 14:54:10 EST


On Fri, 11 Dec 2015 20:39:18 +0100
Luca Abeni <luca.abeni@xxxxxxxx> wrote:

> Hi Peter,
>
> On Fri, 11 Dec 2015 15:10:28 +0100
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> [...]
> > Thomas just reported a 'fun' problem with our rt 'load-balancer'.
> I suspect the root of the proble is that rt push/pull do not implement
> a load balancer, but just make sure that the M high priority tasks
> (where M is the number of CPUs) are scheduled for execution.
> The difference with a "real" load balancer can be seen when there are
> multiple tasks with the same priority :)

Yep.

>
>
> > The problem is 2 cpus 4 equal prio RR tasks.
> > Suppose an unequal distribution of these tasks among the CPUs; eg 1:3.
> >
> > Now one would expect; barring other constraints; that each CPU would
> > get 2 of the tasks and they would RR on their prio level.
> >
> > This does not happen.
> >
> > The push-pull thing only acts when there's idle or new tasks, and in
> > the above scenario, the CPU with only the single RR task will happily
> > continue running that task, while the other CPU will have to RR
> > between the remaining 3.
> I might be wrong, but I think this is due to the
> if (lowest_rq->rt.highest_prio.curr <= task->prio) {
> in rt.c::find_lock_lowest_rq().
> I suspect that changing "<=" in "<" might fix the issue, at the cost of
> generating a lot of useless tasks migrations.

I'm against this. Unless we only do this when current and the task we
want to move are both RR. It might help.


>
> > Now my initial thoughts were to define a global RR order using a
> > virtual timeline and you'll get something like EEVDF on a per RR prio
> > level with push-pull state between that.
> >
> > Which might be a tad over engineered.
> I suspect this issue can be fixed in a simpler way, by changing the
> check I mentioned above.

What happens when current is FIFO, and we just moved an RR task over
that will now never run?

>
> If you want to balance SCHED_RR tasks with the same priority, I think
> the "lowest_rq->rt.highest_prio.curr <= task->prio" should be extended
> to do the migration if:
> - the local task has a higher priority than the highest priority task
> on lowest_rq (this is what's currently done)
> - the local task has the same priority of the highest priority task on
> lowest_rq and they are SCHED_RR and the number of tasks with
> task->prio on the local RQ is larger than the number of tasks with
> lowest_rq->rt.highest_prio.curr on lowest_rq + 1.

Well, the number of tasks may not be good enough. We need to only look
at RR tasks. Perhaps if current is RR and the waiter on the other CPU
is RR, we can do a scan to see if a balance should be done.

>
> I think this could work, but I just looked at the code, without any
> real test. If you provide a simple program implementing a testcase, I
> can do some experiments in next week.
>
> The alternative (of course I have to mention it :) would be to use
> SCHED_DEADLINE instead of SCHED_RR.

Hmm, I wonder if we could have a wrapper around SCHED_DEADLINE to
implement SCHED_RR. Probably not, because SCHED_RR has hard coded
priorities and SCHED_DEADLINE is more dynamic (and still higher than
SCHED_FIFO).

> >
> > Happy thinking ;-)

Heh, I originally thought Peter said "Happy Thanksgiving".

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/