[PATCH] sched-rt: Reduce excessive task push rate by not pushing tasks with equal priority as the current task

From: Tim Chen
Date: Thu Jan 22 2015 - 13:54:10 EST



Commit 3be209a8 tries to migrate a task of equal priority to the running
one to other cpus to balance load and eliminate any idle cpus. However,
for a system that is fully busy and running a workload with only a few
priorities, we found that this change causes tasks to be pushed around
without improving cpu utilization. On a fully loaded system running a well
known OLTP benchmark, it causes 70% more run queue locking in the push task
path and degrades throughput by 1.5%. We observe much higher rq lock
contention due to excessive locking of target run queues on task wakeup.

A previous patch we submitted, which added a check to acquire the lock
only on an rq with lower priority tasks, helped; without it the
regression would be 2.0%.
Our suspicion is that there are higher priority tasks that wake up and run
for a short time, and balancing these tasks too aggressively could hurt.

This patch reverts the change, and we got a 1.5% improvement on the well
known OLTP database benchmark. If reverting commit 3be209a8 is not an option,
I would appreciate suggestions on other ways to fix this regression.
Or perhaps an option could be provided not to push equal priority tasks on
wakeup?

Thanks.

Tim

Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
---
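For reviewers, the effect of the comparison change in the two hunks can be
illustrated with a minimal userspace sketch. This is not kernel code: the
struct task_sketch type and the should_push_old()/should_push_new() helpers
are hypothetical names introduced only for illustration, and the sketch
leaves out the rt_task()/dl_task() and p->nr_cpus_allowed checks that
surround the real condition. It assumes the usual kernel convention that a
lower prio value means a higher priority.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the task_struct fields used by the check. */
struct task_sketch {
	int prio;            /* lower value = higher priority */
	int nr_cpus_allowed; /* number of cpus the task may run on */
};

/*
 * Condition before this patch: the woken task p is pushed when curr
 * cannot migrate, or when curr has equal or higher priority than p.
 */
static bool should_push_old(const struct task_sketch *curr,
			    const struct task_sketch *p)
{
	return curr->nr_cpus_allowed < 2 || curr->prio <= p->prio;
}

/*
 * Condition with this patch: the woken task p is pushed when curr
 * cannot migrate, or when curr has strictly higher priority than p.
 */
static bool should_push_new(const struct task_sketch *curr,
			    const struct task_sketch *p)
{
	return curr->nr_cpus_allowed < 2 || curr->prio < p->prio;
}
```

The two predicates differ only when curr and p have the same prio and curr
is allowed on at least two cpus: the old condition pushes p, the new one
leaves it on the current run queue.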
kernel/sched/rt.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0e4382e..7cadc92 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1334,7 +1334,7 @@ select_task_rq_rt(struct task_struct *p, int cpu, int sd_flag, int flags)
*/
if (curr && unlikely(rt_task(curr)) &&
(curr->nr_cpus_allowed < 2 ||
- curr->prio <= p->prio)) {
+ curr->prio < p->prio)) {
int target = find_lowest_rq(p);

if (target != -1 &&
@@ -1867,7 +1867,7 @@ static void task_woken_rt(struct rq *rq, struct task_struct *p)
p->nr_cpus_allowed > 1 &&
(dl_task(rq->curr) || rt_task(rq->curr)) &&
(rq->curr->nr_cpus_allowed < 2 ||
- rq->curr->prio <= p->prio))
+ rq->curr->prio < p->prio))
push_rt_tasks(rq);
}

--
1.8.3.1

