Re: [PATCH] sched: Reduce the rate of needless idle load balancing

From: Tim Chen
Date: Tue May 20 2014 - 17:39:38 EST


On Tue, 2014-05-20 at 14:09 -0700, Jason Low wrote:

> Hi Tim, Rik
>
> Yes, that makes sense that we want to balance if they are equal. We
> may also consider using "if (time_after_eq(jiffies,
> rq->next_balance)".
>
> Reviewed-by: Jason Low <jason.low2@xxxxxx>

Jason & Rik,

Thanks for reviewing the patch. I've updated the patch below as
suggested.

Tim

---

From: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
Subject: [PATCH v2] sched: Reduce the rate of needless idle load balancing


The current no_hz idle load balancer do load balancing for *all* idle cpus,
even though the time due to load balance for a particular
idle cpu could be still a while in the future. This introduces a much
higher load balancing rate than what is necessary. The patch
changes the behavior by only doing idle load balancing on
behalf of an idle cpu only when it is due for load balancing.

On SGI's systems with over 3000 cores, the cpu responsible for idle balancing
got overwhelmed with idle balancing, and introduces a lot of OS noise
to workloads. This patch fixes the issue.

Acked-by: Russ Anderson <rja@xxxxxxx>
Reviewed-by: Rik van Riel <riel@xxxxxxxxxx>
Reviewed-by: Jason Low <jason.low2@xxxxxx>
Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
---
kernel/sched/fair.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9b4c4f3..b826c3a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6764,12 +6764,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)

rq = cpu_rq(balance_cpu);

- raw_spin_lock_irq(&rq->lock);
- update_rq_clock(rq);
- update_idle_cpu_load(rq);
- raw_spin_unlock_irq(&rq->lock);
-
- rebalance_domains(rq, CPU_IDLE);
+ /*
+ * If time for next balance is due,
+ * do the balance.
+ */
+ if (time_after_eq(jiffies, rq->next_balance)) {
+ raw_spin_lock_irq(&rq->lock);
+ update_rq_clock(rq);
+ update_idle_cpu_load(rq);
+ raw_spin_unlock_irq(&rq->lock);
+ rebalance_domains(rq, CPU_IDLE);
+ }

if (time_after(this_rq->next_balance, rq->next_balance))
this_rq->next_balance = rq->next_balance;
--
1.7.11.7




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/