[PATCH 5/6] sched: Remove irq time from available CPU power

From: Venkatesh Pallipadi
Date: Thu Sep 16 2010 - 21:58:09 EST


This idea was suggested by Peter Zijlstra here:
http://marc.info/?l=linux-kernel&m=127476934517534&w=2

irq time is technically not available to the tasks running on the CPU.
This patch removes irq time from the available CPU power, piggybacking on
sched_rt_avg_update().
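
For context, a rough user-space sketch (not kernel code) of the effect:
the helper and sample numbers below are illustrative assumptions, loosely
mirroring what scale_rt_power() does with the rq->rt_avg that
sched_rt_avg_update() accumulates.

#include <stdio.h>

#define SCHED_LOAD_SCALE 1024UL	/* same value the kernel uses */

/*
 * Loosely mirrors scale_rt_power(): time fed into rq->rt_avg via
 * sched_rt_avg_update() (RT execution, and with this patch irq time
 * as well) is subtracted from the period, and the remaining fraction
 * scales the CPU power seen by the fair-class load balancer.
 */
static unsigned long scaled_power(unsigned long period_ns,
				  unsigned long rt_avg_ns)
{
	if (rt_avg_ns >= period_ns)
		return 1;	/* virtually no power left for fair tasks */

	return SCHED_LOAD_SCALE * (period_ns - rt_avg_ns) / period_ns;
}

int main(void)
{
	/* CPU spending ~75% of a 10ms period in hard+soft irq context. */
	printf("power, 75%% irq: %lu\n", scaled_power(10000000UL, 7500000UL));
	/* CPU with no irq load keeps its full power. */
	printf("power, no irq:   %lu\n", scaled_power(10000000UL, 0UL));
	return 0;
}

With irq time feeding into rt_avg this way, an irq-busy CPU advertises only
a fraction of its nominal power, which is what pushes the extra fair tasks
onto the other CPUs in the test below.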

Tested this by keeping CPU X busy with 75% irq processing (hard+soft) on
a 4-way system and starting 7 cycle soakers on the system. Without this
change there would be 2 tasks on each CPU. With this change there is still
a single task on the irq-busy CPU, and the remaining 7 tasks are spread
among the other 3 CPUs.

Signed-off-by: Venkatesh Pallipadi <venki@xxxxxxxxxx>
---
kernel/sched.c | 14 ++++++++++++++
kernel/sched_fair.c | 3 +++
kernel/sched_features.h | 5 +++++
3 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index f36697b..8ac5389 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2025,6 +2025,18 @@ static u64 unaccount_irq_delta(u64 delta, int cpu, u64 *saved_irq_time)
#define unaccount_irq_delta_rt(delta, cpu, class_rq) \
unaccount_irq_delta(delta, cpu, &(class_rq)->saved_irq_time)

+static void sched_irq_power_update_fair(int cpu, struct cfs_rq *cfs_rq,
+					struct rq *rq)
+{
+	if (!sched_clock_irqtime)
+		return;
+
+	if (likely(rq->total_irq_time > cfs_rq->saved_irq_time)) {
+		sched_rt_avg_update(rq,
+				rq->total_irq_time - cfs_rq->saved_irq_time);
+	}
+}
+
#else

#define update_irq_time(cpu, crq) do { } while (0)
@@ -2042,6 +2054,8 @@ static u64 unaccount_irq_delta_rt(u64 delta_exec, int cpu, struct rt_rq *rt_rq)
return delta_exec;
}

+#define sched_irq_power_update_fair(cpu, crq, rq) do { } while (0)
+
#endif

#include "sched_idletask.c"
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index a64fdaf..937fded 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -526,6 +526,9 @@ static void update_curr(struct cfs_rq *cfs_rq)
if (unlikely(!curr))
return;

+	if (sched_feat(NONIRQ_POWER) && entity_is_task(curr))
+		sched_irq_power_update_fair(cpu, cfs_rq, rq_of(cfs_rq));
+
/*
* Get the amount of time the current task was running
* since the last time we changed load (this cannot
diff --git a/kernel/sched_features.h b/kernel/sched_features.h
index 83c66e8..185f920 100644
--- a/kernel/sched_features.h
+++ b/kernel/sched_features.h
@@ -61,3 +61,8 @@ SCHED_FEAT(ASYM_EFF_LOAD, 1)
* release the lock. Decreases scheduling overhead.
*/
SCHED_FEAT(OWNER_SPIN, 1)
+
+/*
+ * Decrement CPU power based on irq activity
+ */
+SCHED_FEAT(NONIRQ_POWER, 1)
--
1.7.1
