Re: [RFC PATCH 2/2] sched: idle: IRQ based next prediction for idle period

From: Daniel Lezcano
Date: Tue Jan 12 2016 - 09:52:50 EST


On 01/12/2016 03:26 PM, Thomas Gleixner wrote:
On Tue, 12 Jan 2016, Daniel Lezcano wrote:
On 01/12/2016 02:42 PM, Thomas Gleixner wrote:
On Tue, 12 Jan 2016, Daniel Lezcano wrote:
On 01/08/2016 04:43 PM, Thomas Gleixner wrote:
+	/*
+	 * Register the setup/free irq callbacks, so new interrupt or
+	 * freed interrupt will update their tracking.
+	 */
+	ret = register_irq_timings(&irqt_ops);
+	if (ret) {
+		pr_err("Failed to register timings ops\n");
+		return ret;
+	}

So that stuff is installed unconditionally. Is it used unconditionally as
well?

Sorry, I am not sure I understand your question. If the kernel is compiled
with CONFIG_CPU_IDLE_GOV_SCHED=y, this code is enabled and uses the irq
timings. The condition comes from the compilation option.

The question is whether the option also activates that thing or is there
still some /sys/whatever/idlegov magic where you can (de)select it.

Yes, in the next patches of the series I did not send, we can switch to the
cpuidle's governor framework or idle-sched. I will look at how to disable it
when switching to the cpuidle's governors.

You better implement the switching part in the cpuidle core first, i.e. proper
callbacks when a governor is switched in/out. Then make use of this switcheroo
right away. Doing it the other way round is just wrong.

The problem is that this code is not another governor but a 'predictor': the scheduler will use its information to ask cpuidle to enter a specific idle state without going through the governor code, and therefore without going through the governor's callbacks. It sits on top of cpuidle. The scheduler will become the governor.
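
To illustrate what "asking for a specific idle state" could mean, here is a rough user-space sketch of picking a state from a predicted idle duration and a latency constraint. It is not the code from the series: struct idle_state and pick_idle_state() are hypothetical names, and the states are assumed to be sorted from shallowest to deepest with increasing residency and latency.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified idle state description (not struct cpuidle_state). */
struct idle_state {
	int64_t target_residency_us;	/* minimum sleep time for the state to pay off */
	int exit_latency_us;		/* worst-case wakeup latency */
};

/*
 * Pick the deepest state that fits both the predicted idle duration and
 * the latency constraint. Assumes the table is sorted from shallowest to
 * deepest, so the scan can stop at the first state that does not fit.
 * Falls back to state 0 when nothing deeper fits.
 */
static int pick_idle_state(const struct idle_state *states, size_t nr,
			   int64_t duration_us, int latency_us)
{
	int best = 0;
	size_t i;

	for (i = 1; i < nr; i++) {
		if (states[i].target_residency_us > duration_us)
			break;
		if (states[i].exit_latency_us > latency_us)
			break;
		best = (int)i;
	}

	return best;
}
```

With a three-entry table, a long predicted idle period combined with a tight latency constraint yields the middle state, and a short predicted period yields state 0.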

The current, straightforward code does the switch in cpu_idle_loop(), the idle task's function:

[ ... ]

	if (cpu_idle_force_poll || tick_check_broadcast_expired())
		cpu_idle_poll();
	else {
		if (sched_idle_enabled()) {
			int latency = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
			s64 duration = sched_idle_next_wakeup();
			sched_idle(duration, latency);
		} else {
			cpuidle_idle_call();
		}
	}

Due to the complexity of the code, this first step introduces a mechanism to predict the next event and reuses it trivially in the idle task.

Perhaps it would be acceptable to replace cpuidle_idle_call() with a callback and have the switch act at this level?
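
A rough user-space mock of that idea, only to show the shape of it. All names below (idle_call, idle_select(), the mock_* functions and the counters) are hypothetical and stand in for the real cpuidle_idle_call() and sched_idle() paths:

```c
/* Hypothetical hook type: one idle entry routine, selectable at runtime. */
typedef void (*idle_call_fn)(void);

/* Mock counters, for illustration only. */
static int governor_calls;
static int sched_calls;

/* Stand-in for cpuidle_idle_call(): the cpuidle governor picks the state. */
static void mock_cpuidle_idle_call(void)
{
	governor_calls++;
}

/* Stand-in for the sched_idle() path: the scheduler picks the state. */
static void mock_sched_idle_call(void)
{
	sched_calls++;
}

/* Default to the governor path. */
static idle_call_fn idle_call = mock_cpuidle_idle_call;

/* Hypothetical switch point, called when the idle policy changes. */
static void idle_select(int use_sched)
{
	idle_call = use_sched ? mock_sched_idle_call : mock_cpuidle_idle_call;
}

/* One iteration of the idle loop: no if/else on the policy any more. */
static void idle_loop_once(void)
{
	idle_call();
}
```

The idle loop then only calls through the pointer, and enabling or disabling the irq timings tracking could presumably be hooked into the same switch point.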



--
<http://www.linaro.org/> Linaro.org - Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog