Re: [RFC PATCHC 3/3] sched/fair: use the idle state info to choose the idlest cpu

From: Daniel Lezcano
Date: Thu Apr 17 2014 - 12:05:32 EST


On 04/17/2014 05:53 PM, Nicolas Pitre wrote:
On Thu, 17 Apr 2014, Daniel Lezcano wrote:

Ok, refreshed the patchset but before sending it out I would like to discuss
the rationale of the changes and the policy, and change the patchset
accordingly.

What order to choose if the cpus are idle?

Let's assume all cpus are idle on a dual socket quad core.

Also, we can reasonably assume that if the cluster is in low power mode,
the cpus belonging to the same cluster are in the same idle state (setting
aside auto-promotion, which we have no control over).

If the policy you mention above is 'aggressive power saving', we can follow
these rules, in decreasing priority:

1. We want to avoid waking up the entire cluster
=> as the cpus are in the same idle state, by choosing a cpu in a shallow
state, we should have the guarantee we won't wake up a cluster (except if no
cpu in a shallow idle state is found).

This is unclear to me. Obviously, if an entire cluster is down, that
means all the CPUs it contains have been idle for a long time. And
therefore they shouldn't be subject to selection unless there is no
other CPUs available. Is that what you mean?

Yes, this is what I meant. But what I also meant is that we can leave aside, for the moment, the cpu topology and the coupled idle states: with the approach described, as the idle state will be the same for the cpus belonging to the same cluster, we won't select a cluster that is down (except if there are no other CPUs available).
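To make the idea concrete, here is a minimal userspace sketch of "pick the idle cpu in the shallowest idle state". It is not kernel code: struct cpu_idle_info and find_shallowest_idle_cpu() are hypothetical names, and the encoding of the idle state as an integer (-1 for a running cpu, 0 for the shallowest state, larger for deeper) is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical per-cpu snapshot, for illustration only (not a kernel struct). */
struct cpu_idle_info {
	int cpu;        /* cpu number */
	int idle_state; /* assumed encoding: -1 = running, 0 = shallowest, larger = deeper */
};

/*
 * Return the number of the idle cpu in the shallowest idle state,
 * or -1 if no cpu is idle. On a tie, the first cpu found wins.
 */
int find_shallowest_idle_cpu(const struct cpu_idle_info *cpus, size_t n)
{
	int best_cpu = -1;
	int best_state = 0;

	for (size_t i = 0; i < n; i++) {
		if (cpus[i].idle_state < 0)
			continue; /* cpu is running, not a candidate here */

		if (best_cpu < 0 || cpus[i].idle_state < best_state) {
			best_cpu = cpus[i].cpu;
			best_state = cpus[i].idle_state;
		}
	}

	return best_cpu;
}
```

If all the cpus of a powered-down cluster report the same deep state, they only get picked when no cpu in a shallower state exists, which is the fallback case mentioned above.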

2. We want to avoid waking up a cpu which has not reached its target residency
time (will need some work to unify cpuidle idle time and idle task run time)
=> with the target residency and, as a first step, with the idle stamp,
we can determine if the cpu slept enough

Agreed. However, right now, the scheduler does not have any
consideration for that. So this should be done as a separate patch.

Yes, I thought that as a very first step we can rely on the idle stamp, with a big comment, until we unify the times. Or I can first unify the idle times and then take the target residency into account. The point is to comply with Rafael's request to have the 'big picture'.
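As a rough illustration of the "first step" variant, the check could compare the time elapsed since the idle stamp with the target residency of the idle state. This is a userspace sketch under assumptions: struct idle_cpu_sample and cpu_slept_enough() are hypothetical names, and both values are assumed to be expressed in nanoseconds on the same clock.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sample of an idle cpu, for illustration only. */
struct idle_cpu_sample {
	uint64_t idle_stamp_ns;       /* time at which the cpu went idle */
	uint64_t target_residency_ns; /* minimum idle time for the state to pay off */
};

/*
 * Return true if the cpu has been idle for at least the target residency
 * of its idle state. A timestamp older than the idle stamp is treated as
 * "not slept enough" rather than letting the subtraction wrap around.
 */
bool cpu_slept_enough(const struct idle_cpu_sample *s, uint64_t now_ns)
{
	if (now_ns < s->idle_stamp_ns)
		return false;

	return now_ns - s->idle_stamp_ns >= s->target_residency_ns;
}
```

Once the cpuidle idle time and the idle task run time are unified, the idle stamp would presumably be replaced by whatever the unified accounting provides.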

3. We want to avoid waking up a cpu in a deep idle state
=> by looking for the cpu in the shallowest idle state

Obvious.

4. We want to avoid waking up a cpu whose exit latency is longer than
the expected run time of the task (and the time to migrate the task ?)

Sure. That would be a case for using task packing even if the policy is
set to performance rather than powersave whereas task packing is
normally for powersave.

Yes, I agree, task packing also improves performance, and it really makes sense to prevent task migration under some circumstances for better cache efficiency.
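Rule 4 can be sketched as a simple cost comparison. This is only one possible reading: how the migration time enters the comparison is left open above, and here it is assumed to be added to the exit latency as part of the wakeup cost. exit_latency_acceptable() and its parameters are hypothetical names, with all values assumed to be in microseconds.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if waking up the idle cpu is worthwhile for this task,
 * i.e. the wakeup cost does not exceed the expected run time.
 *
 * Assumption: wakeup cost = exit latency + migration cost.
 * Pass 0 as migration_cost_us to ignore migration.
 */
bool exit_latency_acceptable(uint64_t exit_latency_us,
			     uint64_t migration_cost_us,
			     uint64_t expected_runtime_us)
{
	/* Compare without adding, so large inputs cannot overflow. */
	if (exit_latency_us > expected_runtime_us)
		return false;

	return migration_cost_us <= expected_runtime_us - exit_latency_us;
}
```

When the check fails for every idle cpu, keeping the task on an already running cpu (task packing) would be the alternative, whatever the policy is set to.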

Thanks for the comments

-- Daniel

--
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
