[PATCH 0/8] sched: remove cpu_load array

From: Alex Shi
Date: Thu Mar 13 2014 - 01:58:56 EST


In the cpu_load decay usage, we mix long-term and short-term load with the
balance bias, then pick a big or small value from them depending on whether
the cpu is the balance destination or the source. This mix is wrong. The
balance bias should be based on the cost of moving tasks between cpu groups,
not on arbitrary history or instant load. History load may diverge a lot from
the real load, which leads to an incorrect bias.
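
To make the mix concrete, here is a minimal user-space sketch of the
big/small pick described above. It is a model, not the kernel code:
struct rq_model and the *_model helper names are hypothetical, and the
sketch assumes the source side takes the smaller of history and instant
load while the destination side takes the bigger.

```c
/* Hypothetical stand-in for a run queue: one instantaneous load value
 * plus one decayed history value for the chosen load index. */
struct rq_model {
	unsigned long instant_load;	/* load right now */
	unsigned long history_load;	/* decayed cpu_load[idx]-style value */
};

/* Low guess, assumed to be used for the cpu we may pull tasks from. */
static unsigned long source_load_model(const struct rq_model *rq)
{
	return rq->history_load < rq->instant_load ?
	       rq->history_load : rq->instant_load;
}

/* High guess, assumed to be used for the cpu we may move tasks to. */
static unsigned long target_load_model(const struct rq_model *rq)
{
	return rq->history_load > rq->instant_load ?
	       rq->history_load : rq->instant_load;
}
```

With instant_load = 1024 and history_load = 200, the same cpu looks like
200 as a source and 1024 as a target. The size of that gap comes from
whatever the history happens to hold, not from any task moving cost.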

In fact, the cpu_load decay can be replaced by the sched_avg decay, which
also decays load over time. The balance bias part can rely entirely on a fixed
bias, imbalance_pct, which is already used in the newly idle, wake, fork/exec
balancing and numa balancing scenarios.
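
A fixed percentage bias of this kind can be sketched as below. This is an
illustration only: exceeds_imbalance_pct() is a hypothetical helper, and
the value 125 used in the note after the code is just an example setting,
not a claim about any particular sched domain.

```c
#include <stdbool.h>

/*
 * Report an imbalance only when the busier side carries more than
 * imbalance_pct percent of the local load. imbalance_pct is a plain
 * percentage, so 125 means "more than 25% heavier".
 */
static bool exceeds_imbalance_pct(unsigned long busiest_load,
				  unsigned long local_load,
				  unsigned int imbalance_pct)
{
	/* Compare in integer arithmetic to avoid a division. */
	return 100UL * busiest_load > (unsigned long)imbalance_pct * local_load;
}
```

With imbalance_pct = 125 and a local load of 1000, a busiest load of 1200
is treated as balanced and 1300 as imbalanced. The threshold does not
depend on load history at all.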

Currently the only idx values that take effect are busy_idx and idle_idx.
As to busy_idx:
We mix history load decay and bias together. The ridiculous thing is that when
all cpu loads are continuously stable, the long-term and short-term loads are
the same. Then the bias loses its meaning, so any minimal imbalance may cause
unnecessary task moving. To prevent this, we have to apply imbalance_pct
again in find_busiest_group(). But that clearly causes over-bias in normal
times. If there is some burst load in the system, it is even worse.
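
The stable-load case can be illustrated with a small user-space model.
The update rule below is an assumption modelled on a shift-based running
average (new = (old * (2^idx - 1) + cur) / 2^idx, rounded up while load
rises); decay_step(), settle() and bias_gap() are hypothetical names, not
kernel functions.

```c
/* One decay step for history index idx (idx >= 1). */
static unsigned long decay_step(unsigned long old_load,
				unsigned long cur_load, unsigned int idx)
{
	unsigned long scale = 1UL << idx;
	unsigned long new_load = cur_load;

	/* Round up while rising so the history can actually reach cur_load. */
	if (new_load > old_load)
		new_load += scale - 1;

	return (old_load * (scale - 1) + new_load) >> idx;
}

/* Apply the decay for a number of ticks with a constant current load. */
static unsigned long settle(unsigned long history, unsigned long cur_load,
			    unsigned int idx, unsigned int ticks)
{
	while (ticks--)
		history = decay_step(history, cur_load, idx);
	return history;
}

/* Distance between the big and small pick: the effective bias. */
static unsigned long bias_gap(unsigned long history, unsigned long cur_load)
{
	return history > cur_load ? history - cur_load : cur_load - history;
}
```

Starting from a history of 0 with a constant load of 1024 and idx = 2, the
gap is 1024 at first and shrinks to 0 once the history has settled. From
then on the big and small picks are identical and no bias is left.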

As to idle_idx:
I have some concerns about correcting its usage, see
https://lkml.org/lkml/2014/3/12/247. But since we are working on moving cpu
idle into the scheduler, the problem will be reconsidered then. We don't
need to care about it now.

This patchset removes the cpu_load idx decay, since it can be replaced by
the sched_avg feature. It leaves the imbalance_pct bias untouched: only
idle_idx misses it, which is fine for now and will be reconsidered soon.


V5,
1, remove the unify bias patch and the biased_load function. Thanks for
PeterZ's comments!
2, remove get_sd_load_idx() in the 1st patch, as SrikarD suggested.
3, remove the LB_BIAS feature, it is not needed now.

V4,
1, rebase on latest tip/master
2, replace target_load by biased_load, as Morten suggested

V3,
1, correct the wake_affine bias. Thanks for Morten's reminder!
2, replace source_load by weighted_cpuload for a more meaningful function name.

V2,
1, This version does some tuning on the load bias of the target load.
2, Go further and remove the cpu_load in rq.
3, Revert the patch 'Limit sd->*_idx range on sysctl' since it is no longer
needed

Any testing/comments are appreciated.

This patchset is rebased on the latest tip/master.
The git tree for this patchset is at:
git@xxxxxxxxxx:alexshi/power-scheduling.git noload

Thanks
Alex

[PATCH 1/8] sched: shortcut to remove load_idx
[PATCH 2/8] sched: remove rq->cpu_load[load_idx] array
[PATCH 3/8] sched: remove source_load and target_load
[PATCH 4/8] sched: remove LB_BIAS
[PATCH 5/8] sched: clean up cpu_load update
[PATCH 6/8] sched: rewrite update_cpu_load_nohz
[PATCH 7/8] sched: remove rq->cpu_load and rq->nr_load_updates
[PATCH 8/8] sched: rename update_*_cpu_load