Re: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle

From: Jonathan Nieder
Date: Sun Jun 10 2012 - 13:50:31 EST


Hi Doug et al,

Doug Smythies wrote:

> "does 556061b00c9f ("sched/nohz: Fix rq->cpu_load[] calculations",
> 2012-05-11) change anything?"
>
> I back edited those changes into my test environment yesterday. It
> made no difference with respect to this issue. (minimally tested.)
[...]
> By the way, I found and tested 5aaa0b7a2ed5b12692c9ffb5222182bd558d3146
> It is similar (minimally tested).
>
> I am certainly not an expert, and I find the load average area of the
> code extremely difficult to follow and understand. That being said, I
> think the root issue here is the 10 tick grace period. I think that
> cpu idle enter exit transitions can not be ignored during this period,
> and somehow needs to be accumulated towards the next sample time. So far,
> I have been unsuccessful trying to help with a suggested solution. I will
> continue to try.

Another load average related patch is being discussed (not meant
particularly to address the too-low load case, just mentioning it
FYI):

sched: Folding nohz load accounting more accurate

After patch 453494c3d4 (sched: Fix nohz load accounting -- again!), we can fold
the idle load into calc_load_tasks_idle between the last CPU's load calculation
and the call to calc_global_load. However, a problem still exists between the
first CPU's load calculation and the last CPU's load calculation. Every time we
calculate load, calc_load_tasks_idle is added into calc_load_tasks, even if
the idle load is caused by CPUs that have already been calculated. This problem
is also described at the following link:

https://lkml.org/lkml/2012/5/24/419

This bug shows up in our workload: the average number of running processes
is about 15, but the reported load is only about 4.

From [*].
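
To make the scenario in that description easier to follow, here is a toy
userspace model. It is my own simplification and not kernel code: the struct,
the toy_* helper names and the scenario are made up for illustration, and only
calc_load_tasks and calc_load_tasks_idle are names taken from the description
above. It assumes each CPU takes its sample while running one task and then
goes idle before the next CPU takes its sample:

```c
/*
 * Toy userspace model of nohz load folding. Not kernel code: the struct,
 * the toy_* helpers and the scenario are illustrative assumptions.
 */

struct toy_cpu {
	long nr_running;        /* tasks runnable on this CPU right now */
	long calc_load_active;  /* value this CPU last reported */
};

static long calc_load_tasks;       /* global count feeding the load average */
static long calc_load_tasks_idle;  /* deltas parked by CPUs that went idle */

/* Return the change since this CPU's last report and remember the new value. */
static long toy_fold_active(struct toy_cpu *cpu)
{
	long delta = cpu->nr_running - cpu->calc_load_active;

	cpu->calc_load_active = cpu->nr_running;
	return delta;
}

/* Per-CPU sample: report own delta plus whatever idle CPUs have parked. */
static void toy_account_active(struct toy_cpu *cpu)
{
	long delta = toy_fold_active(cpu);

	delta += calc_load_tasks_idle;
	calc_load_tasks_idle = 0;
	calc_load_tasks += delta;
}

/* CPU goes idle: park its (negative) delta for a later sampler to fold. */
static void toy_account_idle(struct toy_cpu *cpu)
{
	cpu->nr_running = 0;
	calc_load_tasks_idle += toy_fold_active(cpu);
}

#define TOY_MAX_CPUS 64

/*
 * One sampling window: every CPU is running one task when it samples,
 * then goes idle before the next CPU samples. Returns the global count
 * that the load average would be computed from.
 */
long toy_sample_window(int ncpus)
{
	struct toy_cpu cpus[TOY_MAX_CPUS] = { { 0, 0 } };
	int i;

	if (ncpus > TOY_MAX_CPUS)
		ncpus = TOY_MAX_CPUS;
	calc_load_tasks = 0;
	calc_load_tasks_idle = 0;

	for (i = 0; i < ncpus; i++) {
		cpus[i].nr_running = 1;
		toy_account_active(&cpus[i]);  /* this CPU reports +1 ... */
		toy_account_idle(&cpus[i]);    /* ... then parks -1 */
	}
	return calc_load_tasks;
}
```

In this model toy_sample_window(4) returns 1: every CPU was busy at its own
sample time, but each later sampler folds in the -1 parked by the CPU before
it, so only the last CPU's task is still counted.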

Hope that helps,
Jonathan

[*] http://thread.gmane.org/gmane.linux.kernel/1310462