Re: High CPU load when machine is idle (related to PROBLEM: Unusually high load average when idle in 2.6.35, 2.6.35.1 and later)

From: tmhikaru
Date: Thu Dec 02 2010 - 05:17:12 EST


On Wed, Dec 01, 2010 at 04:27:38PM -0500, tmhikaru@xxxxxxxxx wrote:
> On Tue, Nov 30, 2010 at 03:59:05PM +0100, Peter Zijlstra wrote:
> > On Tue, 2010-11-30 at 00:01 +0100, Peter Zijlstra wrote:
> > >
> > > Ok, that's good testing.. so it's still not quite the same as NO_HZ=n,
> > > how about this one?
> > >
> > > (it seems to drop down to 0.00 if I wait a few minutes with top -d5)
> >
> > OK, so here's a less crufty patch that gets the same result on my
> > machine, load drops down to 0.00 after a while.
> >
> > It seems a bit slower to reach 0.00, but that could be because I
> > actually changed the load computation for NO_HZ=n as well, I added a
> > rounding factor in calc_load(), we no longer truncate the division.
> >
> > If people want to compare, simply remove the third line from
> > calc_load(): load += 1UL << (FSHIFT - 1), to restore the old behaviour.
>
> For some bizarre reason, this version has a small but noticeable amount of
> jitter and never really seems to hit 0.00 on my machine; it tends to jump
> around at low values between 0.03 and 0.08 on a routine basis:
>
> 16:20:42 up 16:31, 4 users, load average: 0.00, 0.01, 0.05
>
> The jitter has no visible cause; even with no networking, no disk access,
> and no process waking up and demanding attention from the CPU, the load
> goes back up.
>
> Mind this is obviously NOT as horrible as it was originally, but I'd like to
> find out why it's acting so differently.
>
> I'm going to try this variant again with the line you mentioned disabled
> and see if it reacts differently. If the rounding factor is the cause, then
> since you say it was changed for BOTH NO_HZ=y and NO_HZ=n, I get the
> feeling it's not really a problem in the first place, and is likely very
> low load that wasn't being accurately reported before.

Indeed, this seems to be the case:

04:50:14 up 5:45, 5 users, load average: 0.00, 0.00, 0.00

With that single line disabled, the average is no longer jittery, or at
least not noticeably so, and it behaves the way I have come to expect from
past kernels. Since you said this change affects all load calculations, I
have not tested how the patch behaves with the line enabled or disabled
under NO_HZ=n; please let me know if you would like me to test that
condition anyway.

Personally, since it changes the previous behavior of the load calculation,
I'd prefer that the rounding not be done.
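
For what it's worth, the effect of that one line can be played with in
userspace. The following is only a rough sketch, not the patch itself: it
assumes the fixed-point constants from include/linux/sched.h (FSHIFT,
FIXED_1, EXP_1, EXP_5, EXP_15) and the FIXED_1/200 offset used when
/proc/loadavg is printed. The `rounding` flag and the helpers idle_floor()
and shown_hundredths() are made-up names for illustration only.

```c
/* Fixed-point constants, assumed to match include/linux/sched.h. */
#define FSHIFT   11                /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)   /* 1.0 in fixed point */
#define EXP_1    1884UL            /* 1-minute decay factor */
#define EXP_5    2014UL            /* 5-minute decay factor */
#define EXP_15   2037UL            /* 15-minute decay factor */

/* One load-average update; `rounding` toggles the line under discussion. */
static unsigned long calc_load(unsigned long load, unsigned long exp,
                               unsigned long active, int rounding)
{
    load *= exp;
    load += active * (FIXED_1 - exp);
    if (rounding)
        load += 1UL << (FSHIFT - 1);   /* round instead of truncating */
    return load >> FSHIFT;
}

/* Start at 1.00 and decay with zero active tasks until the value stops
 * changing; returns the raw fixed-point value where it settles. */
static unsigned long idle_floor(unsigned long exp, int rounding)
{
    unsigned long load = FIXED_1, next;

    while ((next = calc_load(load, exp, 0, rounding)) != load)
        load = next;
    return load;
}

/* Convert a raw value to hundredths the way /proc/loadavg is assumed to:
 * add FIXED_1/200, then split into integer and fractional parts. */
static unsigned long shown_hundredths(unsigned long load)
{
    load += FIXED_1 / 200;
    return (load >> FSHIFT) * 100 +
           (((load & (FIXED_1 - 1)) * 100) >> FSHIFT);
}
```

With rounding on, idle_floor() settles at raw values 6, 30 and 93 for EXP_1,
EXP_5 and EXP_15, which shown_hundredths() turns into 0.00, 0.01 and 0.05,
the same figures as in the uptime output quoted above. With rounding off,
all three decay to 0. That would be consistent with the rounding keeping the
averages from ever fully reaching zero, though it does not by itself explain
the jumps up to 0.03-0.08.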

Tim McGrath
