Re: [stable] 2.6.23 regression: top displaying 9999% CPU usage

From: Balbir Singh
Date: Tue Oct 16 2007 - 05:31:53 EST


Christian Borntraeger wrote:
> Chuck, Balbir,
>
> we still have a problem with stime occosionally going backwards. I stated
> below that I think this is not fixable with the current utime/stime split
> algorithm.
>

Hi,

I missed seeing this problem before, sorry about that. Thanks for the
link below, I now understand the problem.

> Balbir, you wrote this code, Chuck you tried to fix it. Any ideas how to
> fix this properly? The only idea I have requires that we save the old value
> of utime and stime and therefore requires additional locking.
>

I am trying to think out loud as to what the root cause of the problem
might be. In one of the discussion threads, I saw utime going backwards,
which seemed very odd, I suspect that those are rounding errors.

I don't understand your explanation below

Initially utime = 9, stime = 0, sum_exec_runtime = S1

Later

utime = 9, stime = 1, sum_exec_runtime = S2

We can be sure that S >= (utime + stime)

If S2 = S1 + delta, then as per our calculation

Initially

utime_proc = (utime * (S1))/(utime + stime)
= nsec_to_clock_t(9 * S1 / 9)

later

utime_proc = nsec_to_clock_t(9 * S2/10)

Given that S >= (utime + stime), we should be fine.
The only problem I see is with rounding, like I mentioned before at two
places

1. Rounding at do_div() in task_utime()
2. Rounding in conversion from clock_t_to_cputime()

I have tried and not had any success reproducing the problem, could you
please help me with some pointers/steps to reproduce the problem?

--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/