proc/stat: idle goes backward

From: Oleg Nesterov
Date: Fri Aug 16 2013 - 10:53:01 EST


Hello.

Our customer reports that the "idle" field in /proc/stat is not
monotonic. So far this is all I know; I do not know how to reproduce it.

But when I look at this code, this looks really possible even
ignoring drivers/cpuidle/ which plays with update_ts_time_stats().

So, get_cpu_idle_time_us(last_update_time => NULL) does:

	if (ts->idle_active && !nr_iowait_cpu(cpu)) {
		ktime_t delta = ktime_sub(now, ts->idle_entrytime);

		idle = ktime_add(ts->idle_sleeptime, delta);
	} else {
		idle = ts->idle_sleeptime;
	}


Suppose that ts->idle_active is true. By the time we calculate

	idle = ktime_add(ts->idle_sleeptime, delta);

this cpu can already be non-idle and ->idle_sleeptime can already have
been updated by tick_nohz_stop_idle(). In that case the same idle period
is counted twice and we return the wrong (too large) value.

If user-space reads /proc/stat again after that, "idle" can obviously
go back.
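
To make the interleaving concrete, here is a small user-space sketch that
replays it deterministically. This is only a model, not kernel code: the
struct, the function names and the microsecond values are all made up for
illustration, and the reader is split into two steps so that the "idle
exit" can be placed between them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the tick_sched fields involved (times in us). */
struct ts_model {
	int      idle_active;
	uint64_t idle_entrytime;
	uint64_t idle_sleeptime;
};

/* Models the idle-exit path: fold the current idle period into the sum. */
static void model_stop_idle(struct ts_model *ts, uint64_t now)
{
	assert(now >= ts->idle_entrytime);
	ts->idle_sleeptime += now - ts->idle_entrytime;
	ts->idle_entrytime = now;
	ts->idle_active = 0;
}

/* Reader, step 1: check idle_active and compute delta. */
static uint64_t reader_sample_delta(const struct ts_model *ts, uint64_t now)
{
	return ts->idle_active ? now - ts->idle_entrytime : 0;
}

/* Reader, step 2: add delta to whatever idle_sleeptime is by now. */
static uint64_t reader_finish(const struct ts_model *ts, uint64_t delta)
{
	return ts->idle_sleeptime + delta;
}

/* Replays the race; returns 1 if the second read is below the first. */
static int idle_goes_backward(uint64_t *first, uint64_t *second)
{
	struct ts_model ts = { 1, 1000, 5000 };
	uint64_t delta;

	delta = reader_sample_delta(&ts, 1400);	/* sees idle, delta = 400 */
	model_stop_idle(&ts, 1400);		/* cpu leaves idle in between */
	*first = reader_finish(&ts, delta);	/* 400 us counted twice */

	/* Second read: cpu is busy, so only idle_sleeptime is returned. */
	*second = reader_finish(&ts, reader_sample_delta(&ts, 1500));

	return *second < *first;
}
```

In this model the first read reports 5800 and the second 5400, because
the 400 us of the last idle period were added both by the idle-exit path
and by the reader.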

get_cpu_iowait_time_us() has the same problem.

Plus nr_iowait_cpu() can change in between even if the cpu stays idle,
because io_schedule() can return on another CPU.
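
A sketch of what this can look like, again as a user-space model with
invented names and values. It assumes the cpu stays idle the whole time
and that the current idle period is attributed to "iowait" or to "idle"
purely by the nr_iowait value seen at read time, as in the code above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an idle cpu's accounting state (times in us). */
struct idle_model {
	uint64_t idle_entrytime;
	uint64_t idle_sleeptime;
	uint64_t iowait_sleeptime;
	unsigned nr_iowait;
};

/* "idle" as a reader would see it: current period counts if no iowait. */
static uint64_t model_idle_us(const struct idle_model *m, uint64_t now)
{
	assert(now >= m->idle_entrytime);
	if (m->nr_iowait == 0)
		return m->idle_sleeptime + (now - m->idle_entrytime);
	return m->idle_sleeptime;
}

/* "iowait" as a reader would see it: current period counts if iowait. */
static uint64_t model_iowait_us(const struct idle_model *m, uint64_t now)
{
	assert(now >= m->idle_entrytime);
	if (m->nr_iowait > 0)
		return m->iowait_sleeptime + (now - m->idle_entrytime);
	return m->iowait_sleeptime;
}

/* nr_iowait drops 1 -> 0 between two reads while the cpu stays idle. */
static int iowait_goes_backward(uint64_t *before, uint64_t *after)
{
	struct idle_model m = {
		.idle_entrytime   = 1000,
		.idle_sleeptime   = 5000,
		.iowait_sleeptime = 200,
		.nr_iowait        = 1,
	};

	*before = model_iowait_us(&m, 1400);	/* 200 + 400 */
	m.nr_iowait = 0;			/* waiter woke up elsewhere */
	*after = model_iowait_us(&m, 1500);	/* back to 200 */

	return *after < *before;
}
```

Here the in-progress idle period silently moves from one counter to the
other between the two reads, so the counter that loses it goes back.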

Questions:

- Any other reason why it can be non-monotonic?

- Should we fix this or should we document that userspace
should handle this itself?

IOW, is this a bug or not?
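
For reference, "handle this itself" could be as simple as the following
on the userspace side. This is just one possible sketch with made-up
names, not something any existing tool is known to do: remember the
largest value seen so far and never report less than that.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-field state kept by a /proc/stat consumer. */
struct mono_counter {
	uint64_t last;	/* largest sample seen so far */
};

/* Feed in a raw sample; returns a value that never decreases. */
static uint64_t mono_update(struct mono_counter *c, uint64_t sample)
{
	assert(c);
	if (sample > c->last)
		c->last = sample;
	return c->last;
}
```

This hides the backward step from the consumer but not the error itself:
the clamped value simply stays flat until the kernel counter catches up.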

Oleg.
