[PATCH] nohz: do not update idle/iowait counters from get_cpu_{idle,iowait}_time_us if not asked

From: Michal Hocko
Date: Mon Aug 22 2011 - 05:56:22 EST


On Fri 05-08-11 16:23:49, Michal Hocko wrote:
> On Thu 04-08-11 17:20:32, Michal Hocko wrote:
> > show_stat handler of the /proc/stat file relies on kstat_cpu(cpu)
> > statistics when printing information about idle and iowait times.
> > This is OK if we are not using tickless kernel (CONFIG_NO_HZ) because
> > counters are updated periodically.
> > With NO_HZ things got more tricky because we are not doing idle/iowait
> > accounting while we are tickless so the value might get outdated.
> > Users of /proc/stat will notice that by unchanged idle/iowait values
> > which is then interpreted as 0% idle/iowait time. From the user space
> > POV this is an unexpected behavior and a change of the interface.
> >
> > Let's fix this by using get_cpu_{idle,iowait}_time_us which accounts the
> > total idle/iowait time since boot and it doesn't rely on sampling or any
> > other periodic activity. Fall back to the previous behavior if NO_HZ is
> > disabled or not configured.
>
> I forgot to mention that this might be racy because we are updating
> those per-cpu values without having preemption disabled or any other
> locking which would be necessary as governors iterate over all CPUs.
> Governors do not have to care about that because they are singletons.
> Introducing locks doesn't look like an option but I was thinking
> about adding __get_cpu_{idle,iowait}_time_us which wouldn't call
> update_ts_timestat and calculate the result instead.
> I can add a patch which does that but I wanted to hear about general
> approach first.

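I guess we do not need a separate __get_cpu_{idle,iowait}_time_us
variant. We can rather reuse the last_update_time parameter to determine
whether to update the counters or not: update them only if the caller
passes a non-NULL pointer, and just calculate the result otherwise.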
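
To illustrate the idea, here is a minimal user-space sketch (not the
patch itself). struct idle_stats, update_stats and get_idle_time_us are
hypothetical stand-ins for the per-cpu tick state and the kernel
helpers, and the current time is passed in explicitly instead of being
read from a clock:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-cpu tick state, not the kernel struct. */
struct idle_stats {
	int      idle_active;    /* non-zero while the CPU is idle */
	uint64_t idle_entry_us;  /* start of the current idle period */
	uint64_t idle_sleep_us;  /* idle time accumulated so far */
};

/* Length of the idle period that is still running, 0 if the CPU is busy. */
static uint64_t running_idle_us(const struct idle_stats *s, uint64_t now_us)
{
	if (!s->idle_active)
		return 0;
	assert(now_us >= s->idle_entry_us);
	return now_us - s->idle_entry_us;
}

/* Writer path: fold the running idle period into the counter. */
static void update_stats(struct idle_stats *s, uint64_t now_us,
			 uint64_t *last_update_us)
{
	s->idle_sleep_us += running_idle_us(s, now_us);
	if (s->idle_active)
		s->idle_entry_us = now_us;
	if (last_update_us)
		*last_update_us = now_us;
}

/*
 * A non-NULL last_update_us means the caller asked for an update
 * (governors in this thread). A NULL pointer means "just tell me the
 * value": the result is computed and the counters are left untouched.
 */
static uint64_t get_idle_time_us(struct idle_stats *s, uint64_t now_us,
				 uint64_t *last_update_us)
{
	if (last_update_us) {
		update_stats(s, now_us, last_update_us);
		return s->idle_sleep_us;
	}
	return s->idle_sleep_us + running_idle_us(s, now_us);
}
```

Both paths return the same value for the same point in time, but only
the first one writes to the shared state, so a reader like the
/proc/stat handler would pass NULL and never modify the counters.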

AFAICS we can still race with an IRQ in the update path (governors):
irq_enter
  tick_check_idle
    tick_check_nohz
      tick_nohz_stop_idle
but this is a separate issue IMO.
---