Re: [RFC] nohz: no update idle entry time on get_cpu_idle/iowait_time_us()

From: Yunhong Jiang
Date: Tue Oct 20 2015 - 19:28:18 EST


On Sun, Oct 11, 2015 at 08:38:17PM +0200, Thomas Gleixner wrote:
> On Mon, 28 Sep 2015, Yunhong Jiang wrote:
> > Currently the get_cpu_idle/iowait_time_us() updates the idle_entrytime.
> > When it's invoked from another CPU and the target CPU has been on idle
> > already, it will update the idle_entrytime to now, which is incorrect.
> >
> > However, get_cpu_idle/iowait_time_us() is not guaranteed to be called
> > on the target CPU. For example, get_cpu_idle_time_us() seems to be
> > invoked remotely in drivers/cpufreq/cpufreq_governor.c through
> > get_cpu_idle_time().
> >
> > There is a check that the update happens only if a valid last_update_time
> > parameter is passed. IMHO, this is more of a hack because there is no
> > guarantee that it's invoked on the target CPU when last_update_time is valid.
>
> Looking at the call sites, this last_update_time parameter is
> silly. Why is the calling code not taking the timestamp? There is
> hardly a requirement that this needs to be the same timestamp as the
> one which is used to calculate idle time. Those cpufreq calculations
> are speculative anyway.
>
> So we better get rid of that parameter completely.

Sure, will change the patch accordingly.

>
> > In fact, we don't need to update the idle stats from
> > get_cpu_idle/iowait_time_us(). Now the policy is: we record the
> > entrytime in tick_nohz_start_idle() and update the stats
> > in tick_nohz_stop_idle(). We calculate the stats in other situations.
> >
> > Please notice:
> > 1) There is currently a bug: tick_nohz_stop_idle() calls
> > update_ts_time_stats(), which updates the idle_entrytime, and that is
> > sure to be wrong. Removing the idle_entrytime update also resolves the bug.
>
> Care to explain the actual bug and what wreckage it causes? If it's a
> real bug then removing "ts->idle_entrytime = now" needs to be a
> separate patch. AFAICT, it's a cosmetic issue.

Thanks for the review, and sorry for the slow response.
Yes, it's an exaggeration to call it a bug. I just thought that updating
idle_entrytime when exiting the idle state looked wrong. But since the
updated idle_entrytime is not used in any meaningful way, it has no impact.

>
> > 2) There is a small window in tick_nohz_stop_idle(), between
> > update_ts_time_stats() and clearing ts->idle_active, in which
> > get_cpu_idle/iowait_time_us(), when invoked remotely, may double
> > count the last idle period. This window exists w/o this change and this
> > change does not fix it.
>
> Calling any of those functions from a remote cpu is broken to begin
> with, especially on 32bit machines. And that does not change with your
> patch at all.

Yes, as stated, this change does not fix it.
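
To make the policy and the window concrete, here is a minimal user-space
sketch. It is only a model under my own assumptions, not the kernel code:
the struct and the model_*() helpers are hypothetical names, only the
field names follow the thread, and the entry time is deliberately left
untouched on idle exit. The getter only reads, so a remote caller never
rewrites idle_entrytime, but a reader that runs between the stats update
and the clearing of idle_active still counts the last idle period twice.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-CPU idle stats (times in microseconds). */
struct idle_stats {
	bool     idle_active;     /* CPU is currently idle */
	uint64_t idle_entrytime;  /* when the current idle period began */
	uint64_t idle_sleeptime;  /* accumulated idle time of finished periods */
};

/* Idle entry: record the entry time and mark the CPU idle. */
static void model_start_idle(struct idle_stats *ts, uint64_t now)
{
	ts->idle_entrytime = now;
	ts->idle_active = true;
}

/* Idle exit, step 1: fold the finished period into the total. */
static void model_update_stats(struct idle_stats *ts, uint64_t now)
{
	ts->idle_sleeptime += now - ts->idle_entrytime;
}

/* Idle exit, step 2: mark the CPU as no longer idle. */
static void model_clear_active(struct idle_stats *ts)
{
	ts->idle_active = false;
}

/* Read-only getter: computes the total without modifying any field. */
static uint64_t model_get_idle_time_us(const struct idle_stats *ts,
				       uint64_t now)
{
	uint64_t total = ts->idle_sleeptime;

	if (ts->idle_active)
		total += now - ts->idle_entrytime;
	return total;
}
```

In this model a reader that runs between model_update_stats() and
model_clear_active() adds the same period twice, which is the window
described in 2) above.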

Arjan also said that get_cpu_idle/iowait_time_us() should not be called from
remote CPUs, but as stated, drivers/cpufreq/cpufreq_governor.c seems to
invoke them remotely. I don't know what the impact is. Rafael or Viresh, can
you give me some hints?

>
> What we really need here is protecting the idle stats fields with a
> raw_spinlock.

If these functions are never called from a remote CPU, then we don't need
the spinlock protection, right?

Thanks
--jyh
>
> Thanks,
>
> tglx