Re: [patch 4/7] sched: convert rq->avg_idle to rq->avg_event

From: Peter Zijlstra
Date: Wed Nov 23 2011 - 07:27:35 EST


On Wed, 2011-11-23 at 13:09 +0100, Mike Galbraith wrote:
> On Wed, 2011-11-23 at 12:55 +0100, Peter Zijlstra wrote:
> > On Tue, 2011-11-22 at 15:22 +0100, Mike Galbraith wrote:
> > > We update rq->clock only at points of interest to the scheduler.
> > > Using this distance has the same effect as measuring idle time
> > > for idle_balance() throttling, and allows other uses as well.
> >
> > I'm not sure I follow, suppose we're happily context switching away, how
> > is the avg distance between context switches related to idle time?
>
> Average idle time can't be larger.

True :-)

But it can be _much MUCH_ smaller. So the value is a fair upper limit on
the idle time, but has no relation to the actual idle duration.
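
To make that gap concrete, here is a minimal userspace sketch, not kernel
code. It assumes a simplified model in which every interval between two
clock updates carries an idle portion no longer than the interval itself.
struct interval and feed() are hypothetical names made up for this
illustration; update_avg() is only loosely modelled on the scheduler helper
of the same name (1/8 weight for each new sample) and divides instead of
shifting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One interval between two consecutive clock updates (hypothetical model). */
struct interval {
	uint64_t len_ns;	/* distance between the two updates */
	uint64_t idle_ns;	/* part of that distance spent idle, <= len_ns */
};

/* Running average that gives each new sample a weight of 1/8. */
static void update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg += diff / 8;
}

/*
 * Feed the same intervals into both averages: one tracks the distance
 * between events, the other tracks only the idle portion of each interval.
 */
static void feed(const struct interval *iv, size_t n,
		 uint64_t *avg_event, uint64_t *avg_idle)
{
	for (size_t i = 0; i < n; i++) {
		/* Model assumption: idle time never exceeds the interval. */
		assert(iv[i].idle_ns <= iv[i].len_ns);
		update_avg(avg_event, iv[i].len_ns);
		update_avg(avg_idle, iv[i].idle_ns);
	}
}
```

Feeding this a CPU that context switches every 1ms and never idles
(len_ns = 1000000, idle_ns = 0 for every interval) drives avg_event towards
1ms while avg_idle stays at 0. The bound holds, but the event average says
nothing about how long the CPU actually idles.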

Now this value seems to be used in patch 5 to throttle
select_idle_sibling(), which is again something unrelated to actual idle
duration, but also unrelated to the avg update_rq_clock() interval.

In patch 6 we use this value to guesstimate if we should enter nohz,
since it's a wild overestimation of the actual idle duration it'll be
less effective and might not harm much.

Also, patch 6's use of sched_migration_cost to reflect the nohz
enter/exit cost is somewhat iffy, but that's another issue.


Now I'm not saying this all isn't worth it, just saying my brain is
having difficulty seeing how it all makes sense :-)

Anyway, I picked up patches 1, 2, 3 and 7 and will give the remaining
patches another stare a bit later.