Re: [PATCH v6 04/16] sched/core: uclamp: Add CPU's clamp buckets refcounting

From: Peter Zijlstra
Date: Tue Jan 22 2019 - 04:45:16 EST


On Mon, Jan 21, 2019 at 04:33:38PM +0000, Patrick Bellasi wrote:
> On 21-Jan 17:12, Peter Zijlstra wrote:
> > On Mon, Jan 21, 2019 at 03:23:11PM +0000, Patrick Bellasi wrote:

> > > and keep all
> > > the buckets in use at the beginning of a cache line.
> >
> > That; is that the rationale for all this? Note that per the defaults
> > everything is in a single line already.
>
> Yes, that's because of the loop in:
>
> dequeue_task()
>   uclamp_cpu_dec()
>     uclamp_cpu_dec_id()
>       uclamp_cpu_update()
>
> where buckets sometimes need to be scanned to find a new max.
>
> Consider also that, with mapping, we can more easily increase the
> buckets count to 20 in order to have a finer clamping granularity if
> needed without worrying too much about the performance impact,
> especially when we use only a few different clamp values anyway.
>
> So, I agree that mapping adds (code) complexity but it can also save
> a few cycles in the fast path... do you think it's not worth the added
> complexity?

Then maybe split this out in a separate patch? Do the trivial linear
bucket thing first and then do this smarty pants thing on top.
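
To be concrete about what I mean by the trivial linear thing, here is a
rough userspace sketch, not the patch's code. The struct and helper names,
the bucket count of 5 and the use of the bucket floor as the clamp value
are all assumptions made for illustration: the clamp value indexes the
bucket array directly, and the dequeue side scans for the highest bucket
that still has tasks.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE	1024
#define UCLAMP_BUCKETS		5	/* assumed count, for illustration only */
#define UCLAMP_BUCKET_DELTA	(SCHED_CAPACITY_SCALE / UCLAMP_BUCKETS)

/* Hypothetical per-rq bucket: number of runnable tasks clamped into it. */
struct uclamp_bucket {
	unsigned int tasks;
};

/* Linear mapping: no lookup table, the clamp value picks the bucket. */
static unsigned int bucket_id(unsigned int clamp_value)
{
	unsigned int id = clamp_value / UCLAMP_BUCKET_DELTA;

	return id < UCLAMP_BUCKETS ? id : UCLAMP_BUCKETS - 1;
}

/* Enqueue side: refcount the bucket this clamp value falls into. */
static void rq_inc(struct uclamp_bucket *b, unsigned int clamp_value)
{
	b[bucket_id(clamp_value)].tasks++;
}

/* Dequeue side: drop the refcount taken by rq_inc(). */
static void rq_dec(struct uclamp_bucket *b, unsigned int clamp_value)
{
	unsigned int id = bucket_id(clamp_value);

	assert(b[id].tasks > 0);
	b[id].tasks--;
}

/* Scan from the top for the highest active bucket; 0 if none is active. */
static unsigned int rq_max(const struct uclamp_bucket *b)
{
	for (int id = UCLAMP_BUCKETS - 1; id >= 0; id--) {
		if (b[id].tasks)
			return id * UCLAMP_BUCKET_DELTA;
	}
	return 0;
}
```

The scan in rq_max() is bounded by the bucket count, so as long as the
array fits in one line the loop stays cheap without any mapping.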

One problem with the scheme is that it doesn't defrag; so if you get a
peak usage, you can still end up with only two active buckets in
different lines.
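
A toy model of the no-defrag problem, under loud assumptions: a first-free
slot allocator standing in for the mapping, 20 slots, and 8 slots per
cache line (64-byte line, 8-byte bucket). None of these names or sizes
come from the patch; they only illustrate how slots that are never moved
can leave the survivors of a peak far apart.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SLOTS	20
#define SLOTS_PER_LINE	8	/* assumption: 64-byte line / 8-byte bucket */

/* Hypothetical mapping state: which slots currently back a clamp value. */
struct slot_map {
	bool used[NR_SLOTS];
};

/* First-free allocation: new clamp values take the lowest unused slot. */
static int slot_alloc(struct slot_map *m)
{
	for (int i = 0; i < NR_SLOTS; i++) {
		if (!m->used[i]) {
			m->used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Release a slot; the remaining slots stay where they are (no defrag). */
static void slot_free(struct slot_map *m, int i)
{
	assert(i >= 0 && i < NR_SLOTS && m->used[i]);
	m->used[i] = false;
}

/* Count how many cache lines hold at least one active slot. */
static int lines_touched(const struct slot_map *m)
{
	int lines = 0;

	for (int base = 0; base < NR_SLOTS; base += SLOTS_PER_LINE) {
		for (int i = base; i < base + SLOTS_PER_LINE && i < NR_SLOTS; i++) {
			if (m->used[i]) {
				lines++;
				break;
			}
		}
	}
	return lines;
}
```

Allocate all 20 slots, then free everything except slots 0 and 19: two
slots remain active, yet lines_touched() reports two lines where a
compacted layout would need one.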

Also; if it is its own patch, you get a much better view of the
additional complexity and a chance to justify it ;-)

Also; would it make sense to do s/cpu/rq/ on much of this? All this
uclamp_cpu_*() stuff really is per rq and takes rq arguments, so why
does it have cpu in the name... no strong feelings, just noticed it and
thought it a tad inconsistent.