Re: [PATCH v6 07/16] sched/core: uclamp: Add system default clamps

From: Peter Zijlstra
Date: Tue Jan 22 2019 - 10:13:32 EST


On Tue, Jan 22, 2019 at 02:43:29PM +0000, Patrick Bellasi wrote:
> On 22-Jan 14:56, Peter Zijlstra wrote:
> > On Tue, Jan 15, 2019 at 10:15:04AM +0000, Patrick Bellasi wrote:
> >
> > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > index 84294925d006..c8f391d1cdc5 100644
> > > --- a/include/linux/sched.h
> > > +++ b/include/linux/sched.h
> > > @@ -625,6 +625,11 @@ struct uclamp_se {
> > >  	unsigned int bucket_id : bits_per(UCLAMP_BUCKETS);
> > >  	unsigned int mapped : 1;
> > >  	unsigned int active : 1;
> > > +	/* Clamp bucket and value actually used by a RUNNABLE task */
> > > +	struct {
> > > +		unsigned int value : bits_per(SCHED_CAPACITY_SCALE);
> > > +		unsigned int bucket_id : bits_per(UCLAMP_BUCKETS);
> > > +	} effective;
> >
> > I am confuzled by this thing.. so uclamp_se already has a value,bucket,
> > which per the prior code is the effective one.
> >
> > Now; I think I see why you want another value; you need the second to
> > store the original value for when the system limits change and we must
> > re-evaluate.
>
> Yes, that's one reason, the other one being to properly support
> CGroups when we add them in the following patches.
>
> Effective will always track the value/bucket in which the task has
> been refcounted at enqueue time and it depends on the aggregated
> value.

> > Should you not update all tasks?
>
> That's true, but that's also an expensive operation, that's why now
> I'm doing only lazy updates at next enqueue time.

Aaah, so you refcount on the original value, which allows you to skip
fixing up all tasks. I missed that bit.


> Do you think that could be acceptable?

Think so, it's a sysctl poke, 'nobody' ever does that.
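
For readers following the thread, here is a minimal userspace sketch of the
lazy scheme discussed above. It is not the patch's code: every name in it
(clamp_se, task_enqueue, task_dequeue, sysctl_set_max, bucket_of, the bucket
count and the bucket width) is hypothetical, and it models only a single
"max" clamp. It assumes the reading that "effective" records the value and
bucket a task was refcounted in at enqueue time, so a change to the system
default needs no walk over all tasks and is picked up at the next enqueue.

```c
#include <assert.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
#define CAPACITY_SCALE 1024u	/* plays the role of SCHED_CAPACITY_SCALE */
#define NUM_BUCKETS    5u	/* plays the role of UCLAMP_BUCKETS */
#define BUCKET_DELTA   ((CAPACITY_SCALE + NUM_BUCKETS - 1) / NUM_BUCKETS)

struct clamp_se {
	unsigned int value;		/* value requested for the task */
	unsigned int active;		/* currently refcounted in a bucket? */
	struct {
		unsigned int value;	/* value in use since the last enqueue */
		unsigned int bucket_id;	/* bucket holding the task's refcount */
	} effective;
};

static unsigned int sysctl_clamp_max = CAPACITY_SCALE;	/* system default */
static unsigned int bucket_refcount[NUM_BUCKETS];

/* Map a clamp value to a bucket index (assumed linear bucketing). */
static unsigned int bucket_of(unsigned int value)
{
	unsigned int id = value / BUCKET_DELTA;

	return id < NUM_BUCKETS ? id : NUM_BUCKETS - 1;
}

/* Enqueue: aggregate request and system default, then take a refcount. */
static void task_enqueue(struct clamp_se *se)
{
	unsigned int v = se->value < sysctl_clamp_max ? se->value
						      : sysctl_clamp_max;

	se->effective.value = v;
	se->effective.bucket_id = bucket_of(v);
	bucket_refcount[se->effective.bucket_id]++;
	se->active = 1;
}

/* Dequeue: release the bucket recorded at enqueue, whatever changed since. */
static void task_dequeue(struct clamp_se *se)
{
	assert(se->active);
	bucket_refcount[se->effective.bucket_id]--;
	se->active = 0;
}

/* The "sysctl poke": only the default changes, no task is touched. */
static void sysctl_set_max(unsigned int v)
{
	sysctl_clamp_max = v;
}
```

In this sketch, a task that requests 800 and is enqueued keeps an effective
value of 800 after sysctl_set_max(300); its dequeue still releases the bucket
it was counted in, and only its next enqueue lands it in the bucket for 300.
The requested value is never overwritten, which is what allows the
re-evaluation.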