Re: [PATCH v5 1/2] sched, cgroup: Restore meaning to hierarchical_quota

From: Phil Auld
Date: Tue Jul 11 2023 - 09:14:42 EST


On Mon, Jul 10, 2023 at 05:00:11PM -0700 Benjamin Segall wrote:
> Phil Auld <pauld@xxxxxxxxxx> writes:
>
> > In cgroupv2 cfs_b->hierarchical_quota is set to -1 for all task
> > groups due to the previous fix simply taking the min. It should
> > reflect a limit imposed at that level or by an ancestor. Even
> > though cgroupv2 does not require child quota to be less than or
> > equal to that of its ancestors the task group will still be
> > constrained by such a quota so this should be shown here. Cgroupv1
> > continues to set this correctly.
> >
> > In both cases, add initialization when a new task group is created
> > based on the current parent's value (or RUNTIME_INF in the case of
> > root_task_group). Otherwise, the field is wrong until a quota is
> > changed after creation and __cfs_schedulable() is called.
>
> Reviewed-by: Ben Segall <bsegall@xxxxxxxxxx>
>

Thanks, I'll hold on to this for the next version where I update the comment
if that's okay. I was just going to send that but based on your comment
on patch 2 may just do a v6 of the whole thing.


Cheers,
Phil

> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index a68d1276bab0..1b214e10c25d 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -11038,11 +11038,14 @@ static int tg_cfs_schedulable_down(struct task_group *tg, void *data)
> >
> > /*
> > * Ensure max(child_quota) <= parent_quota. On cgroup2,
> > - * always take the min. On cgroup1, only inherit when no
> > - * limit is set:
> > + * always take the non-RUNTIME_INF min. On cgroup1, only
> > + * inherit when no limit is set:
> > */
> > if (cgroup_subsys_on_dfl(cpu_cgrp_subsys)) {
> > - quota = min(quota, parent_quota);
> > + if (quota == RUNTIME_INF)
> > + quota = parent_quota;
> > + else if (parent_quota != RUNTIME_INF)
> > + quota = min(quota, parent_quota);
> > } else {
> > if (quota == RUNTIME_INF)
> > quota = parent_quota;
>
> I suppose you could also set RUNTIME_INF to be a positive value or
> better yet just compare at unsigned, but it's not like config needs to
> be fast, so no need to mess with that.
>
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 373ff5f55884..92381f9ecf37 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6005,13 +6005,14 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
> > return idle ? HRTIMER_NORESTART : HRTIMER_RESTART;
> > }
> >
> > -void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
> > +void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b, struct cfs_bandwidth *parent)
> > {
> > raw_spin_lock_init(&cfs_b->lock);
> > cfs_b->runtime = 0;
> > cfs_b->quota = RUNTIME_INF;
> > cfs_b->period = ns_to_ktime(default_cfs_period());
> > cfs_b->burst = 0;
> > + cfs_b->hierarchical_quota = ((parent) ? parent->hierarchical_quota : RUNTIME_INF);
>
> Minor style nit: don't need any of these parens here.
>

--