Re: [PATCH v3 1/2] sched, cgroup: Restore meaning to hierarchical_quota

From: Tejun Heo
Date: Mon Jul 17 2023 - 14:28:46 EST


On Fri, Jul 14, 2023 at 08:57:46AM -0400, Phil Auld wrote:
> In cgroupv2 cfs_b->hierarchical_quota is set to -1 for all task
> groups due to the previous fix simply taking the min. It should
> reflect a limit imposed at that level or by an ancestor. Even
> though cgroupv2 does not require child quota to be less than or
> equal to that of its ancestors the task group will still be
> constrained by such a quota so this should be shown here. Cgroupv1
> continues to set this correctly.
>
> In both cases, add initialization when a new task group is created
> based on the current parent's value (or RUNTIME_INF in the case of
> root_task_group). Otherwise, the field is wrong until a quota is
> changed after creation and __cfs_schedulable() is called.
>
> Fixes: c53593e5cb69 ("sched, cgroup: Don't reject lower cpu.max on ancestors")

Does this really fix anything observable? I wonder whether this is more
misleading than helpful. In cgroup2, the value simply wasn't being used,
right?

> Signed-off-by: Phil Auld <pauld@xxxxxxxxxx>
> Reviewed-by: Ben Segall <bsegall@xxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
> Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> Cc: Valentin Schneider <vschneid@xxxxxxxxxx>
> Cc: Ben Segall <bsegall@xxxxxxxxxx>
> Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: Tejun Heo <tj@xxxxxxxxxx>

Acked-by: Tejun Heo <tj@xxxxxxxxxx>

> + * always take the non-RUNTIME_INF min. On cgroup1, only
> + * inherit when no limit is set. In both cases this is used
> + * by the scheduler to determine if a given CFS task has a
> + * bandwidth constraint at some higher level.

The discussion on this comment is stretching too long and this is fine too
but what's worth commenting for cgroup2 is that the limit value itself
doesn't mean anything and we're just latching onto the value used by cgroup1
to track whether there's any limit active or not.

Thanks.

--
tejun