Re: [PATCH v2 2/2] sched: add throttled time stat for throttled children

From: Josh Don
Date: Tue Jun 20 2023 - 14:28:42 EST


Hi Michal,

On Mon, Jun 19, 2023 at 10:53 AM Michal Koutný <mkoutny@xxxxxxxx> wrote:
>
> On Mon, Jun 12, 2023 at 04:27:48PM -0700, Josh Don <joshdon@xxxxxxxxxx> wrote:
> > We currently export the total throttled time for cgroups that are given
> > a bandwidth limit.
>
> I assume you refer to cpu.stat:throttled_usec (from struct
> cfs_bandwidth) -- notice that the value is not properly hierarchical
> despite v2 filename.
>
> > This patch extends this accounting to also account the total time that
> > each children cgroup has been throttled.
>
> IIUC, this is visible on inner-nodes cpu cgroups (i.e. with no tasks)?
>
> IOW, wouldn't you get the intended information if hierarchical summing
> was added/fixed for cpu.stat:throttled_usec?

It isn't currently hierarchical in the sense that the inner nodes
don't themselves account for their throttled time, but the summation
at the top is still correct. This patch is intended to close that gap.

I suppose your question here is why not simply make the existing
throttled_usec export properly hierarchical and avoid the extra stat
export here. I think it would still be useful to expose a
non-hierarchical metric indicating the throttled time due to the
group's own configured limit, since the accounting can look strange
with nested bandwidth limits. I'm not strongly opposed to the idea,
but your hierarchical accounting proposal is essentially what this
patch adds.
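
To make the distinction between the two metrics concrete, here is a
rough userspace sketch. It is only a toy model, not the patch and not
the kernel's data structures: struct toy_group, toy_record_throttle()
and toy_total_throttled() are hypothetical names, and it assumes that
throttle intervals along a path to the root never overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of a cgroup; not the kernel's struct cfs_bandwidth. */
struct toy_group {
    struct toy_group *parent;      /* NULL for the root */
    uint64_t self_throttled_usec;  /* time throttled by this group's own limit */
};

/* Record that g hit its own bandwidth limit for usec microseconds. */
static void toy_record_throttle(struct toy_group *g, uint64_t usec)
{
    assert(g != NULL);
    g->self_throttled_usec += usec;
}

/*
 * Time g could not run: its own throttling plus that of every ancestor.
 * Assumes throttle intervals along the path never overlap; if nested
 * limits throttle at the same time, this simple sum would overcount.
 */
static uint64_t toy_total_throttled(const struct toy_group *g)
{
    uint64_t total = 0;

    for (; g != NULL; g = g->parent)
        total += g->self_throttled_usec;
    return total;
}
```

In this model, self_throttled_usec plays the role of the
non-hierarchical metric (the group's own limit), while
toy_total_throttled() is the hierarchical view seen from a child. The
overlap caveat in the comment is one way the numbers can look strange
once bandwidth limits are nested.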