Re: [External] Re: [PATCH] cgroup/rstat: record the cumulative per-cpu time of cgroup and its descendants

From: Hao Jia
Date: Tue Jul 18 2023 - 23:02:31 EST




On 2023/7/19 Tejun Heo wrote:
> On Tue, Jul 18, 2023 at 06:08:50PM +0800, Hao Jia wrote:
>> https://github.com/jiaozhouxiaojia/cgv2-stat-percpu_test/tree/main
>
> So, we run `stress -c 1` for 1 second in the asdf/test0 cgroup and
> asdf/cpu.stat correctly reports the cumulative usage. After removing
> asdf/test0 cgroup, asdf's usage_usec is still there. What's missing here?

Sorry, my earlier wording may have been misleading.

Yes, cpu.stat displays the cumulative **global** cpu time of the cgroup and its descendants (the corresponding kernel variable is "cgrp->bstat"), and that time is not lost when a child cgroup is removed.

Similarly, we need a **per-cpu** variable to record the accumulated per-cpu time of a cgroup and its descendants.
The existing kernel variable "cgroup_rstat_cpu(cgrp, cpu)->bstat" is not sufficient for this, because it only records the per-cpu time of the cgroup itself.
So I am trying to add "cgroup_rstat_cpu(cgrp, cpu)->cumul_bstat" to record the per-cpu time of the cgroup and its descendants.

To verify the correctness of my patch, I wrote a kernel module that computes the per-cpu time of a cgroup and its descendants in two ways and compares the results:
Method 1. Traverse the cgroup and each of its descendants and sum their per-cpu rstatc->bstat.
Method 2. Directly read "cgroup_rstat_cpu(cgrp, cpu)->cumul_bstat" in the kernel.

When the child cgroup is not removed, the results calculated by the two methods should be equal.
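
To make the comparison concrete, here is a minimal userspace sketch of the two methods. It is only a toy model, not the kernel module or the patch: "struct toy_cgroup" and all of the toy_* helpers are hypothetical names, the per-cpu data is a plain array, and the cumulative counter is updated eagerly on every charge, whereas the real code would propagate it through the rstat flush path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS      2
#define MAX_CHILDREN 4

/* Toy stand-in for a cgroup; NOT the kernel's struct cgroup. */
struct toy_cgroup {
	struct toy_cgroup *parent;
	struct toy_cgroup *children[MAX_CHILDREN];
	size_t nr_children;
	uint64_t bstat[NR_CPUS];       /* per-cpu time of this cgroup only */
	uint64_t cumul_bstat[NR_CPUS]; /* per-cpu time of this cgroup and its descendants */
};

static void toy_add_child(struct toy_cgroup *parent, struct toy_cgroup *child)
{
	assert(parent->nr_children < MAX_CHILDREN);
	child->parent = parent;
	parent->children[parent->nr_children++] = child;
}

/* Drop the most recently added child; its own counters go away with it. */
static void toy_remove_last_child(struct toy_cgroup *parent)
{
	assert(parent->nr_children > 0);
	parent->nr_children--;
	parent->children[parent->nr_children]->parent = NULL;
	parent->children[parent->nr_children] = NULL;
}

/*
 * Charge delta to cgrp on the given cpu and add it to the cumulative
 * counter of cgrp and every ancestor (eager propagation, an assumption
 * of this sketch).
 */
static void toy_account(struct toy_cgroup *cgrp, int cpu, uint64_t delta)
{
	cgrp->bstat[cpu] += delta;
	for (struct toy_cgroup *pos = cgrp; pos; pos = pos->parent)
		pos->cumul_bstat[cpu] += delta;
}

/* Method 1: walk the subtree and sum each cgroup's own per-cpu bstat. */
static uint64_t toy_sum_subtree(const struct toy_cgroup *cgrp, int cpu)
{
	uint64_t sum = cgrp->bstat[cpu];

	for (size_t i = 0; i < cgrp->nr_children; i++)
		sum += toy_sum_subtree(cgrp->children[i], cpu);
	return sum;
}

/* Method 2: read the cumulative per-cpu counter directly. */
static uint64_t toy_read_cumul(const struct toy_cgroup *cgrp, int cpu)
{
	return cgrp->cumul_bstat[cpu];
}
```

In this toy model, toy_sum_subtree() and toy_read_cumul() agree for every cpu while all children are still attached. After toy_remove_last_child(), the traversal in Method 1 can no longer see the removed child's time, while the cumulative counter in Method 2 still includes it.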

> What are you adding?
I want to add a **per-cpu variable** that records the cumulative per-cpu time of a cgroup and its descendants. It is similar to the variable "cgrp->bstat", but it is per-cpu.
It is useful and convenient for calculating the usage of a cgroup on each cpu, and its behavior is similar to the "cpuacct.usage*" interface of cgroup v1.

Thanks,
Hao