Re: [PATCH cgroup/for-4.15] cgroup, sched: Move basic cpu stats from cgroup.stat to cpu.stat

From: Tejun Heo
Date: Tue Oct 24 2017 - 10:59:30 EST


Hello, Peter.

On Tue, Oct 24, 2017 at 10:35:04AM +0200, Peter Zijlstra wrote:
> > The more I think about showing cpu stat in cgroup.stat, the uglier it
> > seems.
>
> I've not been paying much attention to this, could you elaborate on the
> problems there?

Sure, so, on cgroup2, the basic stat collects cpu usage info whether
the cpu controller is enabled or not. As the hot path overhead is
always per-cpu and constant, there's no reason not to, and always
having the information is useful, especially as enabling cpu isn't
free of side-effects. This is similar to what cpuacct did in cgroup1,
but cgroup2's single hierarchy makes the dedicated controller
unnecessary.
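
To illustrate what "per-cpu and constant" means here, below is a toy
model in Python. It is only a sketch and not the kernel's data
structures: the Cgroup class, charge() and usage_usec() are
hypothetical names made up for the example. The point is that charging
touches one per-cpu slot of the task's own cgroup no matter how deep
the hierarchy is, and the summing is left to the read side.

```python
# Toy model only; names and layout are hypothetical, not kernel code.
class Cgroup:
    def __init__(self, name, parent=None, nr_cpus=4):
        self.name = name
        self.parent = parent
        self.children = []
        # One counter slot per CPU so the hot path never shares a slot.
        self.percpu_usage_usec = [0] * nr_cpus
        if parent is not None:
            parent.children.append(self)

    def charge(self, cpu, delta_usec):
        # Hot path: a single add to this CPU's slot, independent of
        # how many ancestors the cgroup has.
        self.percpu_usage_usec[cpu] += delta_usec

    def usage_usec(self):
        # Read path: sum this cgroup's slots, then add all descendants.
        total = sum(self.percpu_usage_usec)
        for child in self.children:
            total += child.usage_usec()
        return total


root = Cgroup("root")
a = Cgroup("a", parent=root)
b = Cgroup("b", parent=a)
a.charge(cpu=0, delta_usec=100)
b.charge(cpu=1, delta_usec=50)
print(root.usage_usec(), a.usage_usec(), b.usage_usec())
```

With a single hierarchy a task has exactly one cgroup to charge. In
this model, charging against several independent hierarchies would
mean one charge() per hierarchy.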

The issue at hand with this patch is how the stat is presented when
the controller is not enabled. There were two alternatives.

1. Make it available in the core file cgroup.stat with subsystem
prefix ("cpu." for cpu).

This is easy to implement but a bit ugly because when the
controller is enabled, the same information is available in two
places - cgroup.stat and cpu.stat. Access from userspace becomes
ugly too, especially if we make more fields available in basic
stat.

2. Make $SUBSYS.stat available whether the controller is enabled or
not. Show the basic stat when the controller is disabled, show the
full thing when enabled.

This is somewhat more complicated to implement, and having a
subsystem-specific file around when the controller is not enabled
might be a bit confusing. However, it ensures that a given piece
of information is available in only one place and makes it less
awkward to make more information available through basic stat.

The original implementation went for #1 and this patch switches it to
#2.
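
To show what #2 looks like from userspace, here is a minimal sketch of
a reader that looks in one place, cpu.stat, whether the controller is
enabled or not. It assumes the usual flat-keyed "key value" layout. The
field names in the sample (usage_usec, user_usec, system_usec) come
from the cgroup v2 documentation rather than from this mail, and the
helper names and the path in the comment are hypothetical.

```python
import os


def parse_flat_keyed(text):
    """Parse a flat-keyed cgroup file ("key value" per line) into a dict."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, value = line.split()
        stats[key] = int(value)
    return stats


def read_cpu_stat(cgroup_dir):
    """Read cpu.stat from a cgroup2 directory, e.g. a hypothetical
    /sys/fs/cgroup/mygroup. The caller does not need to know whether
    the cpu controller is enabled."""
    with open(os.path.join(cgroup_dir, "cpu.stat")) as f:
        return parse_flat_keyed(f.read())


# Sample content standing in for a real cpu.stat with only basic stat.
sample = "usage_usec 1000\nuser_usec 600\nsystem_usec 400\n"
basic = parse_flat_keyed(sample)
print(basic["usage_usec"])
```

Under #1 the same reader would have to parse cgroup.stat and strip the
"cpu." prefix when the controller is disabled, then switch to cpu.stat
once it is enabled.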

> > This patch flips it so that "cpu.stat" is always available
> > with basic cpu stat instead. It only changes the presentation and
> > changes to the scheduler code is minimal. Will route with the other
> > cpu controller changes through cgroup/for-4.15 unless there are
> > objections.
>
> And this is -v2 only? I'm a little lost on how all that connects.

Yeap, basic stat is v2 only. We can't do it against multiple
hierarchies with constant overhead.

Thanks.

--
tejun