Re: [PATCH v13 0/7] cgroup-aware OOM killer

From: Michal Hocko
Date: Mon Jan 15 2018 - 06:54:42 EST


On Fri 12-01-18 14:03:03, David Rientjes wrote:
> On Thu, 11 Jan 2018, Roman Gushchin wrote:
>
> > Summarizing all this, following the hierarchy is good when it reflects
> > the "importance" of cgroup's memory for a user, and bad otherwise.
> > In generic case with unified hierarchy it's not true, so following
> > the hierarchy unconditionally is bad.
> >
> > Current version of the patchset allows common evaluation of cgroups
> > by setting memory.groupoom to true. The only limitation is that it
> > also changes the OOM action: all belonging processes will be killed.
> > If we really want to preserve an option to evaluate cgroups
> > together without forcing "kill all processes" action, we can
> > convert memory.groupoom from being boolean to the multi-value knob.
> > For instance, "disabled" and "killall". This will allow to add
> > a third state ("evaluate", for example) later without breaking the API.
> >
>
> No, this isn't how kernel features get introduced. We don't design a new
> kernel feature with its own API for a highly specialized usecase and then
> claim we'll fix the problems later. Users will work around the
> constraints of the new feature, if possible, and then we can end up
> breaking them later. Or, we can pollute the mem cgroup v2 filesystem with
> even more tunables to cover up for mistakes in earlier designs.

This is a blatant misinterpretation of the proposed changes. I haven't
heard _any_ single argument against the proposed user interface except
for complaints for missing tunables. This is not how the kernel
development works and should work. The usecase was clearly described and
far from limited to a single workload or company.

> The key point to all three of my objections: extensibility.

And it has been argued that further _features_ can be added on top. I am
absolutely fed up discussing those things again and again without any
progress. You simply keep _ignoring_ counter arguments and that is a
major PITA to be honest with you. You are basically blocking a useful
feature because it doesn't solve your particular workload. This is
simply not acceptable in the kernel development.

> Both you and Michal have acknowledged blantently obvious shortcomings in
> the design.

What you call blatant shortcomings we do not see affecting any
_existing_ workloads. If they turn out to be real issues then we can fix
them without deprecating any user APIs added by this patchset.

Look, I am not the one who is a heavy container user. The primary reason
I am supporting this is because a) it has a _real_ user and I can see
more to emerge and b) because it _makes_ sense in its current form and
it doesn't bring heavy maintenance burden in the OOM code base which I
do care about.

I am deliberately skipping the rest of your email because it is _yet_
_again_ a form of obstructing the current patchset which is what you
have been doing for quite some time. I will leave the final decision
for merging to Andrew. If you want to build a more fine grained control
on top, you are free to do so. I will be reviewing those like any other
upstream oom changes.

--
Michal Hocko
SUSE Labs