Re: [RFC PATCH 0/3] mm: memcontrol: delayed force empty

From: Michal Hocko
Date: Thu Jan 03 2019 - 14:23:44 EST


On Thu 03-01-19 11:10:00, Yang Shi wrote:
>
>
> On 1/3/19 10:53 AM, Michal Hocko wrote:
> > On Thu 03-01-19 10:40:54, Yang Shi wrote:
> > >
> > > On 1/3/19 10:13 AM, Michal Hocko wrote:
[...]
> > > > Is there any reason for your scripts to be strictly sequential here? In
> > > > other words why cannot you offload those expensive operations to a
> > > > detached context in _userspace_?
> > > > I would say it does not have to be strictly sequential. The above script
> > > > is just an example to illustrate the pattern. But sometimes it may hit such
> > > > a pattern due to the complicated cluster scheduling and container scheduling
> > > > in the production environment, for example the creation process might be
> > > > scheduled to the same CPU which is doing force_empty. I have to say I don't
> > > > know too much about the internals of the container scheduling.
> > > In that case I do not see a strong reason to implement the offloading
> > > into the kernel. It is additional code and semantics to maintain.
>
> Yes, it does introduce some additional code and semantics, but IMHO, it is
> quite simple and very straightforward, isn't it? It just utilizes the existing
> css offline worker. And those couple of lines of code do improve throughput
> for some real use cases.

I do not really care that it is only a few LOC. It is more important that it
is conflating force_empty with the offlining logic. There was a good reason to
remove reparenting/emptying the memcg during the offline. Considering
that you can offload force_empty from userspace trivially, I do not
see any reason to implement it in the kernel.
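
To illustrate what offloading from userspace could look like, here is a
minimal sketch. It assumes a cgroup v1 memory controller, where writing to
memory.force_empty blocks until reclaim is done. The function names and the
`remove` flag are made up for this example, and a worker thread stands in
for whatever detached context the management tooling prefers.

```python
import os
import threading


def force_empty_and_remove(cgroup_dir, remove=True):
    """Reclaim the group's charged pages, then optionally remove the group.

    cgroup_dir is assumed to be a cgroup v1 memory controller directory.
    """
    # The write to memory.force_empty is the slow part; it returns only
    # after the kernel has tried to reclaim the group's pages.
    with open(os.path.join(cgroup_dir, "memory.force_empty"), "w") as f:
        f.write("0")
    if remove:
        # On cgroupfs, rmdir succeeds once the group has no tasks or children.
        os.rmdir(cgroup_dir)


def offload_force_empty(cgroup_dir, remove=True):
    """Run the reclaim in a worker thread so the caller is not blocked."""
    worker = threading.Thread(
        target=force_empty_and_remove, args=(cgroup_dir, remove)
    )
    worker.start()
    return worker
```

The caller can go on creating the next container right away and join the
returned thread later, or never, if it does not care when the old group
goes away.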

> > I think it is more important to discuss whether we want to introduce
> > force_empty in cgroup v2.
>
> We would prefer to have it in v2 as well.

Then bring this up in a separate email thread please.
--
Michal Hocko
SUSE Labs