Re: [PATCH 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically

From: Michal Hocko
Date: Thu Dec 08 2016 - 08:47:25 EST


On Thu 08-12-16 21:53:44, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Tue 06-12-16 19:38:38, Tetsuo Handa wrote:
> > > You are trying to increase possible locations of lockups by changing
> > > default behavior of __GFP_NOFAIL.
> >
> > I disagree. I have tried to explain that it is much more important to
> > have fewer silent side effects than optimize for the very worst case. I
> > simply do not see __GFP_NOFAIL lockups so common to even care or tweak
> > their semantic in a weird way. It seems you prefer to optimize for the
> > absolute worst case and even for that case you cannot offer anything
> > better than randomly OOM killing random processes until the system
> > somehow resurrects or panics. I consider this a very bad design. So
> > let's agree to disagree here.
>
> You think that invoking the OOM killer with __GFP_NOFAIL is worse than
> locking up with __GFP_NOFAIL.

Yes and I have explained why.

> But I think that locking up with __GFP_NOFAIL
> is worse than invoking the OOM killer with __GFP_NOFAIL.

Without any actual arguments based just on handwaving.

> If we could agree
> with calling __alloc_pages_nowmark() before out_of_memory() if __GFP_NOFAIL
> is given, we can avoid locking up while minimizing possibility of invoking
> the OOM killer...

I do not understand. We do __alloc_pages_nowmark even when oom is called
for GFP_NOFAIL.

> I suggest "when you change something, ask users who are affected by
> your change" because patch 2 has values-based conflict.
>
[...]
> > > Those callers which prefer lockup over panic can specify both
> > > __GFP_NORETRY and __GFP_NOFAIL.
> >
> > No! This combination just doesn't make any sense. The same way how
> > __GFP_REPEAT | GFP_NOWAIT or __GFP_REPEAT | __GFP_NORETRY make no sense
> > as well. Please use a common sense!
>
> I wonder why I'm accused so much. I mentioned that patch 2 might be a
> garbage because patch 1 alone unexpectedly provided a mean to retry forever
> without invoking the OOM killer.

Which is the whole point of the patch and the changelog is vocal about
that. Even explaining why it is desirable to not override decisions when
the oom killer is not invoked. Please reread that and object if the
argument is not correct.

> You are not describing that fact in the
> description. You are not describing what combinations are valid and
> which flag is stronger requirement in gfp.h (e.g. __GFP_NOFAIL v.s.
> __GFP_NORETRY).

Sigh... I really fail to see why I should describe an impossible gfp
mask combination which is _not_ used in the kernel. Please stop this
strawman, I am really tired of it.

> > Invoking or not invoking the oom killer is the page allocator internal
> > business. No code outside of the MM is to talk about those decisions.
> > The fact that we provide a lightweight allocation mode which doesn't
> > invoke the OOM killer is a mere implementation detail.
>
> __GFP_NOFAIL allocation requests for e.g. fs writeback is considered as
> code inside the MM because they are operations for reclaiming memory.
> Such __GFP_NOFAIL allocation requests should be given a chance to choose
> which one (possibility of lockup by not invoking the OOM killer or
> possibility of panic by invoking the OOM killer) they prefer.

Please be more specific. How and why they should choose that. Which
allocation are we talking about and why do you believe that the current
implementation with access to memory reserves is not sufficient.

> Therefore,
>
> > If you believe that my argumentation is incorrect then you are free to
> > nak the patch with your reasoning. But please stop this nit picking on
> > nonsense combination of flags.
>
> Nacked-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
>
> on patch 2 unless "you explain these patches to __GFP_NOFAIL users and
> provide a mean to invoke the OOM killer if someone chose possibility of
> panic"

I believe that the changelog contains my reasonining and so far I
haven't heard any _argument_ from you why they are wrong. You just
managed to nitpick on an impossible and pointless gfp_mask combination
and some handwaving on possible lockups without any backing arguments.
This is not something I would consider as a basis for a serious nack. So
if you really hate this patch then do please start being reasonable and
put some real arguments into your review without any off topics and/or
strawman arguments without any relevance.

> or "you accept kmallocwd".

Are you serious? Are you really suggesting that your patch has to be
accepted in order to have this one in?

--
Michal Hocko
SUSE Labs