Re: [RFC 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically

From: Michal Hocko
Date: Thu Nov 24 2016 - 02:51:19 EST


On Thu 24-11-16 08:41:30, Vlastimil Babka wrote:
> On 11/23/2016 01:35 PM, Michal Hocko wrote:
> > On Wed 23-11-16 13:19:20, Vlastimil Babka wrote:
[...]
> > > > static inline struct page *
> > > > +__alloc_pages_nowmark(gfp_t gfp_mask, unsigned int order,
> > > > + const struct alloc_context *ac)
> > > > +{
> > > > + struct page *page;
> > > > +
> > > > + page = get_page_from_freelist(gfp_mask, order,
> > > > + ALLOC_NO_WATERMARKS|ALLOC_CPUSET, ac);
> > > > + /*
> > > > + * fallback to ignore cpuset restriction if our nodes
> > > > + * are depleted
> > > > + */
> > > > + if (!page)
> > > > + page = get_page_from_freelist(gfp_mask, order,
> > > > + ALLOC_NO_WATERMARKS, ac);
> > >
> > > Is this enough? Look at what __alloc_pages_slowpath() does since
> > > e46e7b77c909 ("mm, page_alloc: recalculate the preferred zoneref if the
> > > context can ignore memory policies").
> >
> > this is a one time attempt to do the nowmark allocation. If we need to
> > do the recalculation then this should happen in the next round. Or am I
> > missing your question?
>
> The next round no-watermarks allocation attempt in __alloc_pages_slowpath()
> uses different criteria than the new __alloc_pages_nowmark() callers. And it
> would be nicer to unify this as well, if possible.

I am sorry but I still do not see your point. Could you be more specific
about why it matters? In other words, this is what we were already doing
prior to this patch, so I am not changing the behavior. I just wrapped it
into a helper because I have to do the same thing in two places, for the
oom and no-oom paths.
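
To make the shape of the helper concrete, here is a minimal userspace
sketch of the "try within the cpuset first, then ignore it" pattern. It is
not the kernel code: the model_* names, the flag values and the toy memory
state are all made up for illustration, and only the two-step fallback
mirrors what the quoted hunk does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's ALLOC_* flags; values are arbitrary. */
#define MODEL_ALLOC_NO_WATERMARKS 0x1u
#define MODEL_ALLOC_CPUSET        0x2u

/* Toy memory state: free pages inside and outside the caller's cpuset. */
struct model_state {
	int free_in_cpuset;
	int free_outside_cpuset;
};

struct model_page {
	bool from_cpuset;
};

static struct model_page page_inside = { true };
static struct model_page page_outside = { false };

/* Toy replacement for get_page_from_freelist(): honours the cpuset flag. */
static struct model_page *model_get_page(struct model_state *st,
					 unsigned int alloc_flags)
{
	/* This model only covers the no-watermark attempts. */
	assert(alloc_flags & MODEL_ALLOC_NO_WATERMARKS);

	if (st->free_in_cpuset > 0) {
		st->free_in_cpuset--;
		return &page_inside;
	}
	/* Pages outside the cpuset are usable only without the restriction. */
	if (!(alloc_flags & MODEL_ALLOC_CPUSET) && st->free_outside_cpuset > 0) {
		st->free_outside_cpuset--;
		return &page_outside;
	}
	return NULL;
}

/* Same shape as the helper: one attempt with the cpuset, one without. */
static struct model_page *model_alloc_pages_nowmark(struct model_state *st)
{
	struct model_page *page;

	page = model_get_page(st, MODEL_ALLOC_NO_WATERMARKS | MODEL_ALLOC_CPUSET);
	/* fall back to ignoring the cpuset restriction if our nodes are depleted */
	if (!page)
		page = model_get_page(st, MODEL_ALLOC_NO_WATERMARKS);
	return page;
}
```

Both the oom and the no-oom path would call the single helper, which is
the only reason for factoring it out.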

> > > > - }
> > > > /* Exhausted what can be done so it's blamo time */
> > > > - if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
> > > > + if (out_of_memory(&oc)) {
> > >
> > > This removes the warning, but also the check for __GFP_NOFAIL itself. Was it
> > > what you wanted?
> >
> > The point of the check was to keep looping for __GFP_NOFAIL requests
> > even when the OOM killer is disabled (out_of_memory returns false). We
> > are accomplishing that by
> > >
> > > > *did_some_progress = 1;
> > ^^^^ this
>
> But oom disabled means that this line is not reached?

Yes, but it doesn't need to be reached anymore because the first patch
added the "check NOFAIL on nopage" check to the allocator slow path. We
didn't have that previously, so we had to "cheat" here.
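
As a rough illustration of that control flow, here is a small userspace
model. It is only a sketch under my own assumptions: the model_* names,
the flag value and the "memory shows up after N attempts" behavior are
hypothetical, and the real slow path does much more. It only shows that
the retry for NOFAIL can live at the "nopage" step, so the OOM step no
longer has to fake progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for __GFP_NOFAIL; the value is arbitrary. */
#define MODEL_GFP_NOFAIL 0x1u

struct model_ctx {
	bool oom_killer_disabled;
	int attempts_until_memory;	/* memory appears after this many failures */
	int attempts;			/* allocation attempts made so far */
};

static int the_page;

/* OOM step: reports progress only if the OOM killer could actually run. */
static bool model_oom_step(const struct model_ctx *ctx)
{
	return !ctx->oom_killer_disabled;
}

/* Toy allocation attempt: fails until enough attempts have been made. */
static void *model_try_alloc(struct model_ctx *ctx)
{
	assert(ctx->attempts_until_memory >= 0);
	ctx->attempts++;
	return ctx->attempts > ctx->attempts_until_memory ? &the_page : NULL;
}

static void *model_slowpath(struct model_ctx *ctx, unsigned int gfp_mask)
{
	for (;;) {
		void *page = model_try_alloc(ctx);

		if (page)
			return page;
		/* The OOM step made progress, so retry the allocation. */
		if (model_oom_step(ctx))
			continue;
		/* "nopage": NOFAIL requests keep looping, everything else fails. */
		if (gfp_mask & MODEL_GFP_NOFAIL)
			continue;
		return NULL;
	}
}
```

With the OOM killer disabled, a NOFAIL request in this model keeps
retrying until memory shows up, while a regular request fails after the
first attempt.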

--
Michal Hocko
SUSE Labs