Re: System freezes after OOM

From: Michal Hocko
Date: Mon Jul 18 2016 - 03:39:24 EST


On Fri 15-07-16 14:47:30, David Rientjes wrote:
> On Fri, 15 Jul 2016, Michal Hocko wrote:
[...]
> > And let me repeat that your proposed patch
> > has undesirable side effects, so we should think about a way to deal
> > with those cases. It might work for your setups but it shouldn't break
> > others at the same time. An OOM situation is quite unlikely compared to
> > simple memory depletion by writing to swap...
> >
>
> I haven't proposed any patch, not sure what the reference is to.

I was talking about f9054c70d28b ("mm, mempool: only set
__GFP_NOMEMALLOC if there are free elements"). Do you at least recognize
that it has caused a regression, and that this regression is more likely
to be hit than the OOM lockup you are referring to, which might be very
specific to your particular workload? I would really like to move on here
and come up with a fix which can handle dm-crypt swapout gracefully and
also deal with the typical case where the OOM victim is inside
mempool_alloc(). That should help your use case as well (at least the
writeout path).
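
To make the difference concrete, here is a minimal userspace sketch of
the two flag policies, going only by the subject line of that commit.
It is not the kernel code: the model_* names, the struct and the flag
value are all made up for illustration, and only the decision about
__GFP_NOMEMALLOC is modelled.

```c
#include <stdbool.h>

/* Illustrative bit only; this is NOT the kernel's real flag value. */
#define MODEL_GFP_NOMEMALLOC 0x1u  /* stands in for __GFP_NOMEMALLOC */

/* Hypothetical stand-in for the one mempool field that matters here. */
struct model_pool {
    int curr_nr;  /* elements currently available in the pool */
};

/* Policy before the commit: never let the allocation touch reserves. */
static unsigned int model_flags_always(unsigned int gfp_mask)
{
    return gfp_mask | MODEL_GFP_NOMEMALLOC;
}

/*
 * Policy named in the commit subject: forbid reserves only while the
 * pool still has free elements to fall back on.
 */
static unsigned int model_flags_conditional(const struct model_pool *pool,
                                            unsigned int gfp_mask)
{
    if (pool->curr_nr > 0)
        gfp_mask |= MODEL_GFP_NOMEMALLOC;
    return gfp_mask;
}

/* True when the modelled allocation would be allowed to use reserves. */
static bool model_may_use_reserves(unsigned int flags)
{
    return !(flags & MODEL_GFP_NOMEMALLOC);
}
```

With an empty pool (curr_nr == 0) the conditional policy leaves the bit
clear, so the allocation is no longer kept away from the reserves. That
is the case I am worried about.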

> There are
> two fundamental ways to go about it: (1) ensure mempool_alloc() can make
> forward progress (whether that's by way of gfp flags or access to memory
> reserves, which may depend on the process context such as PF_MEMALLOC) or
> (2) rely on an implementation detail of mempools to never access memory
> reserves, although it is shown to not livelock systems on 4.7 and earlier
> kernels, and instead rely on users of the same mempool to return elements
> to the freelist in all contexts, including oom contexts. The mempool
> implementation itself shouldn't need any oom awareness, that should be a
> page allocator issue.

OK, I agree that we have a certain layer violation here. __GFP_NOMEMALLOC at
the mempool level is kind of a hack (like the whole existence of the
flag TBH). So if you believe that the OOM part should be handled at the
page allocator level, then that has already been proposed in
http://lkml.kernel.org/r/2d5e1f84-e886-7b98-cb11-170d7104fd13@xxxxxxxxxxxxxxxxxxx
and was not welcomed because it might have other side effects, as _all_
__GFP_NOMEMALLOC users would be affected.

--
Michal Hocko
SUSE Labs