Re: [PATCH 0/9] mm: improve OOM mechanism v2

From: Tetsuo Handa
Date: Thu Apr 30 2015 - 05:44:36 EST


Michal Hocko wrote:
> I mean we should eventually fail all the allocation types but GFP_NOFS
> is coming from _carefully_ handled code paths which is an easier starting
> point than a random code path in the kernel/drivers. So can we finally
> move at least in this direction?

I agree that all the allocation types can fail unless GFP_NOFAIL is given.
But I also expect that no allocation type should fail unless
order > PAGE_ALLOC_COSTLY_ORDER, GFP_NORETRY is given, or the caller has been
chosen as an OOM victim.
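
As a rough illustration only, the expectation above can be written as a
predicate. This is a userspace sketch, not kernel code: the SKETCH_* flag
values and sketch_may_fail() are hypothetical names made up for this example,
and only the threshold of 3 matches the kernel's PAGE_ALLOC_COSTLY_ORDER.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; these are NOT the kernel's values. */
#define SKETCH_NOFAIL       0x1u
#define SKETCH_NORETRY      0x2u
/* The kernel defines PAGE_ALLOC_COSTLY_ORDER as 3. */
#define SKETCH_COSTLY_ORDER 3u

/*
 * Returns true if, under the policy described above, the allocation is
 * allowed to fail. NOFAIL is checked first so that it always wins.
 */
static bool sketch_may_fail(unsigned int flags, unsigned int order,
                            bool is_oom_victim)
{
    if (flags & SKETCH_NOFAIL)
        return false;
    return order > SKETCH_COSTLY_ORDER ||
           (flags & SKETCH_NORETRY) != 0 ||
           is_oom_victim;
}
```

With this reading, an order-0 allocation without NORETRY from a thread that is
not an OOM victim is never expected to fail, whether or not it may enter the
filesystem.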

We already saw in Linux 3.19 what happens if !__GFP_FS allocations fail:
out_of_memory() is called by pagefault_out_of_memory() when a 0x2015a
(!__GFP_FS) allocation fails. To me this means that !__GFP_FS allocations
are effectively in OOM killer context. It is not fair to kill the thread which
triggered a page fault, for that thread may not be using much memory
(unfair from a memory usage point of view) or that thread may be global init
(unfair because it takes down the entire system rather than letting it
survive by killing somebody else).
Also, failing GFP_NOFS/GFP_NOIO allocations which are not triggered by
a page fault generally causes more damage (e.g. taking filesystem error
action) than surviving by killing somebody. Therefore, I think we should not
hesitate to invoke the OOM killer for !__GFP_FS allocations.
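
For reference, the 0x2015a mask can be decoded with a small userspace sketch.
The bit values below are the ones I believe include/linux/gfp.h used around
Linux 3.19; treat them as an assumption, since they differ in other kernel
versions. The V319_* names and v319_mask_has() are made up for this example.

```c
#include <stdbool.h>

/* Assumed ___GFP_* bit values from around Linux 3.19 (illustrative copy). */
#define V319_GFP_HIGHMEM  0x00002u
#define V319_GFP_MOVABLE  0x00008u
#define V319_GFP_WAIT     0x00010u
#define V319_GFP_IO       0x00040u
#define V319_GFP_FS       0x00080u
#define V319_GFP_COLD     0x00100u
#define V319_GFP_HARDWALL 0x20000u

/* Returns true if the given bit is set in the allocation mask. */
static bool v319_mask_has(unsigned int mask, unsigned int bit)
{
    return (mask & bit) != 0;
}
```

Under that assumption, 0x2015a is HIGHMEM | MOVABLE | WAIT | IO | COLD |
HARDWALL: v319_mask_has(0x2015a, V319_GFP_IO) is true, while
v319_mask_has(0x2015a, V319_GFP_FS) is false, which is why the allocation is
described as !__GFP_FS.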

> > Likewise, there is possibility that such memory reserve is used by threads
> > which the OOM victim is not waiting for, for malloc() + memset() causes
> > __GFP_FS allocations.
>
> We cannot be certain without complete dependency tracking. This is
> just a heuristic.

Yes, we cannot be certain without complete dependency tracking. And complete
dependency tracking is too expensive to implement. Dave is recommending that
we focus on not triggering the OOM killer rather than on how to handle
corner cases in OOM conditions, isn't he? I still believe that
choosing more OOM victims upon timeout (which is a heuristic after all) and
invoking the OOM killer for !__GFP_FS allocations are the cheapest and least
surprising approaches. This is something like automatically and periodically
pressing SysRq-f on behalf of the system administrator when the memory
allocator cannot recover from a low-memory situation.