Re: OOM killer not nearly aggressive enough?

From: Michal Hocko
Date: Fri Jan 10 2020 - 01:31:53 EST


On Thu 09-01-20 23:48:45, Pavel Machek wrote:
> Hi!
>
> > > > > Do we agree that OOM killer should have reacted way sooner?
> > > >
> > > > This is impossible to answer without knowing what was going on at the
> > > > time. Was the system thrashing over page cache/swap? In other words, is
> > > > the system completely out of memory or refaulting the working set all
> > > > the time because it doesn't fit into memory?
> > >
> > > Swap was full, so "completely out of memory", I guess. Chromium does
> > > that fairly often :-(.
> >
> > The oom heuristic is based on the reclaim failure. If the reclaim makes
> > some progress then the oom killer is not hit. Have a look at
> > should_reclaim_retry for more details.
>
> Thanks for the pointer.
>
> I guess setting MAX_RECLAIM_RETRIES to 1 is not something you'd
> recommend? :-).

You can certainly play with that. I am not overly optimistic that it
would help, though, because a symptom of a thrashing system is that we
do not even reach this point. Pages are simply recycled, but they evict
other parts of the hot working set. But I am only guessing at what the
problem is in your case. Anyway, a lower MAX_RECLAIM_RETRIES would tend
to be more timing sensitive in general. If reclaim progress cannot be
made because of IO latencies or other resource depletion, then the OOM
would be declared too early. The current MAX_RECLAIM_RETRIES is not
something we have tuned in any sense. I remember that changing it didn't
make much difference unless the number was really high, which would be a
signal that the reclaim is not throttled very well.
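
To illustrate why lowering the limit does not help a thrashing system,
here is a toy model of the no-progress counter only. It is a sketch, not
the kernel code: the real should_reclaim_retry also looks at watermarks
and the allocation order, and the function name and the per-round
"pages reclaimed" input are made up for the example. The value 16 is, as
far as I recall, the current default.

```python
MAX_RECLAIM_RETRIES = 16  # assumed default; check your kernel tree


def rounds_until_oom(progress_per_round, max_retries=MAX_RECLAIM_RETRIES):
    """Return the 1-based reclaim round at which OOM would be declared.

    progress_per_round: iterable of page counts reclaimed in each round
    (hypothetical input, for illustration only).
    Returns None if the retry limit is never exceeded.
    """
    no_progress_loops = 0
    for round_no, reclaimed in enumerate(progress_per_round, start=1):
        if reclaimed > 0:
            # Any progress resets the counter, even if the reclaimed
            # pages are immediately refaulted by somebody else.
            no_progress_loops = 0
        else:
            no_progress_loops += 1
        if no_progress_loops > max_retries:
            return round_no
    return None


# Reclaim never makes progress: OOM after max_retries + 1 rounds.
print(rounds_until_oom([0] * 100))
# Thrashing: every round recycles a few pages, so the limit is never hit,
# no matter how small max_retries is.
print(rounds_until_oom([1] * 1000, max_retries=1))
```

In this model only a run of rounds with zero progress triggers the OOM,
which is why a system that keeps recycling pages can stay below the
limit indefinitely.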

> > > PSI is a completely different system, but I guess
> > > I should attempt to tweak the existing one first...
> >
> > PSI is measuring the cost of the allocation (among other things) and
> > that can give you some idea on how much time is spent to get memory.
> > Userspace can implement a policy based on that and act. The kernel oom
> > killer is the last resort when there is really no memory to
> > allocate.
>
> So what I'm seeing is a system that is unresponsive, easily for an hour.
>
> Sometimes, I'm able to log in. When I could do that, the system was
> absurdly slow, like ps printing at more than 10 seconds per line.
> ps on my system takes 300msec; the estimate in the slow case would be
> 2000 seconds, that is a slowdown by a factor of 6000. That would be an
> X terminal opening in like two hours... that's not really usable.

It would be great to find out what the bottleneck is. Is the allocator
stuck in the memory reclaim? Waiting on some lock? Reclaiming pages
which are stolen by other contending processes?
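
As a first data point, the PSI numbers mentioned earlier can be sampled
from userspace while the system is still responsive enough to run
anything. Below is a minimal sketch, assuming a kernel built with PSI
support that exposes /proc/pressure/memory in the usual "some"/"full"
format. The helper names and the 10% threshold are hypothetical, not
something the kernel or any existing tool defines.

```python
def parse_psi(text):
    """Parse PSI text into {"some": {"avg10": ..., "total": ...}, ...}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        values = {}
        for pair in pairs:
            key, _, raw = pair.partition("=")
            # "total" is a stall time counter in microseconds,
            # the avg* fields are percentages.
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


def memory_is_stalled(psi, full_avg10_limit=10.0):
    """Hypothetical policy: act when all non-idle tasks were stalled on
    memory for at least full_avg10_limit percent of the last 10 seconds."""
    return psi.get("full", {}).get("avg10", 0.0) >= full_avg10_limit


def read_memory_pressure(path="/proc/pressure/memory"):
    """Read and parse the current memory pressure (requires PSI support)."""
    with open(path) as f:
        return parse_psi(f.read())
```

A userspace daemon could poll read_memory_pressure() and decide what to
do once memory_is_stalled() returns True. What it does then (kill
something, freeze a cgroup, just log) is a policy decision.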

--
Michal Hocko
SUSE Labs