Re: Machine lockups on extreme memory pressure

From: Michal Hocko
Date: Tue Sep 22 2020 - 11:16:58 EST


On Tue 22-09-20 06:37:02, Shakeel Butt wrote:
[...]
> > I would recommend focusing on tracking down who is blocking
> > further progress.
>
> I was able to find the CPU next in line for the list_lock from the
> dump. I don't think anyone is blocking the progress as such but more
> like the spinlock in the irq context is starving the spinlock in the
> process context. This is a high traffic machine and there are tens of
> thousands of potential network ACKs on the queue.

So there is forward progress, but it is too slow for userspace to make
any reasonable progress?

> I talked about this problem with Johannes at LPC 2019 and I think we
> talked about two potential solutions. The first was to somehow give
> memory reserves to oomd and the second was an in-kernel PSI-based
> oom-killer. I am not sure the first one will work in this situation
> but the second one might help.

Why does your oomd depend on memory allocation?
--
Michal Hocko
SUSE Labs