Re: [patch] mm, oom: prevent soft lockup on memcg oom for UP systems

From: Michal Hocko
Date: Mon Mar 16 2020 - 06:14:59 EST


On Mon 16-03-20 19:04:44, Tetsuo Handa wrote:
> On 2020/03/16 18:31, Michal Hocko wrote:
> >> What happens if the allocator has SCHED_FIFO?
> >
> > The same thing as a SCHED_FIFO running in a tight loop in the userspace.
> >
> > As long as a high priority context depends on a resource held by a low
> > priority task then we have a priority inversion problem and the page
> > allocator is no real exception here. But I do not see the allocator
> > is much different from any other code in the kernel. We do not add
> > random sleeps here and there to push a high priority FIFO or RT tasks
> > out of the execution context. We do cond_resched to help !PREEMPT
> > kernels but priority related issues are really out of scope of that
> > facility.
> >
>
> Spinning with realtime priority in userspace is a userspace's bug.
> Spinning with realtime priority in kernelspace until watchdog fires is
> a kernel's bug. We are not responsible for userspace's bug, and I'm
> asking whether the memory allocator kernel code can give enough CPU
> time to other threads even if current thread has realtime priority.

We've been through that discussion many times and the core point is that
this requires major surgery to work properly. It is not just a matter of
adding a sleep to the page allocator and being done with it. The page
allocator cannot really do much on its own. It relies on many other
contexts to make forward progress. What you are really asking for is far
from trivial. Maybe you are looking for something much closer to the RT
kernel than what the other preemption modes can currently offer.

Right now, you really have to be careful when running FIFO/RT processes
and plan their resources very carefully. Is that ideal? Not really, but
considering that this has been the status quo for many years, it seems
that the use cases tend to find their way around that restriction.
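
To illustrate what planning resources can look like from the userspace
side, here is a rough sketch, not taken from any real project. The helper
names prefault_buffer() and enter_fifo() are made up for this example.
The idea is to allocate and touch all memory up front and lock it, so
that the task does not enter the page allocator once it runs as
SCHED_FIFO. This reduces the exposure; it does not remove the underlying
priority inversion problem.

```c
#include <errno.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: allocate len bytes and write to all of them so
 * that every page is faulted in before the time-critical phase starts.
 * Returns NULL on allocation failure.
 */
void *prefault_buffer(size_t len)
{
	void *buf = malloc(len);

	if (!buf)
		return NULL;
	memset(buf, 0, len);	/* touches every page of the buffer */
	return buf;
}

/*
 * Hypothetical helper: lock current and future mappings into RAM, then
 * switch the calling thread to SCHED_FIFO with the given priority.
 * Returns 0 on success or a negative errno value (typically -EPERM or
 * -ENOMEM when the caller lacks the privileges or the memlock limit).
 */
int enter_fifo(int prio)
{
	struct sched_param sp = { .sched_priority = prio };

	if (mlockall(MCL_CURRENT | MCL_FUTURE))
		return -errno;
	if (sched_setscheduler(0, SCHED_FIFO, &sp))
		return -errno;
	return 0;
}
```

A caller would run prefault_buffer() for everything it needs first and
only then call enter_fifo(), treating a non-zero return as a reason to
stay in the default scheduling class.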
--
Michal Hocko
SUSE Labs