Re: [PATCH v2] mm: terminate shrink_slab loop if signal is pending

From: Suren Baghdasaryan
Date: Mon Dec 11 2017 - 16:05:59 EST


On Sat, Dec 9, 2017 at 12:08 AM, Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> Suren Baghdasaryan wrote:
>> On Fri, Dec 8, 2017 at 6:03 AM, Tetsuo Handa
>> <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>> >> > >> This change checks for pending
>> >> > >> fatal signals inside shrink_slab loop and if one is detected
>> >> > >> terminates this loop early.
>> >> > >
>> >> > > This changelog doesn't really address my previous review feedback, I am
>> >> > > afraid. You should mention more details about problems you are seeing
>> >> > > and what causes them.
>>
>> The problem I'm facing is that a SIGKILL sent from user space to kill
>> the least important process is delayed enough for OOM-killer to get a
>> chance to kill something else, possibly a more important process. Here
>> "important" is from user's point of view. So the delay in SIGKILL
>> delivery effectively causes extra kills. Traces indicate that this
>> delay happens when process being killed is in direct reclaim and
>> shrinkers (before I fixed them) were the biggest cause for the delay.
>
> Sending SIGKILL from userspace is not releasing memory fast enough to prevent
> the OOM killer from invoking? Yes, under memory pressure, even an attempt to
> send SIGKILL from userspace could be delayed due to e.g. page fault.
>

I understand that there will be cases when OOM is unavoidable. I'm
trying to minimize the cases when SIGKILL processing is delayed.

> Unless it is memcg OOM, you could try OOM notifier callback for checking
> whether there are SIGKILL pending processes and wait for timeout if any.
> This situation resembles timeout-based OOM killing discussion, where the OOM
> killer is enabled again (based on an assumption that the OOM victim got stuck
> due to OOM lockup) after some timeout from previous OOM-killing. And since
> we did not merge timeout-based approach, there is no timestamp field for
> remembering when the SIGKILL was delivered. Therefore, an attempt to wait for
> timeout would become messy.
>
> Regardless of whether you try OOM notifier callback, I think that adding
> __GFP_KILLABLE and allow bailing out without calling out_of_memory() and
> warn_alloc() will help terminating killed processes faster. I think that
> majority of memory allocation requests can give up upon SIGKILL. Important
> allocation requests which should not give up upon SIGKILL (e.g. committing
> to filesystem metadata and storage) can be offloaded to kernel threads.
>
>>
>> >> > > If we have a shrinker which takes considerable
>> >> > > amount of time them we should be addressing that. If that is not
>> >> > > possible then it should be documented at least.
>>
>> I already submitted patches for couple shrinkers. Problem became less
>> pronounced and less frequent but the retry loop Tetsuo mentioned still
>> visibly delays the delivery. The worst case I've seen after fixing
>> shrinkers is 43ms.
>
> You meant "delays the termination (of the killed process)" rather than
> "delays the delivery (of SIGKILL)", didn't you?
>

To be more precise, I'm talking about the delay between send_signal()
and get_signal() or in other words the delay between the moment when
we send the SIGKILL and the moment we handle it.

>> > I agree that making waits/loops killable is generally good. But be sure to be
>> > prepared for the worst case. For example, start __GFP_KILLABLE from "best effort"
>> > basis (i.e. no guarantee that the allocating thread will leave the page allocator
>> > slowpath immediately) and check for fatal_signal_pending() only if
>> > __GFP_KILLABLE is set. That is,
>> >
>> > + /*
>> > + * We are about to die and free our memory.
>> > + * Stop shrinking which might delay signal handling.
>> > + */
>> > + if (unlikely((gfp_mask & __GFP_KILLABLE) && fatal_signal_pending(current)))
>> > + break;
>> >
>> > at shrink_slab() etc. and
>> >
>> > + if ((gfp_mask & __GFP_KILLABLE) && fatal_signal_pending(current))
>> > + goto nopage;
>> >
>> > at __alloc_pages_slowpath().
>>
>> I was thinking about something similar and will experiment to see if
>> this solves the problem and if it has any side effects. Anyone sees
>> any obvious problems with this approach?
>>
>
> It is Michal who thinks that "killability for a particular allocation request sounds
> like a hack" ( http://lkml.kernel.org/r/201606112335.HBG09891.OLFJOFtVMOQHSF@xxxxxxxxxxxxxxxxxxx ).
> I'm willing to give up memory allocations from functions which are called by
> system calls if SIGKILL is pending. Thus, it should be time to try __GFP_KILLABLE.