Re: [PATCH] oom killer: break from infinite loop

From: anfei
Date: Mon Mar 29 2010 - 07:31:28 EST


On Sun, Mar 28, 2010 at 06:28:21PM +0200, Oleg Nesterov wrote:
> On 03/28, anfei wrote:
> >
> > On Fri, Mar 26, 2010 at 11:33:56PM +0100, Oleg Nesterov wrote:
> >
> > > Off-topic, but we shouldn't use force_sig(), SIGKILL doesn't
> > > need "force" semantics.
> > >
> > This may need a dedicated patch, there are some other places to
> > force_sig(SIGKILL, ...) too.
>
> Yes, yes, sure.
>
> > > I'd wish I could understand the changelog ;)
> > >
> > Assume thread A and B are in the same group. If A runs into the oom,
> > and selects B as the victim, B won't exit because at least in exit_mm(),
> > it can not get the mm->mmap_sem semaphore which A has already got.
>
> I see. But still I can't understand. To me, the problem is not that
> B can't exit, the problem is that A doesn't know it should exit. All

If B could exit, its memory would be freed and A's allocation would
then succeed, so A would not keep looping here.
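
To make the scenario concrete, here is a toy userspace model of it. This
is only a sketch and not kernel code: every name in it (toy_thread,
toy_oom_kill, toy_alloc_loop, toy_run) is made up for illustration, the
"memdie" field merely stands in for TIF_MEMDIE, and the retry bound
replaces what would be an endless loop. It assumes A holds mmap_sem while
retrying and that B cannot finish exiting until that lock is dropped.

```c
/* Toy userspace model, NOT kernel code. All names are hypothetical. */
#include <stdbool.h>
#include <stddef.h>

struct toy_thread {
	bool memdie;         /* stands in for the per-thread TIF_MEMDIE */
	bool holds_mmap_sem; /* true while the thread holds the mm lock */
	bool exited;
};

/* Mark only the chosen victim, or every thread in the group. */
static void toy_oom_kill(struct toy_thread *group, size_t n,
			 size_t victim, bool whole_group)
{
	for (size_t i = 0; i < n; i++)
		if (whole_group || i == victim)
			group[i].memdie = true;
}

/*
 * Thread a retries an allocation while holding mmap_sem; thread b is the
 * victim and needs mmap_sem to get through its exit path.
 * Returns the number of retries a made before giving up, or -1 if a is
 * still looping after max_retries.
 */
static int toy_alloc_loop(struct toy_thread *group, size_t a, size_t b,
			  int max_retries)
{
	for (int tries = 0; tries < max_retries; tries++) {
		if (group[a].memdie) {
			/* a sees it is dying: fail the allocation, unlock. */
			group[a].holds_mmap_sem = false;
			group[a].exited = true;
			/* b can now take the lock and finish exiting. */
			group[b].exited = group[b].memdie;
			return tries;
		}
		/* b is marked but blocked on a's lock; nothing is freed. */
	}
	return -1;
}

/* Run the two-thread scenario with per-thread or group-wide marking. */
static int toy_run(bool whole_group)
{
	struct toy_thread group[2] = {
		{ .holds_mmap_sem = true }, /* A: allocating under mmap_sem */
		{ 0 },                      /* B: selected as the victim */
	};

	toy_oom_kill(group, 2, 1, whole_group);
	return toy_alloc_loop(group, 0, 1, 1000);
}
```

In this model toy_run(false) returns -1 (only B is marked, so A never
stops retrying), while toy_run(true) returns 0 (A sees the flag on its
first pass and gives up, which lets B exit).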

Regards,
Anfei.

> threads should exit and free ->mm. Even if B could exit, this is not
> enough. And, to some extent, it doesn't matter if it holds mmap_sem
> or not.
>
> Don't get me wrong. Even if I don't understand oom_kill.c the patch
> looks obviously good to me, even from "common sense" pov. I am just
> curious.
>
> So, my understanding is: we are going to kill the whole thread group
> but TIF_MEMDIE is per-thread. Mark the whole thread group as TIF_MEMDIE
> so that any thread can notice this flag and (say, __alloc_pages_slowpath)
> fail asap.
>
> Is my understanding correct?
>
> Oleg.
>