Re: [RFC][PATCH] memcg: page fault oom improvement v2

From: David Rientjes
Date: Tue Feb 23 2010 - 17:49:31 EST


On Tue, 23 Feb 2010, KAMEZAWA Hiroyuki wrote:

> Ouch, I forgot to add memcontrol.h to quilt's refresh set.
> This is the updated one. Anyway, I'd like to wait for the next mmotm.
> We already have several changes.
>

I think it would be better to just remove mem_cgroup_out_of_memory() and
make it go through out_of_memory() by specifying a non-NULL pointer to a
struct mem_cgroup. We don't need the code that these two functions
duplicate, and unifying them lets us start handling panic_on_oom
consistently.

It would be much better to prefer killing current in pagefault oom
conditions, as the final patch in my oom killer rewrite does, if it is
killable. If not, we scan the tasklist and find another suitable
candidate. If current is bound to a memcg, we pass that to
select_bad_process() so that we only kill other tasks from the same
cgroup.
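
As a rough userspace sketch of that victim selection (not the code from
the rewrite series): struct task, its killable and badness fields, and
pagefault_oom_victim() are hypothetical stand-ins invented for
illustration, and select_bad_process() here is a toy with a simplified
signature, not the kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct mem_cgroup { int id; };

struct task {
	const char *comm;
	struct mem_cgroup *memcg;	/* NULL if not bound to a memcg */
	bool killable;			/* assumed flag: false for unkillable tasks */
	unsigned long badness;		/* toy score; higher is a better victim */
};

/*
 * Toy select_bad_process(): return the killable task with the highest
 * badness, restricted to @memcg when it is non-NULL.
 */
static struct task *select_bad_process(struct task *tasks, size_t n,
				       const struct mem_cgroup *memcg)
{
	struct task *chosen = NULL;

	for (size_t i = 0; i < n; i++) {
		struct task *t = &tasks[i];

		if (!t->killable)
			continue;
		if (memcg && t->memcg != memcg)
			continue;	/* only consider tasks in the same cgroup */
		if (!chosen || t->badness > chosen->badness)
			chosen = t;
	}
	return chosen;
}

/*
 * Pagefault oom: prefer killing current if it is killable; otherwise
 * scan the tasklist, limited to current's memcg if it has one.
 */
static struct task *pagefault_oom_victim(struct task *tasks, size_t n,
					 struct task *current)
{
	assert(current != NULL);

	if (current->killable)
		return current;
	return select_bad_process(tasks, n, current->memcg);
}
```

With an unkillable current bound to a memcg, the sketch only ever
returns a task from that same memcg; with an unkillable current outside
any memcg, it falls back to a scan of the whole list.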

This allows us to hijack the TIF_MEMDIE bit to detect a parallel
pagefault oom kill even when the oom killer hasn't been invoked to select
a system-wide victim (by default it simply kills current and gives it
access to memory reserves). Then we can change out_of_memory(), which now
also handles memcg oom conditions, to always scan the tasklist first
(including for mempolicy and cpuset constrained ooms), check for any
candidates that have TIF_MEMDIE, and return ERR_PTR(-1UL) if so. That
keeps parallel pagefault oom conditions from needlessly killing memcg
tasks. panic_on_oom would only panic after the tasklist scan has completed
and returned something other than ERR_PTR(-1UL), meaning pagefault ooms
are exempt from that sysctl.
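
To make the ordering concrete, here is a minimal userspace model of that
flow, not kernel code. Everything in it is an assumption made for
illustration: struct task and its memdie flag stand in for a task with
TIF_MEMDIE set, OOM_SCAN_ABORT models the ERR_PTR(-1UL) return value,
enum oom_result is invented, and the function signatures are simplified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct mem_cgroup { int id; };

struct task {
	struct mem_cgroup *memcg;	/* NULL if not bound to a memcg */
	bool memdie;			/* models TIF_MEMDIE */
	unsigned long badness;		/* toy score; higher is a better victim */
};

/* Sentinel modeled on ERR_PTR(-1UL): "a kill is already in progress". */
#define OOM_SCAN_ABORT ((struct task *)-1UL)

enum oom_result { OOM_NOTHING, OOM_SKIPPED, OOM_PANIC, OOM_KILLED };

/* Scan candidates (limited to @memcg if non-NULL); abort on TIF_MEMDIE. */
static struct task *select_bad_process(struct task *tasks, size_t n,
				       const struct mem_cgroup *memcg)
{
	struct task *chosen = NULL;

	for (size_t i = 0; i < n; i++) {
		struct task *t = &tasks[i];

		if (memcg && t->memcg != memcg)
			continue;
		if (t->memdie)
			return OOM_SCAN_ABORT;	/* parallel oom kill detected */
		if (!chosen || t->badness > chosen->badness)
			chosen = t;
	}
	return chosen;
}

/* Single entry point: @memcg is non-NULL for a memcg oom. */
static enum oom_result out_of_memory(struct task *tasks, size_t n,
				     const struct mem_cgroup *memcg,
				     bool panic_on_oom)
{
	struct task *victim;

	assert(tasks != NULL || n == 0);

	/* Always scan first, so the sentinel is seen before any panic. */
	victim = select_bad_process(tasks, n, memcg);
	if (victim == OOM_SCAN_ABORT)
		return OOM_SKIPPED;
	if (panic_on_oom)
		return OOM_PANIC;	/* the real kernel would panic() here */
	if (!victim)
		return OOM_NOTHING;

	victim->memdie = true;		/* grant access to memory reserves */
	return OOM_KILLED;
}
```

In this model a second oom in the same memcg is skipped while the first
victim still has the flag set, even with panic_on_oom enabled, because
the scan result is checked before the sysctl.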

Anyway, do you think it would be possible to rebase on mmotm with my oom
killer rewrite patches? They're at
http://www.kernel.org/pub/linux/kernel/people/rientjes/oom-killer-rewrite