Re: [PATCH] cpuset: mm: Remove memory barrier damage from the page allocator

From: David Rientjes
Date: Fri Mar 02 2012 - 18:47:19 EST


On Fri, 2 Mar 2012, Peter Zijlstra wrote:

> Also, for the write side it doesn't really matter, changing mems_allowed
> should be rare and is an 'expensive' operation anyway.
>

It's very expensive even without memory barriers. The page allocator
wraps itself in {get,put}_mems_allowed() until a page or NULL is returned,
so an update to current's set of allowed mems can stall indefinitely
trying to change the nodemask during that time. The thread changing
cpuset.mems holds cgroup_mutex the entire time, which locks out other
changes, including adding nodes to current's set of allowed mems. If
direct reclaim takes a long time or an oom killed task fails to exit
quickly (or the allocation is __GFP_NOFAIL and we just spin indefinitely
inside get_mems_allowed()), then it's not uncommon to see a write to
cpuset.mems take minutes while holding the mutex, if it ever returns at
all.
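
To make the stall concrete, here is a rough single-threaded userspace
model of the interaction described above. It is only a sketch, not kernel
code: struct task_model, the model_* helpers and the "one writer poll per
allocator step" loop are all hypothetical names and simplifications, and
it assumes the reader side is nothing more than a per-task nesting
counter that the writer has to see at zero before it may change the
nodemask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task; NOT the kernel's task_struct. */
struct task_model {
	int change_disable;         /* nesting depth of get/put sections */
	unsigned long mems_allowed; /* bitmask of allowed memory nodes */
};

/* Reader side: the allocator marks that it depends on mems_allowed. */
static void model_get_mems_allowed(struct task_model *t)
{
	t->change_disable++;
}

static void model_put_mems_allowed(struct task_model *t)
{
	assert(t->change_disable > 0);
	t->change_disable--;
}

/*
 * Writer side: one attempt to install a new nodemask. Returns false,
 * leaving the mask untouched, while the task is inside a get/put
 * section. A real writer would have to retry, and in the scenario
 * above it would do so with the mutex still held.
 */
static bool model_try_change_mems(struct task_model *t, unsigned long newmems)
{
	if (t->change_disable > 0)
		return false;
	t->mems_allowed = newmems;
	return true;
}

/*
 * Simulate an allocation that takes alloc_steps steps (reclaim, waiting
 * on an oom killed task, ...) with the writer polling once per step.
 * Returns the number of failed writer attempts, or -1 if the update
 * still fails after the allocator has left its section.
 */
static int model_update_retries(struct task_model *t, unsigned long newmems,
				int alloc_steps)
{
	int retries = 0;
	int step;

	model_get_mems_allowed(t);      /* allocation starts */
	for (step = 0; step < alloc_steps; step++) {
		if (!model_try_change_mems(t, newmems))
			retries++;      /* writer is stuck for this step */
	}
	model_put_mems_allowed(t);      /* page or NULL was returned */

	if (!model_try_change_mems(t, newmems))
		return -1;
	return retries;
}
```

In this model the number of failed writer attempts grows linearly with
how long the allocation stays inside its get/put section, and an
allocation that never leaves it (the __GFP_NOFAIL case) means the update
never completes.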