Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

From: Tejun Heo
Date: Mon Aug 31 2015 - 11:48:05 EST


Hello,

On Mon, Aug 31, 2015 at 06:18:14PM +0300, Vladimir Davydov wrote:
> We have to be cautious about placing memcg_charge in slab/slub. To
> understand why, consider the SLAB case, which first tries to allocate
> from all nodes in order of preference w/o __GFP_WAIT and only if that
> fails falls back on an allocation from any node w/ __GFP_WAIT. This is
> its internal algorithm. If we blindly put memcg_charge into the
> alloc_slab method, then, when we are near the memcg limit, we will go
> over all NUMA nodes in vain, then finally fall back to a __GFP_WAIT
> allocation, which will get a slab from a random node. Not only do we
> do more work than necessary by walking over all NUMA nodes for
> nothing, but we also break SLAB's internal logic! And you just can't
> fix it in memcg, because memcg knows nothing about the internal logic
> of SLAB or how it handles NUMA nodes.
>
> SLUB has a different problem. It tries to avoid high-order allocations
> if there is a risk of invoking the costly memory compactor. It has
> nothing to do with memcg, because memcg does not care whether the
> charge is for a high-order page or not.
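
For illustration, the SLAB case quoted above can be modelled in a few
lines of user-space C. This is only a sketch under stated assumptions:
every toy_* name is hypothetical, the charge is per page, and the
fallback pass takes the first node with free memory rather than a truly
arbitrary one. It is not the kernel's SLAB or memcg code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_NODES 4

/* Hypothetical toy state; none of this mirrors real kernel structures. */
struct toy_state {
    size_t node_free[TOY_NR_NODES]; /* free pages per NUMA node */
    size_t memcg_usage;             /* pages charged to the memcg */
    size_t memcg_max;               /* memcg hard limit, in pages */
    size_t memcg_reclaimable;       /* pages a waiting charge may reclaim */
    unsigned int nowait_attempts;   /* per-node tries made without waiting */
};

/* Charge one page; only a waiting charge may reclaim to make room. */
static bool toy_charge(struct toy_state *s, bool can_wait)
{
    assert(s->memcg_max > 0);

    if (s->memcg_usage >= s->memcg_max) {
        if (!can_wait || s->memcg_reclaimable == 0)
            return false;
        s->memcg_reclaimable--;
        s->memcg_usage--;
    }
    s->memcg_usage++;
    return true;
}

/* The charge placed naively inside the per-node slab allocation. */
static bool toy_alloc_slab(struct toy_state *s, int node, bool can_wait)
{
    if (s->node_free[node] == 0)
        return false;
    if (!toy_charge(s, can_wait))
        return false;
    s->node_free[node]--;
    return true;
}

/* Returns the node the slab came from, or -1 if nothing could be had. */
static int toy_slab_alloc(struct toy_state *s, const int *pref)
{
    /* Pass 1: walk the nodes in order of preference, never waiting. */
    for (int i = 0; i < TOY_NR_NODES; i++) {
        s->nowait_attempts++;
        if (toy_alloc_slab(s, pref[i], false))
            return pref[i];
    }

    /* Pass 2: fall back to any node, waiting (reclaim) allowed. */
    for (int node = 0; node < TOY_NR_NODES; node++)
        if (toy_alloc_slab(s, node, true))
            return node;

    return -1;
}
```

With the memcg at its limit, every pass-1 attempt fails at the charge
even though the preferred node has free pages, and the slab ends up
coming from whichever node the fallback pass hits first.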

Maybe I'm missing something but aren't both issues caused by memcg
failing to provide headroom for NOWAIT allocations when the
consumption gets close to the max limit? Regardless of the specific
usage, !__GFP_WAIT means "give me memory if it can be spared w/o
inducing direct time-consuming maintenance work" and the contract
around it is that such requests will mostly succeed under nominal
conditions. Also, slab/slub might not stay as the only user of
try_charge(). I still think solving this from the memcg side is the
right direction.
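
A rough user-space sketch of what "headroom for NOWAIT allocations"
could look like. Everything here is an assumption made for
illustration: the toy_* names, the headroom field, and the policy that
waiting charges reclaim down to (max - headroom) while non-waiting
charges are only checked against max. It is not the actual
try_charge() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg counter; not the kernel's struct. */
struct toy_memcg {
    size_t usage;       /* pages currently charged */
    size_t max;         /* hard limit, in pages */
    size_t headroom;    /* pages kept spare for non-waiting charges */
    size_t reclaimable; /* pages the toy reclaimer is able to free */
};

/* Toy reclaim: free up to 'want' pages and return how many were freed. */
static size_t toy_reclaim(struct toy_memcg *cg, size_t want)
{
    size_t freed = want < cg->reclaimable ? want : cg->reclaimable;

    if (freed > cg->usage)
        freed = cg->usage;
    cg->reclaimable -= freed;
    cg->usage -= freed;
    return freed;
}

/*
 * Waiting charges do the maintenance work: they reclaim until the
 * charge fits under (max - headroom), as far as reclaim allows.
 * Non-waiting charges never reclaim and may dip into the headroom.
 */
static bool toy_try_charge(struct toy_memcg *cg, size_t nr_pages,
                           bool can_wait)
{
    assert(nr_pages > 0);
    assert(cg->headroom <= cg->max);

    if (can_wait) {
        size_t soft = cg->max - cg->headroom;

        if (cg->usage + nr_pages > soft)
            toy_reclaim(cg, cg->usage + nr_pages - soft);
    }

    if (cg->usage + nr_pages > cg->max)
        return false;

    cg->usage += nr_pages;
    return true;
}
```

With headroom set to 0 and usage at max, a non-waiting charge of even
one page fails, which is the situation described above. With a
non-zero headroom, waiting charges keep usage away from max, so small
non-waiting charges mostly succeed until the headroom itself is used up.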

Thanks.

--
tejun