Re: [PATCH 1/7] mm: memcontrol: charge swap to cgroup2

From: Kamezawa Hiroyuki
Date: Tue Dec 15 2015 - 22:19:16 EST


On 2015/12/15 23:50, Johannes Weiner wrote:
On Tue, Dec 15, 2015 at 12:22:41PM +0900, Kamezawa Hiroyuki wrote:
On 2015/12/15 4:42, Vladimir Davydov wrote:
Anyway, if you don't trust a container you'd better set the hard memory
limit so that it can't hurt others no matter what it runs and how it
tweaks its sub-tree knobs.

Limiting swap can easily lead to the OOM killer firing even while swap is
still available, just through a simple configuration mistake. Can't you add a
"swap excess" switch to sysctl so that global memory reclaim can ignore the
swap limit?

That never worked with a combined memory+swap limit, either. How could
it? The parent might swap you out under pressure, but simply touching
a few of your anon pages causes them to get swapped back in, thrashing
with whatever the parent was trying to do. Your ability to swap it out
is simply no protection against a group touching its pages.

Allowing the parent to exceed swap with separate counters makes even
less sense, because every page swapped out frees up a page of memory
that the child can reuse. For every swap page that exceeds the limit,
the child gets a free memory page! The child doesn't even have to
cause swapin, it can just steal whatever the parent tried to free up,
and meanwhile its combined memory & swap footprint explodes.
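
To make the arithmetic above concrete, here is a minimal toy model in Python.
It is not kernel code: the function name, the page counts, and the
one-page-per-round reclaim loop are all assumptions made for illustration. It
only shows that, with separate counters, letting the parent push a child past
its swap limit grows the child's combined footprint by one page per swapped
page, because the child refills the freed memory.

```python
def simulate(mem_limit, swap_limit, rounds, enforce_swap_limit):
    """Toy model of a child cgroup with separate memory and swap counters.

    All quantities are in pages. Returns (mem, swap, combined footprint).
    """
    mem = mem_limit  # the child starts with its memory allowance fully used
    swap = 0
    for _ in range(rounds):
        # Parent-driven reclaim wants to swap out one page of the child.
        if enforce_swap_limit and swap >= swap_limit:
            break  # the child's share of swap is used up, nothing more to swap
        mem -= 1
        swap += 1
        # The child immediately reuses the page that reclaim just freed.
        if mem < mem_limit:
            mem += 1
    return mem, swap, mem + swap


if __name__ == "__main__":
    print("limit enforced:", simulate(100, 20, 50, enforce_swap_limit=True))
    print("limit ignored: ", simulate(100, 20, 50, enforce_swap_limit=False))
```

With the limit enforced, the combined footprint stops at mem_limit +
swap_limit. With the limit ignored, it keeps growing for as long as reclaim
keeps swapping.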

Sure.

The answer is and always should have been: don't overcommit untrusted
cgroups. Think of swap as a resource you distribute, not as breathing
room for the parents to rely on. Because it can't and could never.
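
One way to read "swap as a resource you distribute" is as a simple budget
check, sketched below in Python. The helper name and the dictionary of
per-group limits are hypothetical; nothing like this is part of the patch, and
a real setup would read the limits from the cgroup hierarchy.

```python
def swap_overcommit(total_swap_pages, child_swap_limits):
    """Return how many pages the children's swap limits exceed the total by.

    child_swap_limits maps a (hypothetical) group name to its swap limit in
    pages. A result of 0 means the swap space is not overcommitted.
    """
    promised = sum(child_swap_limits.values())
    return max(0, promised - total_swap_pages)


if __name__ == "__main__":
    limits = {"trusted": 4096, "untrusted-a": 2048, "untrusted-b": 2048}
    print(swap_overcommit(8192, limits))  # limits fit within the total
    print(swap_overcommit(6144, limits))  # limits promise more than exists
```

If the result is non-zero, some group can be denied swap it was promised,
which is the overcommit case described above.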

OK, don't overcommit.

And the new separate swap counter makes this explicit.

Hmm, my requests are:
- require the same capabilities as mlock() for setting swap.limit=0
- swap-full notification via vmpressure or some other mechanism
- the OOM killer's available-memory calculation may be corrupted, please check
- force swap-in when swap.limit is reduced

Thanks,
-Kame

