Re: [RFC][PATCH 1/6] memcg: free all at rmdir

From: Andrew Morton
Date: Wed Nov 12 2008 - 20:05:04 EST


On Thu, 13 Nov 2008 06:18:56 +0530
Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:

> Andrew Morton wrote:
> > On Thu, 13 Nov 2008 05:53:49 +0530
> > Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:
> >
> >> Andrew Morton wrote:
> >>> On Wed, 12 Nov 2008 12:26:56 +0900
> >>> KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> >>>
> >>>> +5.1 on_rmdir
> >>>> +set behavior of memcg at rmdir (Removing cgroup) default is "drop".
> >>>> +
> >>>> +5.1.1 drop
> >>>> + #echo on_rmdir drop > memory.attribute
> >>>> + This is default. All pages on the memcg will be freed.
> >>>> + If pages are locked or too busy, they will be moved up to the parent.
> >>>> + Useful when you want to drop (large) page caches used in this memcg.
> >>>> + But some of in-use page cache can be dropped by this.
> >>>> +
> >>>> +5.1.2 keep
> >>>> + #echo on_rmdir keep > memory.attribute
> >>>> + All pages on the memcg will be moved to its parent.
> >>>> + Useful when you don't want to drop page caches used in this memcg.
> >>>> + You can keep page caches from some library or DB accessed by this
> >>>> + memcg on memory.
> >>> Would it not be more useful to implement a per-memcg version of
> >>> /proc/sys/vm/drop_caches? (One without drop_caches' locking bug,
> >>> hopefully).
> >>>
> >>> If we do this then we can make the above "keep" behaviour non-optional,
> >>> and the operator gets to choose whether or not to drop the caches
> >>> before doing the rmdir.
> >>>
> >>> Plus, we get a new per-memcg drop_caches capability. And it's a nicer
> >>> interface, and it doesn't have the obvious races which on_rmdir has,
> >>> etc.
> >>>
> >> Andrew, I suspect that will not be easy, since we don't track address spaces
> >> that belong to a particular memcg. If page cache ends up being shared across
> >> memcg's, dropping them would impact both mem cgroups.
> >>
> >
> > walk the LRUs?
>
> We do that for the force_empty() interface we have. Although we don't
> differentiate between cache and RSS at the moment.

so.. what's wrong with using that (possibly with some
generalisation/enhancement)?

btw, mem_cgroup_force_empty_list() uses PageLRU() outside ->lru_lock.
That's racy, although afaict this race will only cause an accounting
error.

Or maybe not. What happens if
__mem_cgroup_uncharge_common()->__mem_cgroup_remove_list() is passed a
page which isn't on an LRU any more? boom?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/