Re: [patch] mm: memcg: close race between charge and putback

From: Johannes Weiner
Date: Thu Sep 08 2011 - 05:54:16 EST


On Thu, Sep 08, 2011 at 06:42:21PM +0900, KAMEZAWA Hiroyuki wrote:
> On Thu, 8 Sep 2011 11:33:16 +0200
> Johannes Weiner <jweiner@xxxxxxxxxx> wrote:
>
> > On Thu, Sep 08, 2011 at 06:19:01PM +0900, KAMEZAWA Hiroyuki wrote:
> > > On Thu, 8 Sep 2011 10:54:04 +0200
> > > Johannes Weiner <jweiner@xxxxxxxxxx> wrote:
> > >
> > > > On Thu, Sep 08, 2011 at 05:30:42PM +0900, KAMEZAWA Hiroyuki wrote:
> > > > > On Thu, 8 Sep 2011 09:40:22 +0200
> > > > > Johannes Weiner <jweiner@xxxxxxxxxx> wrote:
> > > > >
> > > > > > There is a potential race between a thread charging a page and another
> > > > > > thread putting it back to the LRU list:
> > > > > >
> > > > > > > charge:                          putback:
> > > > > > > SetPageCgroupUsed                SetPageLRU
> > > > > > > PageLRU && add to memcg LRU      PageCgroupUsed && add to memcg LRU
> > > > > >
> > > > >
> > > > > > I assumed that all pages are charged before being added to the LRU.
> > > > > > (i.e. events happen in charge->lru_lock->putback order.)
> > > > >
> > > > > > But hmm, this assumption may be bad for maintenance.
> > > > > > Did you find code which adds pages to the LRU before charge?
> > > > >
> > > > > > Hmm, if there is code which recharges the page to another memcg,
> > > > > > it will cause a bug and my assumption may be harmful.
> > > >
> > > > Swap slots are read optimistically into swapcache and put to the LRU,
> > > > then charged upon fault.
> > >
> > > Yes, then swap charge removes the page from the LRU before charge.
> > > IIUC, it needs to do so because page->mem_cgroup may be replaced.
> >
> > But only from the memcg LRU. It's still on the global per-zone LRU,
> > so reclaim could isolate/putback it during the charge. And then
> >
> > > > > > > charge:                          putback:
> > > > > > > SetPageCgroupUsed                SetPageLRU
> > > > > > > PageLRU && add to memcg LRU      PageCgroupUsed && add to memcg LRU
> >
> > applies.
>
> Hmm, in this case, I thought memcg puts the page back on its LRU by itself
> under lru_lock after charge and the race was hidden.

But it locklessly checks PageLRU and bails if it's cleared, and that is
the problem: it's not guaranteed that the charging CPU observes PageLRU
when the putback side bailed because it did not see PageCgroupUsed.

My barrier puts this in order and makes sure one of the two succeeds.
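
For illustration, the ordering in question is the store-then-load
pattern on both sides. Below is a minimal userspace sketch in C11, not
the actual patch: struct fake_page, charge_side() and putback_side() are
hypothetical names, the two atomic flags stand in for PageCgroupUsed and
PageLRU, and the seq_cst fence is assumed to play the role of a full
barrier such as smp_mb().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two page flags discussed above. */
struct fake_page {
	atomic_bool cgroup_used;	/* models PageCgroupUsed */
	atomic_bool lru;		/* models PageLRU */
};

/*
 * Charge side: publish "used", then look at "lru".
 * Returns true if this side saw the page on the LRU and would
 * therefore be the one to add it to the memcg LRU.
 */
static bool charge_side(struct fake_page *p)
{
	atomic_store_explicit(&p->cgroup_used, true, memory_order_relaxed);
	/* Full fence: the store above is ordered before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&p->lru, memory_order_relaxed);
}

/*
 * Putback side: publish "lru", then look at "used".
 * Returns true if this side saw the page as charged and would
 * therefore be the one to add it to the memcg LRU.
 */
static bool putback_side(struct fake_page *p)
{
	atomic_store_explicit(&p->lru, true, memory_order_relaxed);
	/* Matching full fence on the other side. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&p->cgroup_used, memory_order_relaxed);
}
```

With both fences in place, the two functions cannot both return false
when they run concurrently on the same page. Without them, each CPU may
load the other side's flag before its own store has become visible, so
both sides can bail.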