Re: [PATCH v2 1/3] mm: memcontrol: fix swap counter leak on swapout from offline cgroup

From: Michal Hocko
Date: Wed Aug 03 2016 - 08:01:05 EST


On Wed 03-08-16 14:46:40, Vladimir Davydov wrote:
> On Wed, Aug 03, 2016 at 01:09:42PM +0200, Michal Hocko wrote:
> > On Wed 03-08-16 12:50:49, Vladimir Davydov wrote:
> > > On Tue, Aug 02, 2016 at 06:00:26PM +0200, Michal Hocko wrote:
> > > > On Tue 02-08-16 18:00:48, Vladimir Davydov wrote:
> > > ...
> > > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > > index 3be791afd372..4ae12effe347 100644
> > > > > --- a/mm/memcontrol.c
> > > > > +++ b/mm/memcontrol.c
> > > > > @@ -4036,6 +4036,24 @@ static void mem_cgroup_id_get(struct mem_cgroup *memcg)
> > > > > atomic_inc(&memcg->id.ref);
> > > > > }
> > > > >
> > > > > +static struct mem_cgroup *mem_cgroup_id_get_active(struct mem_cgroup *memcg)
> > > > > +{
> > > > > + while (!atomic_inc_not_zero(&memcg->id.ref)) {
> > > > > + /*
> > > > > + * The root cgroup cannot be destroyed, so it's refcount must
> > > > > + * always be >= 1.
> > > > > + */
> > > > > + if (memcg == root_mem_cgroup) {
> > > > > + VM_BUG_ON(1);
> > > > > + break;
> > > > > + }
> > > >
> > > > why not simply VM_BUG_ON(memcg == root_mem_cgroup)?
> > >
> > > Because with DEBUG_VM disabled, VM_BUG_ON() is a no-op, so we could
> > > end up looping forever here if the refcount of root_mem_cgroup got
> > > corrupted. On production kernels, it's better to break the loop and
> > > carry on, turning a blind eye to diverging counters, than to get a
> > > lockup.
> >
> > Wouldn't this just paper over a real bug? Anyway I will not insist but
> > making the code more complex just to pretend we can handle a situation
> > gracefully doesn't sound right to me.
>
> But we can handle this IMO. AFAICS diverging id refcount will typically
> result in leaking swap charges, which aren't even a real resource.

Fair enough.

> At
> worst, we can leak an offline mem_cgroup, which is also not critical
> enough to crash the production system.

Agreed.

> I see your concern about papering over a bug though. What about adding a
> warning there?

WARN_ON_ONCE sounds better...

> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 1c0aa59fd333..8c8e68becee9 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4044,7 +4044,7 @@ static struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg)
> * The root cgroup cannot be destroyed, so it's refcount must
> * always be >= 1.
> */
> - if (memcg == root_mem_cgroup) {
> + if (WARN_ON_ONCE(memcg == root_mem_cgroup)) {
> VM_BUG_ON(1);
> break;
> }
>
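
For readers following along, here is a minimal userspace sketch of the
pattern being discussed: try to take a reference only while the count is
nonzero, and stop at the root with a one-time warning instead of spinning.
It is an illustration, not the kernel code. The names struct node,
ref_inc_not_zero() and node_get_online() are hypothetical, C11 atomics
stand in for atomic_inc_not_zero(), a plain flag stands in for
WARN_ON_ONCE(), and the walk to the parent on failure is an assumption,
since the quoted hunk does not show that part of the loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a refcounted object in a hierarchy. */
struct node {
	atomic_int ref;
	struct node *parent;	/* NULL for the root */
};

/* Set once the "impossible" root case has been reported. */
static int warned;

/* Take a reference only if the count is currently nonzero. */
static bool ref_inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Return the closest ancestor (or n itself) on which a reference could
 * be taken. If even the root has a zero count, warn once and return the
 * root without a reference rather than looping forever.
 */
static struct node *node_get_online(struct node *n)
{
	assert(n != NULL);

	while (!ref_inc_not_zero(&n->ref)) {
		if (n->parent == NULL) {
			if (!warned) {
				warned = 1;
				fprintf(stderr, "root refcount is zero\n");
			}
			break;
		}
		n = n->parent;
	}
	return n;
}
```

The point of the explicit break is that the loop terminates even when the
debug-only assertion is compiled out. The caller then continues with
counters that may diverge, which the thread argues is preferable to a
lockup.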

--
Michal Hocko
SUSE Labs