Re: [PATCH] mm/slub: fix a deadlock in show_slab_objects()

From: Qian Cai
Date: Thu Oct 03 2019 - 16:07:51 EST


On Thu, 2019-10-03 at 12:56 -0700, David Rientjes wrote:
> On Thu, 3 Oct 2019, Qian Cai wrote:
>
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 42c1b3af3c98..922cdcf5758a 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -4838,7 +4838,15 @@ static ssize_t show_slab_objects(struct kmem_cache *s,
> > }
> > }
> >
> > - get_online_mems();
> > +/*
> > + * It is not possible to take "mem_hotplug_lock" here, as "kernfs_mutex" is
> > + * already held, which would conflict with the lock order:
> > + *
> > + * mem_hotplug_lock->slab_mutex->kernfs_mutex
> > + *
> > + * In the worst case, the counts might be miscalculated during NUMA node
> > + * hotplug, but they will be corrected by later reads of the same files.
> > + */
> > #ifdef CONFIG_SLUB_DEBUG
> > if (flags & SO_ALL) {
> > struct kmem_cache_node *n;
>
> No objection to removing the {get,put}_online_mems() but the comment
> doesn't match the kernel style. I actually don't think we need the
> comment at all.

I am a bit worried that someone will later add the lock back once they figure
out that it could give more accurate statistics that way, but I agree the
comment is probably overkill.

>
> > @@ -4879,7 +4887,6 @@ static ssize_t show_slab_objects(struct kmem_cache *s,
> > x += sprintf(buf + x, " N%d=%lu",
> > node, nodes[node]);
> > #endif
> > - put_online_mems();
> > kfree(nodes);
> > return x + sprintf(buf + x, "\n");
> > }
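
For anyone tempted to add the lock back later, here is a minimal userspace
sketch of why the ordering matters. It assumes only the order quoted in the
comment above (mem_hotplug_lock->slab_mutex->kernfs_mutex). The enum, the
bitmask convention and both helper functions are hypothetical names made up
for illustration; this is not kernel code and not how lockdep is implemented.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical ranks, following the order quoted in the patch comment:
 * mem_hotplug_lock -> slab_mutex -> kernfs_mutex.
 */
enum lock_rank {
	RANK_MEM_HOTPLUG = 0,
	RANK_SLAB_MUTEX  = 1,
	RANK_KERNFS      = 2,
};

/*
 * Return true if taking 'want' while holding the locks in 'held_mask'
 * (bit n set means rank n is held) keeps the ranks strictly increasing.
 * Any held lock of equal or higher rank counts as an inversion.
 */
static inline bool lock_order_ok(unsigned int held_mask, enum lock_rank want)
{
	return (held_mask >> want) == 0;
}

/* Walk the two paths discussed in this thread. */
void lock_order_selftest(void)
{
	unsigned int held = 0;

	/* Path assumed to follow the documented order: each step is fine. */
	assert(lock_order_ok(held, RANK_MEM_HOTPLUG));
	held |= 1u << RANK_MEM_HOTPLUG;
	assert(lock_order_ok(held, RANK_SLAB_MUTEX));
	held |= 1u << RANK_SLAB_MUTEX;
	assert(lock_order_ok(held, RANK_KERNFS));

	/*
	 * Read path as described in the comment: kernfs_mutex is already
	 * held, so taking mem_hotplug_lock would invert the order.
	 */
	held = 1u << RANK_KERNFS;
	assert(!lock_order_ok(held, RANK_MEM_HOTPLUG));
}
```

Calling lock_order_selftest() from a small main() exercises both paths. The
last assertion is the case the removed get_online_mems() call would hit.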