Re: [RFC] Move the memory_notifier out of the memory_hotplug lock

From: David Rientjes
Date: Wed Feb 05 2014 - 18:20:24 EST


On Wed, 5 Feb 2014, Nathan Zimmer wrote:

> > That looks a little problematic, what happens if a nid is being brought
> > online and a registered callback does something like allocate resources
> > for the arg->status_change_nid and the above two hunks of this patch end
> > up racing?
> >
> > Before, a registered callback would be guaranteed to see either a
> > MEM_CANCEL_ONLINE or MEM_ONLINE after it has already done
> > MEM_GOING_ONLINE.
> >
> > With your patch, we could race and see one cpu doing MEM_GOING_ONLINE,
> > another cpu doing MEM_GOING_ONLINE, and then MEM_ONLINE and
> > MEM_CANCEL_ONLINE in either order.
> >
> > So I think this patch will break most registered callbacks that actually
> > depend on lock_memory_hotplug(); it's a coarse lock for that reason.
>
> Since the argument being passed in is the pfn and size it would be an issue
> only if two threads attempted to online the same piece of memory. Right?
>

No, I'm referring to registered callbacks that provide a resource for
arg->status_change_nid. An example would be the callbacks I added to the
slub allocator in slab_memory_callback(). With your patch we can now get
a racy MEM_GOING_ONLINE -> MEM_GOING_ONLINE -> MEM_ONLINE ->
MEM_CANCEL_ONLINE sequence. If the node is successfully onlined at the
end of that sequence, we get a NULL pointer dereference because the
kmem_cache_node for each slab cache has been freed by the
MEM_CANCEL_ONLINE callback.
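
To make the ordering problem concrete, here is a minimal userspace sketch
of that sequence. It is not the slub code: toy_callback(), struct
node_state, and replay_is_broken() are hypothetical names, the enum values
only stand in for the kernel's MEM_* actions, and the "online" flag is a
simplification of the node actually being onlined. It assumes a callback
that allocates per-node state on the going-online event and frees it on
the cancel event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's notifier actions. */
enum mem_action { GOING_ONLINE, ONLINE, CANCEL_ONLINE };

/* Toy per-node state: one per-node resource plus an online flag. */
struct node_state {
	int *cache_node;	/* stands in for a cache's kmem_cache_node */
	bool online;		/* simplification: node has been onlined */
};

/* Toy callback: allocate on GOING_ONLINE, free on CANCEL_ONLINE. */
static void toy_callback(struct node_state *n, enum mem_action action)
{
	switch (action) {
	case GOING_ONLINE:
		if (!n->cache_node)
			n->cache_node = calloc(1, sizeof(*n->cache_node));
		break;
	case ONLINE:
		n->online = true;
		break;
	case CANCEL_ONLINE:
		free(n->cache_node);
		n->cache_node = NULL;
		break;
	}
}

/*
 * Replay a sequence of events against a fresh node. Returns true if the
 * node ends up online with its per-node resource missing, i.e. the state
 * in which a later access would dereference NULL.
 */
static bool replay_is_broken(const enum mem_action *seq, size_t len)
{
	struct node_state n = { NULL, false };
	bool broken;
	size_t i;

	assert(seq != NULL || len == 0);
	for (i = 0; i < len; i++)
		toy_callback(&n, seq[i]);

	broken = n.online && n.cache_node == NULL;
	free(n.cache_node);
	return broken;
}
```

Replaying GOING_ONLINE, ONLINE leaves the toy node consistent, while
replaying GOING_ONLINE, GOING_ONLINE, ONLINE, CANCEL_ONLINE leaves it
online with a NULL resource. Holding one coarse lock across the whole
going-online/online-or-cancel transaction is what rules out the second
ordering.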

> That seems very unlikely but if it can happen it needs to be protected
> against.
>

The protection for registered memory online or offline callbacks is
lock_memory_hotplug(), which is eliminated with your patch. The locking
for memory_notify() that you're citing is irrelevant.