Re: [RFC PATCH] mm, memory_hotplug: support movable_node for hotplugable nodes

From: Vlastimil Babka
Date: Thu Jun 01 2017 - 10:12:10 EST


On 06/01/2017 02:20 PM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@xxxxxxxx>
>
> The movable_node kernel parameter makes hotpluggable NUMA nodes put
> all of their hotpluggable memory into the movable zone, which allows
> more or less reliable memory hotremove. At least this is the case
> for the NUMA nodes present during boot (see
> find_zone_movable_pfns_for_nodes).
>
> This is not the case for runtime memory hotplug, though.
>
> echo online > /sys/devices/system/memory/memoryXYZ/state
>
> will default to a kernel zone (usually ZONE_NORMAL) unless the
> particular memblock is already in the movable zone range, which is
> normally not the case when onlining the memory from the udev rule
> context for a freshly hotadded NUMA node. The only option currently is
> to have a special udev rule that echoes online_movable to all memblocks
> belonging to such a node, which is rather clumsy. Not to mention that
> this is inconsistent as well, because what ended up in the movable zone
> during boot will end up in a kernel zone after hotremove & hotadd
> without special care.

Yeah, it would be better if movable_node worked consistently for both
boot and runtime hotplug.

> It would be nice to reuse memblock_is_hotpluggable but the runtime
> hotplug doesn't have that information available because the boot and
> hotplug paths are not shared and it would be really non-trivial to
> make them use the same code path because the runtime hotplug doesn't
> play with the memblock allocator at all.
>
> Teach move_pfn_range that MMOP_ONLINE_KEEP can use the movable zone if
> movable_node is enabled and the range doesn't overlap with the existing
> normal zone. This should provide a reasonable default onlining strategy.
>
> Strictly speaking, the semantics are not identical to the boot time
> initialization because find_zone_movable_pfns_for_nodes covers only the
> hotpluggable range as described by the BIOS/FW. From my experience this
> is usually a full node though (except for Node0 which is special and
> never goes away completely). If this turns out to be a problem in
> real life we can tweak the code to store the hotplug flag in memblocks,
> but let's keep this simple for now.
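
For reference, the default described above (MMOP_ONLINE_KEEP picking the
movable zone when movable_node is enabled and the range doesn't overlap
the existing normal zone) can be sketched roughly as below. This is a
standalone userspace illustration, not the code from the patch: the
names zone_choice, pfn_range, ranges_overlap and default_online_zone are
hypothetical, and real zones are not plain [start, end) pairs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only, not kernel definitions. */
enum zone_choice { CHOICE_NORMAL, CHOICE_MOVABLE };

struct pfn_range {
	unsigned long start_pfn;	/* inclusive */
	unsigned long end_pfn;		/* exclusive */
};

/* Two half-open ranges overlap iff each starts before the other ends. */
static bool ranges_overlap(struct pfn_range a, struct pfn_range b)
{
	/* An empty range (e.g. a node with no normal zone) overlaps nothing. */
	if (a.start_pfn >= a.end_pfn || b.start_pfn >= b.end_pfn)
		return false;
	return a.start_pfn < b.end_pfn && b.start_pfn < a.end_pfn;
}

/*
 * Assumed default for a plain "online" request: use the movable zone only
 * when movable_node is enabled and the onlined range stays clear of the
 * node's existing normal zone; otherwise fall back to a kernel zone.
 */
static enum zone_choice default_online_zone(bool movable_node_enabled,
					    struct pfn_range normal_zone,
					    struct pfn_range onlined)
{
	/* The caller is expected to pass a non-empty range to online. */
	assert(onlined.start_pfn < onlined.end_pfn);

	if (movable_node_enabled && !ranges_overlap(normal_zone, onlined))
		return CHOICE_MOVABLE;
	return CHOICE_NORMAL;
}
```

With a normal zone of [0, 1000), a freshly hotadded range [2000, 3000)
would come out movable under this sketch, while a range inside
[0, 1000) would stay in the kernel zone. If the normal zone is empty,
anything onlined on that node would come out movable.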

Simple should work, hopefully.
- if memory is hotplugged, it's obviously hotpluggable, so we don't have
to rely on the BIOS description.
- there shouldn't be a reason to offline a non-removable (part of) node
and online it back (which would move it from Normal to Movable after
your patch?), right?

Vlastimil