Re: [7/8,v3] NUMA Hotplug Emulator: extend memory probe interface to support NUMA

From: Dave Hansen
Date: Wed Nov 17 2010 - 18:00:41 EST


On Wed, 2010-11-17 at 14:44 -0800, David Rientjes wrote:
> > That would work, in theory. But, in practice, we allocate the mem_map[]
> > at probe time. So, we've already effectively picked a node at probe.
> > That was done because the probe is equivalent to the hardware "add"
> > event. Once the hardware knows where in the address space the memory is, it
> > always also knows the node.
> >
> > But, I guess it also wouldn't be horrible if we just hot-removed and
> > hot-added an offline section if someone did write to a node file like
> > you're suggesting. It might actually exercise some interesting code
> > paths.
>
> Since the pages are offline you should be able to modify the memmap when
> the 'node' file is written and use populate_memnodemap() since that file
> is only writeable in an offline state.

It's not just the mem_map[], though. When a section is sitting
"offline", it's pretty much all ready to go, except that its pages
aren't in the allocators. But, all of the other mm structures have
already been modified to make room for the pages. Zones have been added
or modified, pgdats resized, 'struct page's initialized.

Changing the node implies changing _all_ of those, which requires
unrolling most of what happened when the "echo $foo > probe" operation
happened in the first place.
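
To make the bookkeeping concrete, here is a toy userspace model of it. It
is only a sketch, not kernel code: Node, Section, probe(), online(),
remove() and change_node() are hypothetical names invented for
illustration, and PAGES_PER_SECTION is an arbitrary small number. It
assumes a write to the proposed 'node' file would be implemented as a
hot-remove followed by a hot-add on the new node.

```python
from dataclasses import dataclass, field

PAGES_PER_SECTION = 4  # arbitrary toy value, not the kernel's


@dataclass
class Node:
    """Toy stand-in for a pgdat and its zones."""
    nid: int
    spanned_pages: int = 0  # grows at probe time
    free_pages: int = 0     # grows only when a section goes online


@dataclass
class Section:
    """Toy stand-in for one memory section."""
    start_pfn: int
    nid: int = -1
    mem_map: list = field(default_factory=list)  # per-page node ids
    state: str = "absent"   # absent -> offline -> online


def probe(section, node):
    """Model of 'echo $foo > probe': all setup except freeing the pages."""
    if section.state != "absent":
        raise ValueError("section already present")
    section.nid = node.nid
    section.mem_map = [node.nid] * PAGES_PER_SECTION  # 'struct page's set up
    node.spanned_pages += PAGES_PER_SECTION           # zone/pgdat resized
    section.state = "offline"


def online(section, node):
    """Hand the section's pages to the allocator."""
    if section.state != "offline" or section.nid != node.nid:
        raise ValueError("section is not offline on this node")
    node.free_pages += PAGES_PER_SECTION
    section.state = "online"


def remove(section, node):
    """Unroll everything probe() did for an offline section."""
    if section.state != "offline" or section.nid != node.nid:
        raise ValueError("section is not offline on this node")
    node.spanned_pages -= PAGES_PER_SECTION
    section.mem_map = []
    section.nid = -1
    section.state = "absent"


def change_node(section, old, new):
    """Hypothetical 'node' file write: hot-remove, then hot-add elsewhere."""
    remove(section, old)
    probe(section, new)
```

Even in this toy, change_node() has to touch both nodes and rebuild the
mem_map; rewriting section.nid alone would leave the old node's span and
the per-page node ids stale.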

This is all _doable_, but it's not trivial.

-- Dave
