Re: [PATCH] [RESEND] x86_64: add memory hotremove config option

From: Badari Pulavarty
Date: Fri Sep 05 2008 - 18:34:13 EST



On Fri, 2008-09-05 at 20:54 +0200, Andi Kleen wrote:
> > At this time we are interested on node remove (on x86_64).
> > It doesn't really work well at this time -
>
> That's a quite euphemistic way to put it.
>
> > due to some of the structures
>
> That means you can never put any slab data on specific nodes.
> And all the kernel subsystems on that node will not ever get local
> memory. How are you going to solve that? And if you disallow
> kernel allocations in so large memory areas you get many of the highmem
> issues that plagued 32bit back in the 64bit kernel.

You are absolutely correct. There is no easy solution - one has
to lose performance in order to support node removal, along with
some old x86 issues :(

We were contemplating the idea of limiting node removal to a
select few nodes as a compromise - but it didn't sound right :(

>
> There are lots of other issues. It's quite questionable if this
> whole exercise makes sense at all.

The same issues exist on ia64, and x86_64 won't be any worse off.
Gary was trying to enable the functionality so that we can at least
test offlining memory sections more easily (test page migration,
isolation code and hash out issues).

Another possible idea being considered (still a lot of unknowns)
is to make use of the offline memory section feature for power
management (*cough*).

Anyway, as you can see, this patch doesn't add any code - it just
enables the config option for x86_64 (in case you are worried about
code bloat).

> > (BTW, on ppc64 this works fine - since we are interested mostly in
> > removing *some* sections of memory to give it back to hypervisor -
> > not entire node removal).
>
> Ok for hypervisors you can do it reasonably easy on x86 too, but it's likely
> that some hypercall interface is better than going through
> sysfs.

A sysfs interface already exists to offline sections of memory (the
same interface as for online).
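
For illustration only, a minimal sketch of driving that interface from
user space. It assumes the usual layout where each memory section has a
/sys/devices/system/memory/memoryN/state file that accepts "online" or
"offline". The helper names and the "root" parameter are hypothetical
(the latter only makes the sketch testable without touching real sysfs).
Writing to the real files needs root, and an offline request can fail
if the section holds pages that cannot be migrated.

```python
import os

# Assumed default location of the memory section directories.
SYSFS_MEMORY = "/sys/devices/system/memory"


def section_state_path(section, root=SYSFS_MEMORY):
    """Return the path of the state file for memory section `section`."""
    return os.path.join(root, "memory%d" % section, "state")


def get_section_state(section, root=SYSFS_MEMORY):
    """Read the current state string of a memory section."""
    with open(section_state_path(section, root)) as f:
        return f.read().strip()


def set_section_state(section, state, root=SYSFS_MEMORY):
    """Request that a memory section go online or offline.

    On real sysfs the write raises OSError if the kernel refuses
    the request (for example, unmovable pages in the section).
    """
    if state not in ("online", "offline"):
        raise ValueError("state must be 'online' or 'offline'")
    with open(section_state_path(section, root), "w") as f:
        f.write(state)
```

For example, set_section_state(8, "offline") would ask the kernel to
offline section 8, and set_section_state(8, "online") would bring it
back through the same file.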

The proposed patch provides an easy way to find out which sections
of memory belong to which node (could be useful on its own).
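
As a rough sketch of how user space could consume such a mapping: the
code below assumes (this layout is an assumption, not something stated
in this thread) that each node directory under /sys/devices/system/node
contains one memoryM entry per section on that node. The function name
and the "root" parameter are hypothetical.

```python
import os
import re


def sections_by_node(root="/sys/devices/system/node"):
    """Map node number -> sorted list of memory section numbers.

    Assumes nodeN directories that each contain memoryM entries
    (files, directories or symlinks) for the sections on node N.
    """
    mapping = {}
    for entry in os.listdir(root):
        node = re.fullmatch(r"node(\d+)", entry)
        if node is None:
            continue  # skip non-node entries in the directory
        node_dir = os.path.join(root, entry)
        sections = []
        for name in os.listdir(node_dir):
            section = re.fullmatch(r"memory(\d+)", name)
            if section is not None:
                sections.append(int(section.group(1)))
        mapping[int(node.group(1))] = sorted(sections)
    return mapping
```

With a mapping like this, a test script could pick every section of one
node and try to offline them in turn.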

Thanks,
Badari
