Re: [Patch 2/2]: powerpc/hotplug/mm: Fix hot-add memory node assoc

From: Michael Ellerman
Date: Tue Jun 06 2017 - 05:48:13 EST


Michael Bringmann <mwb@xxxxxxxxxxxxxxxxxx> writes:
> On 06/01/2017 04:36 AM, Michael Ellerman wrote:
>> Do you actually see mention of nodes 0 and 8 in the dmesg?
>
> When the 'numa.c' code is built with debug messages, and the system was
> given that configuration by pHyp, yes, I did.
>
>> What does it say?
>
> The debug message for each core thread would be something like,
>
> removing cpu 64 from node 0
> adding cpu 64 to node 8
>
> repeated for all 8 threads of the CPU, and usually with the messages
> for all of the CPUs coming out intermixed on the console/dmesg log.

OK. I meant what do you see at boot.

I'm curious how we're discovering node 0 and 8 at all if neither has any
memory or CPUs assigned at boot.

>> Right. So it's not that you're hot adding memory into a previously
>> unseen node as you implied in earlier mails.
>
> In the sense that the nodes were defined in the device tree, that is correct.

Where are they defined in the device tree? That's what I'm trying to understand.


> In the sense that those nodes are currently deleted from node_possible_map in
> 'numa.c' by the instruction 'node_and(node_possible_map,node_possible_map,
> node_online_map);', the nodes are no longer available to place memory or CPU.

Yeah I understand that part.

> Okay, I can try to insert code that extracts all of the nodes from the
> ibm,associativity-lookup-arrays property and merge them with the nodes
> put into the online map from the CPUs that were found previously during
> boot of the powerpc code.

Hmm, will that work?

Looking at PAPR it's not clear to me that it will work for nodes that
have no memory assigned at boot.

This property is used to duplicate the function of the
âibm,associativityâ property in a /memory node. Each âassignedâ LMB
represented has an index valued between 0 and M-1 which is used as in
index into this table to select which associativity list to use for
the LMB. âunassignedâ LMBs are place holders for potential DLPAR
additions, for which the associativity list index is meaningless and
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
is given the reserved value of -1. This static property, need only
contain values relevant for the LMBs presented in the
âibm,dynamicreconfiguration-memoryâ node; for a dynamic LPAR addition
of a new LMB, the device tree fragment reported by the
ibm,configure-connector RTAS function is a /memory node, with the
inclusion of the âibm,associativityâ device tree property defined in
Section C.6.2.2â âProperties of the Children of Rootââ on page 1059.

>> What does your device tree look like? Can you send us the output of:
>>
>> $ lsprop /proc/device-tree

Thanks. I forgot that lsprop will truncate long properties, I actually
wanted to see all of the ibm,dynamic-memory property.

But looking at the code I see the only place we set a nid online is if
there is a CPU assigned to it:

static int __init parse_numa_properties(void)
{
...
for_each_present_cpu(i) {
...
cpu = of_get_cpu_node(i, NULL);
nid = of_node_to_nid_single(cpu);
...
node_set_online(nid);
}

Or for memory nodes (same function):

for_each_node_by_type(memory, "memory") {
...
nid = of_node_to_nid_single(memory);
...
node_set_online(nid);
...
}

Or for entries in ibm,dynamic-memory that are assigned:

static void __init parse_drconf_memory(struct device_node *memory)
{
...
for (; n != 0; --n) {
...
/* skip this block if the reserved bit is set in flags (0x80)
or if the block is not assigned to this partition (0x8) */
if ((drmem.flags & DRCONF_MEM_RESERVED)
|| !(drmem.flags & DRCONF_MEM_ASSIGNED))
continue;

...
do {
...
nid = of_drconf_to_nid_single(&drmem, &aa);
node_set_online(nid);
...
} while (--ranges);
}
}


So I don't see from that how we can even be aware that node 0 and 8
exist at boot based on that. Maybe there's another path I'm missing
though.

cheers