Re: [PATCH v3] mm: migrate: Support multiple target nodes demotion

From: Baolin Wang
Date: Thu Nov 11 2021 - 22:09:55 EST

On 2021/11/12 11:02, Huang, Ying wrote:
Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:

On 2021/11/12 10:44, Huang, Ying wrote:
Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:

We have some machines with multiple memory types, like the one shown
below, which has one fast (DRAM) memory node and two slow (persistent
memory) memory nodes. According to the current node demotion policy, if
node 0 fills up, its memory is migrated to node 1; when node 1 fills up,
its memory is migrated to node 2: node 0 -> node 1 -> node 2 -> stop.

But this is not an efficient or suitable memory migration route
for our machine with multiple slow memory nodes. The distance from
node 0 to node 1 equals the distance from node 0 to node 2, and memory
migration between slow memory nodes greatly increases persistent memory
bandwidth usage, which hurts the whole system's performance.

Thus for this case, we can treat the slow memory node 1 and node 2
as a whole slow memory region, and we should migrate memory from
node 0 to node 1 and node 2 if node 0 fills up.

This patch changes the node_demotion data structure to support multiple
target nodes, and establishes the migration path to support multiple
target nodes, checking whether each candidate has the best node distance.

available: 3 nodes (0-2)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
node 0 size: 62153 MB
node 0 free: 55135 MB
node 1 cpus:
node 1 size: 127007 MB
node 1 free: 126930 MB
node 2 cpus:
node 2 size: 126968 MB
node 2 free: 126878 MB
node distances:
node 0 1 2
0: 10 20 20
1: 20 10 20
2: 20 20 10
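
The selection rule described above can be illustrated with a small user-space sketch. This is not the code from the patch: the names `find_demotion_targets`, `struct demotion_targets`, `node_has_cpu` and `NR_NODES` are hypothetical, the distance matrix is copied from the table above, and treating CPU-less nodes as terminal (they never demote further) is a simplifying assumption made only for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NR_NODES 3

/* Distance matrix copied from the numactl output above. */
static const int node_distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 20 },
	{ 20, 10, 20 },
	{ 20, 20, 10 },
};

/* Node 0 has CPUs (DRAM); nodes 1 and 2 are CPU-less (persistent memory). */
static const bool node_has_cpu[NR_NODES] = { true, false, false };

/* Hypothetical stand-in for a per-node list of demotion targets. */
struct demotion_targets {
	int nr;
	int nodes[NR_NODES];
};

/*
 * Collect every CPU-less node that shares the smallest distance from src.
 * Assumption for this sketch: CPU-less nodes get no targets of their own,
 * so there is no migration between slow memory nodes.
 */
static void find_demotion_targets(int src, struct demotion_targets *out)
{
	int best = INT_MAX;
	int n;

	assert(src >= 0 && src < NR_NODES);
	out->nr = 0;

	if (!node_has_cpu[src])
		return;

	for (n = 0; n < NR_NODES; n++) {
		int d;

		if (n == src || node_has_cpu[n])
			continue;

		d = node_distance[src][n];
		if (d < best) {
			/* Found a closer node: drop the targets found so far. */
			best = d;
			out->nr = 0;
		}
		if (d == best)
			out->nodes[out->nr++] = n;
	}
}
```

With this topology, calling find_demotion_targets(0, &t) fills t with nodes 1 and 2, since both sit at distance 20 from node 0, while calling it for node 1 or node 2 leaves the list empty.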

Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>

snip

/*
* 'next_pass' contains nodes which became migration
@@ -3192,6 +3281,14 @@ static int __init migrate_on_reclaim_init(void)
 {
 	int ret;
+	/*
+	 * Ignore allocation failure, if this kmalloc fails
+	 * at boot time, we are likely in bigger trouble.
+	 */
+	node_demotion = kmalloc_array(nr_node_ids,
+				      sizeof(struct demotion_nodes),
+				      GFP_KERNEL);
+
I think we should WARN_ON() here.

In this unlikely case, I think the mm core will print more information,
so IMHO WARN_ON() will help little. Anyway, no strong opinion on
this. Other than that, can I get your reviewed-by tag with this nit
fixed? Thanks.

Yes. Please add my "reviewed-by" after changing this.

OK. Thanks for your review.