[PATCH] memory-hotplug: Clear pgdat which is allocated by bootmem in try_offline_node()

From: Yasuaki Ishimatsu
Date: Mon Oct 20 2014 - 06:06:53 EST


When hot adding the same memory after hot removing it,
the following warning is shown:

WARNING: CPU: 20 PID: 6 at mm/page_alloc.c:4968 free_area_init_node+0x3fe/0x426()
...
Call Trace:
[<...>] dump_stack+0x46/0x58
[<...>] warn_slowpath_common+0x81/0xa0
[<...>] warn_slowpath_null+0x1a/0x20
[<...>] free_area_init_node+0x3fe/0x426
[<...>] ? up+0x32/0x50
[<...>] hotadd_new_pgdat+0x90/0x110
[<...>] add_memory+0xd4/0x200
[<...>] acpi_memory_device_add+0x1aa/0x289
[<...>] acpi_bus_attach+0xfd/0x204
[<...>] ? device_register+0x1e/0x30
[<...>] acpi_bus_attach+0x178/0x204
[<...>] acpi_bus_scan+0x6a/0x90
[<...>] ? acpi_bus_get_status+0x2d/0x5f
[<...>] acpi_device_hotplug+0xe8/0x418
[<...>] acpi_hotplug_work_fn+0x1f/0x2b
[<...>] process_one_work+0x14e/0x3f0
[<...>] worker_thread+0x11b/0x510
[<...>] ? rescuer_thread+0x350/0x350
[<...>] kthread+0xe1/0x100
[<...>] ? kthread_create_on_node+0x1b0/0x1b0
[<...>] ret_from_fork+0x7c/0xb0
[<...>] ? kthread_create_on_node+0x1b0/0x1b0

The detailed explanation is as follows:

When hot removing memory, pgdat is set to 0 in try_offline_node().
But if the pgdat was allocated by the bootmem allocator, the clearing
step is skipped. When the same memory is hot added again, the stale
(uncleared) pgdat is reused. free_area_init_node() checks whether pgdat
is set to zero, so it hits WARN_ON().

This patch makes try_offline_node() also clear a pgdat that was
allocated by the bootmem allocator.
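
For illustration only, the sequence can be modelled in userspace as
below. This is a rough sketch, not kernel code: struct pgdat_model,
model_offline_node() and model_hotadd_warns() are hypothetical names,
and the two fields merely stand in for whatever state
free_area_init_node() expects to be zero.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for the node data (pgdat). */
struct pgdat_model {
	int nr_zones;		/* assumed field, illustration only */
	int classzone_idx;	/* assumed field, illustration only */
};

/*
 * Models the tail of the offline path. With patched == false, a
 * bootmem-allocated pgdat returns early and is never cleared. With
 * patched == true, it only skips the freeing step and still reaches
 * the clearing step (the "goto out" behaviour).
 */
void model_offline_node(struct pgdat_model *pgdat, bool from_bootmem,
			bool patched)
{
	if (from_bootmem && !patched)
		return;	/* old behaviour: clearing below is skipped */

	/* Freeing of per-zone resources would go here (non-bootmem only). */

	memset(pgdat, 0, sizeof(*pgdat));
}

/*
 * Models the sanity check on hot add: returns true when the reused
 * pgdat is not all-zero, i.e. when the warning would fire.
 */
bool model_hotadd_warns(const struct pgdat_model *pgdat)
{
	return pgdat->nr_zones != 0 || pgdat->classzone_idx != 0;
}
```

In this model, offlining a bootmem-allocated node with patched == false
leaves the old values in place, so model_hotadd_warns() returns true on
the next hot add. With patched == true the structure is zeroed and the
check passes.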

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@xxxxxxxxxxxxxx>
CC: Zhang Zhen <zhenzhang.zhang@xxxxxxxxxx>
CC: Wang Nan <wangnan0@xxxxxxxxxx>
CC: Tang Chen <tangchen@xxxxxxxxxxxxxx>
CC: Toshi Kani <toshi.kani@xxxxxx>
CC: Dave Hansen <dave.hansen@xxxxxxxxx>
CC: David Rientjes <rientjes@xxxxxxxxxx>

---
mm/memory_hotplug.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 29d8693..7649f7c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1943,7 +1943,7 @@ void try_offline_node(int nid)

 	if (!PageSlab(pgdat_page) && !PageCompound(pgdat_page))
 		/* node data is allocated from boot memory */
-		return;
+		goto out;
 
 	/* free waittable in each zone */
 	for (i = 0; i < MAX_NR_ZONES; i++) {
@@ -1957,6 +1957,7 @@ void try_offline_node(int nid)
 			vfree(zone->wait_table);
 	}
 
+out:
 	/*
 	 * Since there is no way to guarentee the address of pgdat/zone is not
 	 * on stack of any kernel threads or used by other kernel objects
--
1.8.3.1

