Re: [PATCH 1/3] mm/memory_hotplug: add missing mem_hotplug_lock

From: David Hildenbrand
Date: Mon Nov 20 2023 - 03:56:35 EST


On 20.11.23 08:23, Sumanth Korikkar wrote:
From Documentation/core-api/memory-hotplug.rst:
When adding/removing/onlining/offlining memory or adding/removing
heterogeneous/device memory, we should always hold the mem_hotplug_lock
in write mode to serialise memory hotplug (e.g. access to global/zone
variables).

mhp_(de)init_memmap_on_memory() functions can change zone stats and
struct page content, but they are currently called w/o the
mem_hotplug_lock.

When memory block is being offlined and when kmemleak goes through each
populated zone, the following theoretical race conditions could occur:
CPU 0: | CPU 1:
memory_offline() |
-> offline_pages() |
-> mem_hotplug_begin() |
... |
-> mem_hotplug_done() |
| kmemleak_scan()
| -> get_online_mems()
| ...
-> mhp_deinit_memmap_on_memory() |
[not protected by mem_hotplug_begin/done()]|
Marks memory section as offline, | Retrieves zone_start_pfn
poisons vmemmap struct pages and updates | and struct page members.
the zone related data |
| ...
| -> put_online_mems()

Fix this by ensuring mem_hotplug_lock is taken before performing
mhp_init_memmap_on_memory(). Also ensure that
mhp_deinit_memmap_on_memory() holds the lock.

online/offline_pages() are currently only called from
memory_block_online/offline(), so it is safe to move the locking there.

Fixes: a08a2ae34613 ("mm,memory_hotplug: allocate memmap from the added memory range")
Reviewed-by: Gerald Schaefer <gerald.schaefer@xxxxxxxxxxxxx>
Signed-off-by: Sumanth Korikkar <sumanthk@xxxxxxxxxxxxx>
---

Acked-by: David Hildenbrand <david@xxxxxxxxxx>

--
Cheers,

David / dhildenb