Re: [RFC PATCH v1 4/4] mm, memory_hotplug: fix inconsistent num_poisoned_pages on memory hotremove

From: Miaohe Lin
Date: Thu Apr 28 2022 - 03:16:15 EST


On 2022/4/28 12:05, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Thu, Apr 28, 2022 at 11:20:16AM +0800, Miaohe Lin wrote:
>> On 2022/4/27 12:28, Naoya Horiguchi wrote:
>>> From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
>>>
>>> When offlining a memory section with hwpoisoned pages, the hwpoisons are
>>> canceled. But num_poisoned_pages is not updated for that event, so the
>>> counter becomes inconsistent.
>>
>> IIUC, this work is already done via clear_hwpoisoned_pages() when
>> __remove_pages() is called. Or am I missing something?
>
> Actually I had the same question when writing this patch, and found that
> __remove_pages() seems to be called from device memory or HMM, but not from

It seems remove_memory() (which calls __remove_pages()) will be called via the .detach callback of
memory_device_handler in drivers/acpi/acpi_memhotplug.c. So will the hwpoison info also be
cleared for that memory?
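
To make the accounting question concrete, here is a minimal userspace sketch of
the two behaviours being compared: a removal path that clears the flag and
decrements the counter together, and an offline path that drops the flags
without touching the counter. Everything prefixed with model_ is a hypothetical
stand-in invented for illustration; it is not the kernel's struct page,
num_poisoned_pages or clear_hwpoisoned_pages(), and it ignores locking and
atomics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: hypothetical stand-in for a page with a poison flag. */
struct model_page {
    bool hwpoison;
};

/* Hypothetical stand-in for the global poisoned-page counter. */
static long model_num_poisoned_pages;

/* Mark a page poisoned and count it exactly once. */
static void model_poison_page(struct model_page *page)
{
    if (!page->hwpoison) {
        page->hwpoison = true;
        model_num_poisoned_pages++;
    }
}

/*
 * Removal-style path: clear the flag on every poisoned page in the range
 * and decrement the counter for each one. Returns how many were cleared.
 */
static size_t model_clear_hwpoisoned_pages(struct model_page *memmap,
                                           size_t nr_pages)
{
    size_t cleared = 0;

    /* Quick exit when nothing is poisoned anywhere. */
    if (model_num_poisoned_pages == 0)
        return 0;

    for (size_t i = 0; i < nr_pages; i++) {
        if (memmap[i].hwpoison) {
            assert(model_num_poisoned_pages > 0);
            memmap[i].hwpoison = false;
            model_num_poisoned_pages--;
            cleared++;
        }
    }
    return cleared;
}

/*
 * Offline-style path as described in the patch: the flags go away but the
 * counter is left alone, so the counter no longer matches the flags.
 */
static void model_offline_without_accounting(struct model_page *memmap,
                                             size_t nr_pages)
{
    for (size_t i = 0; i < nr_pages; i++)
        memmap[i].hwpoison = false;
}
```

With two poisoned pages in a range, the first helper leaves the counter at
zero, while the second leaves it at two even though no page is flagged any
more. That gap is the inconsistency under discussion.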

> offline_pages(). If you mean that we could make offline_pages() call
> clear_hwpoisoned_pages(), that seems possible and I'll consider it.
>
> But as David and Oscar pointed out for 0/4, removing PageHWPoison flags
> in offlining does not seem to be the right thing, so I'd like to have some
> consensus on which way to go first.

Agree. We should have some consensus first.

Thanks!

>
> Thanks,
> Naoya Horiguchi
>