Re: [PATCH v3 7/7] mm, page_alloc: disable pcplists during memory offline

From: David Hildenbrand
Date: Wed Nov 11 2020 - 13:00:05 EST


On 11.11.20 10:28, Vlastimil Babka wrote:
Memory offlining relies on page isolation to guarantee a forward
progress because pages cannot be reused while they are isolated. But the
page isolation itself doesn't prevent from races while freed pages are
stored on pcp lists and thus can be reused. This can be worked around by
repeated draining of pcplists, as done by commit 968318261221
("mm/memory_hotplug: drain per-cpu pages again during memory offline").

David and Michal would prefer that this race was closed in a way that callers
of page isolation who need stronger guarantees don't need to repeatedly drain.
David suggested disabling pcplists usage completely during page isolation,
instead of repeatedly draining them.

To achieve this without adding special cases in alloc/free fastpath, we can use
the same approach as boot pagesets - when pcp->high is 0, any pcplist addition
will be immediately flushed.

The race can thus be closed by setting pcp->high to 0 and draining pcplists
once, before calling start_isolate_page_range(). The draining will serialize
after processes that already disabled interrupts and read the old value of
pcp->high in free_unref_page_commit(), and processes that have not yet disabled
interrupts, will observe pcp->high == 0 when they are rescheduled, and skip
pcplists. This guarantees no stray pages on pcplists in zones where isolation
happens.

This patch thus adds zone_pcp_disable() and zone_pcp_enable() functions that
page isolation users can call before start_isolate_page_range() and after
unisolating (or offlining) the isolated pages.

Also, drain_all_pages() is optimized to only execute on cpus where pcplists are
not empty. The check can however race with a free to pcplist that has not yet
increased the pcp->count from 0 to 1. Thus make the drain optionally skip the
racy check and drain on all cpus, and use this option in zone_pcp_disable().

^ there it is, can you move that comment into the code? People (e.g., me :) ) staring at that will be confused otherwise.

--
Thanks,

David / dhildenb