Re: [PATCH 1/1] mm: compaction: avoid fast_isolate_around() to set pageblock_skip on reserved pages

From: Andrea Arcangeli
Date: Wed Nov 25 2020 - 14:02:12 EST


On Wed, Nov 25, 2020 at 01:08:54PM +0100, Vlastimil Babka wrote:
> Yeah I guess it would be simpler if zoneid/nid was correct for
> pfn_valid() pfns within a zone's range, even if they are reserved due
> to not being really usable memory.
>
> I don't think we want to introduce CONFIG_HOLES_IN_ZONE to x86. If the
> chosen solution is to make this to a real hole, the hole should be
> extended to MAX_ORDER_NR_PAGES aligned boundaries.

The way pfn_valid() works, it's not possible to render all non-RAM pfns
as !pfn_valid, and CONFIG_HOLES_IN_ZONE would not achieve it 100%
either. So I don't think we can rely on that to eliminate all non-RAM
reserved pages from the mem_map and avoid having to initialize them in
the first place. Some could remain, as in this case, since in the same
pageblock there's non-RAM followed by RAM and all pfns are valid.

> In any case, compaction code can't fix this with better range checks.

David's correct that it can, by adding enough PageReserved checks (I'm
running all systems reproducing this with plenty of PageReserved
checks in all places to work around it until we do a proper fix).

My problem with that is that 1) it's simply not enforceable at runtime
that no PageReserved check is missing and 2) what benefit would it
provide to leave a wrong zoneid in reserved pages and have to add
extra PageReserved checks?

A struct page has a deterministic zoneid/nid. If it's pointed to by a
valid pfn (as in pfn_valid()), the simplest is that the zoneid/nid in
the page remains correct no matter if it was reserved at boot or it was
marked reserved by a driver that swaps the page somewhere else with the
GART or EFI or something else. All reserved pages should work the
same, RAM and non-RAM, since the non-RAM status can basically change
at runtime if a driver assigns the page to hw somehow.
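
To illustrate the point above, here is a rough userspace sketch, not
kernel code. Every toy_* name is hypothetical and the two-zone layout
is an assumption made up for the example. It only shows that if the
zoneid is derived from the pfn alone, a zone check needs no extra
PageReserved test, whether the page was reserved at init or later:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: only the fields discussed here. */
struct toy_page {
	bool reserved;	/* stands in for PageReserved() */
	int zoneid;	/* stands in for page_zone_id() */
	int nid;	/* stands in for page_to_nid() */
};

/*
 * Initialize a page so that zoneid/nid depend only on the pfn.
 * Assumed layout: zone 0 below zone1_start, zone 1 from there on,
 * a single node 0. Non-RAM pages are marked reserved but still get
 * the zoneid of the range they sit in.
 */
static void toy_init_page(struct toy_page *page, unsigned long pfn,
			  unsigned long zone1_start, bool is_ram)
{
	assert(page != NULL);
	page->zoneid = pfn >= zone1_start ? 1 : 0;
	page->nid = 0;
	page->reserved = !is_ram;
}

/* A driver taking a RAM page away at runtime only flips the flag. */
static void toy_driver_reserve(struct toy_page *page)
{
	assert(page != NULL);
	page->reserved = true;
}

/* The zone check does not need to look at the reserved flag at all. */
static bool toy_page_in_zone(const struct toy_page *page, int zoneid)
{
	return page->zoneid == zoneid;
}
```

In this model a non-RAM page at pfn 9 and a RAM page at pfn 10 (with
zone 1 starting at pfn 8) both answer the zone check the same way, and
that does not change after toy_driver_reserve() is called on the RAM
page.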

NOTE: on the compaction side, we still need to add the
pageblock_pfn_to_page check to validate the "highest" pfn, because the
pfn_valid() check is missing on the first pfn of the pageblock, and so
is the check for a pageblock that spans two different zones.
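
The kind of check meant in the NOTE can be sketched in userspace as
follows. This is a simplified model and not the kernel's
pageblock_pfn_to_page(): the toy_* names, the 16-pfn mem_map and the
zone layout are all assumptions for illustration. It rejects a
pageblock whose first or last pfn is not valid, or whose two ends are
not both in the expected zone:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PFNS 16UL	/* size of the toy mem_map */

struct toy_page {
	bool valid;	/* stands in for pfn_valid() on this pfn */
	int zoneid;	/* stands in for page_zone_id() */
};

static struct toy_page toy_mem_map[TOY_NR_PFNS];

/* Mark [start, end) as valid or not and assign it to a zone. */
static void toy_set_range(unsigned long start, unsigned long end,
			  bool valid, int zoneid)
{
	for (unsigned long pfn = start; pfn < end && pfn < TOY_NR_PFNS; pfn++) {
		toy_mem_map[pfn].valid = valid;
		toy_mem_map[pfn].zoneid = zoneid;
	}
}

static bool toy_pfn_valid(unsigned long pfn)
{
	return pfn < TOY_NR_PFNS && toy_mem_map[pfn].valid;
}

/*
 * Return the first page of the block [start_pfn, end_pfn) only if
 * both ends are valid and both belong to zoneid, else NULL.
 */
static struct toy_page *toy_pageblock_pfn_to_page(unsigned long start_pfn,
						  unsigned long end_pfn,
						  int zoneid)
{
	assert(end_pfn > start_pfn);

	/* end_pfn is one past the range being checked */
	end_pfn--;

	if (!toy_pfn_valid(start_pfn) || !toy_pfn_valid(end_pfn))
		return NULL;

	/* Refuse a block that starts outside the zone or straddles two. */
	if (toy_mem_map[start_pfn].zoneid != zoneid ||
	    toy_mem_map[end_pfn].zoneid != zoneid)
		return NULL;

	return &toy_mem_map[start_pfn];
}
```

With pfns 0-3 invalid, 4-9 in zone 0 and 10-15 in zone 1, the block
[8, 12) is refused because it spans both zones, while [4, 8) and
[12, 16) are accepted for their own zone.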

Thanks,
Andrea