Re: [PATCH] mm/page_alloc.c: Don't create ZONE_MOVABLE beyond the end of a node

From: Anshuman Khandual
Date: Wed Feb 16 2022 - 00:24:30 EST




On 2/15/22 10:46 AM, Alistair Popple wrote:
> Anshuman Khandual <anshuman.khandual@xxxxxxx> writes:
>
>> Hi Alistair,
>>
>> On 2/15/22 8:28 AM, Alistair Popple wrote:
>>> ZONE_MOVABLE uses the remaining memory in each node. Its starting pfn
>>> is also aligned to MAX_ORDER_NR_PAGES. It is possible for the remaining
>>> memory in a node to be less than MAX_ORDER_NR_PAGES, meaning there is
>>> not enough room for ZONE_MOVABLE on that node.
>>
>> How plausible is this scenario on normal systems ?
>
> Probably not very. I happened to run into this on my development/test x86 VM
> which has 8GB and was booted with `numa=fake=4 kernelcore=60%` but in theory I
> guess any system that has a node with less than MAX_ORDER_NR_PAGES left over for
> ZONE_MOVABLE may be susceptible.
>
> This was the RAM map:
>
> [ 0.000000] BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007ffddfff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000007ffde000-0x000000007fffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000027fffffff] usable
>
> [...]
>
> [ 0.065897] Early memory node ranges
> [ 0.065898] node 0: [mem 0x0000000000001000-0x000000000009efff]
> [ 0.065900] node 0: [mem 0x0000000000100000-0x000000007ffddfff]
> [ 0.065902] node 1: [mem 0x0000000100000000-0x000000017fffffff]
> [ 0.065904] node 2: [mem 0x0000000180000000-0x00000001ffffffff]
> [ 0.065906] node 3: [mem 0x0000000200000000-0x000000027fffffff]
>
> Note the reserved range from 0x000000007ffde000 to 0x000000007fffffff resulting
> in node-0 ending at 0x000000007ffddfff.
>
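
For anyone following along, here is a rough sketch of the arithmetic as I
understand it, not the actual mm/page_alloc.c code. It assumes 4K pages and
MAX_ORDER_NR_PAGES = 1024 (both configuration dependent), and the helper
names are made up for illustration.

```c
/* Assumptions: 4K pages, 1024 pages per max-order block (config dependent). */
#define PAGE_SHIFT          12
#define MAX_ORDER_NR_PAGES  1024UL

/* Exclusive end pfn for a range whose last byte is at last_byte_addr. */
static unsigned long end_pfn_of(unsigned long long last_byte_addr)
{
	return (unsigned long)((last_byte_addr + 1) >> PAGE_SHIFT);
}

/* Round pfn up to the next MAX_ORDER_NR_PAGES boundary (power of two). */
static unsigned long align_movable_start(unsigned long pfn)
{
	return (pfn + MAX_ORDER_NR_PAGES - 1) & ~(MAX_ORDER_NR_PAGES - 1);
}

/* Hypothetical check: does the aligned start land at or past the node end? */
static int movable_start_beyond_node(unsigned long start_pfn,
				     unsigned long node_end_pfn)
{
	return align_movable_start(start_pfn) >= node_end_pfn;
}
```

With node 0 ending at 0x000000007ffddfff the exclusive end pfn is 0x7ffde,
which is not a multiple of 1024. Any unaligned start pfn in the last partial
block (say 0x7fd00, a value picked purely for illustration) rounds up to
0x80000, which is past the end of the node.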
>> Should not the node always contain MAX_ORDER_NR_PAGES aligned pages ? Also all
>> zones which get created from that node should also be MAX_ORDER_NR_PAGES
>> aligned ?
>
> I'm not sure why that would be the case given page size and MAX_ORDER_NR_PAGES can
> be set via a kernel configuration parameter. Obviously it wasn't the case here

I assumed that in general that would be the case.

> or this situation would not arise. That said I don't know this code well, and
> this was where I decided to stop shaving this yak so it's possible there is an
> even deeper underlying issue.
>
> Either way I don't *think* the fix should introduce any problems as it shouldn't
> do anything unless you were going to hit this issue anyway (which took some time
> to track down as the cause wasn't obvious).

Fair enough.

>
>> I am just curious how a node could end up being like this.
>
> - Anshuman
>