Re: [PATCH] mm: Check if section present during memory block (un)registering

From: Yinghai Lu
Date: Wed Aug 26 2015 - 00:04:49 EST


On Tue, Aug 25, 2015 at 4:08 PM, Andrew Morton
<akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 25 Aug 2015 15:41:16 -0700 Yinghai Lu <yinghai@xxxxxxxxxx> wrote:
>
>> Tony found that on his setup, a memory block size of 512M causes a
>> crash during boot.
>>
>> BUG: unable to handle kernel paging request at ffffea0074000020
>> IP: [<ffffffff81670527>] get_nid_for_pfn+0x17/0x40
>> PGD 128ffcb067 PUD 128ffc9067 PMD 0
>> Oops: 0000 [#1] SMP
>> Modules linked in:
>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc8 #1
>> ...
>> Call Trace:
>> [<ffffffff81453b56>] ? register_mem_sect_under_node+0x66/0xe0
>> [<ffffffff81453eeb>] register_one_node+0x17b/0x240
>> [<ffffffff81b1f1ed>] ? pci_iommu_alloc+0x6e/0x6e
>> [<ffffffff81b1f229>] topology_init+0x3c/0x95
>> [<ffffffff8100213d>] do_one_initcall+0xcd/0x1f0
>>
>> The system has non-contiguous RAM address ranges:
>> BIOS-e820: [mem 0x0000001300000000-0x0000001cffffffff] usable
>> BIOS-e820: [mem 0x0000001d70000000-0x0000001ec7ffefff] usable
>> BIOS-e820: [mem 0x0000001f00000000-0x0000002bffffffff] usable
>> BIOS-e820: [mem 0x0000002c18000000-0x0000002d6fffefff] usable
>> BIOS-e820: [mem 0x0000002e00000000-0x00000039ffffffff] usable
>>
>> So some memory blocks start with sections that are not present.
>> For example:
>> memory block : [0x2c18000000, 0x2c20000000) 512M
>> first three sections are not present.
>>
>> Currently register_mem_sect_under_node() assumes the first section is
>> present, but the memory block section number range
>> [start_section_nr, end_section_nr] can include sections that are not
>> present.
>>
>> For arches that support vmemmap, we don't set up the memmap (the struct
>> page area) for sections that are not present.
>>
>> So skip the pfn ranges that belong to sections that are not present.
>>
>> Also fix unregister_mem_sect_under_nodes().
>
> It appears this should be backported into -stable kernels, yes? Do you
> know which kernel versions need the fix?

We should add the following, according to Tony's email:

Fixes: bdee237c0343 ("x86: mm: Use 2GB memory block size on large memory x86-64 systems")
Fixes: 982792c782ef ("x86, mm: probe memory block size for generic x86 64bit")
Cc: stable@xxxxxxxxxx #v3.15

>
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -390,8 +390,14 @@ int register_mem_sect_under_node(struct memory_block *mem_blk, int nid)
>>  	sect_end_pfn = section_nr_to_pfn(mem_blk->end_section_nr);
>>  	sect_end_pfn += PAGES_PER_SECTION - 1;
>>  	for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
>> -		int page_nid;
>> +		int page_nid, scn_nr;
>>
>> +		scn_nr = pfn_to_section_nr(pfn);
>> +		if (!present_section_nr(scn_nr)) {
>> +			pfn = round_down(pfn + PAGES_PER_SECTION,
>> +					 PAGES_PER_SECTION) - 1;
>> +			continue;
>> +		}
>
> Can we please add a comment here telling readers why this is being
> done? What scenario is being detected and how it comes about.
>

Yes, we should add:

/* skip pfn range from absent memory section */
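
To illustrate the skip, here is a rough userspace sketch of the loop,
not the kernel code itself. It assumes 4 KiB pages and 128 MiB sections
(the usual x86-64 sparsemem values), so a 512M memory block spans four
sections. section_present(), count_visited_pfns() and ROUND_DOWN() are
hypothetical stand-ins for present_section_nr(), the loop in
register_mem_sect_under_node() and round_down(). The presence check is
hard-coded to match the e820 map above, where only the section starting
at 0x2c18000000 has RAM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86-64 sparsemem values: 4 KiB pages, 128 MiB sections. */
#define PAGE_SHIFT		12
#define SECTION_SIZE_BITS	27
#define PFN_SECTION_SHIFT	(SECTION_SIZE_BITS - PAGE_SHIFT)
#define PAGES_PER_SECTION	(UINT64_C(1) << PFN_SECTION_SHIFT)

/* Userspace stand-in for round_down(); y must be a power of two. */
#define ROUND_DOWN(x, y)	((x) & ~((y) - 1))

/*
 * Hypothetical stand-in for present_section_nr(): in the 512M block
 * that ends at 0x2c20000000, only the last section has RAM.
 */
static bool section_present(uint64_t nr)
{
	return nr == (UINT64_C(0x2c18000000) >> SECTION_SIZE_BITS);
}

/*
 * Walk [start_pfn, end_pfn] like the patched loop does and count the
 * pfns whose struct page would actually be looked at.
 */
static uint64_t count_visited_pfns(uint64_t start_pfn, uint64_t end_pfn)
{
	uint64_t pfn, visited = 0;

	for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
		uint64_t nr = pfn >> PFN_SECTION_SHIFT;

		if (!section_present(nr)) {
			/* skip pfn range from absent memory section */
			pfn = ROUND_DOWN(pfn + PAGES_PER_SECTION,
					 PAGES_PER_SECTION) - 1;
			continue;
		}
		visited++;
	}
	return visited;
}
```

Setting pfn to the last pfn of the absent section lets the loop's own
pfn++ land on the first pfn of the next section. For the block that
covers 0x2c00000000 to 0x2c20000000, the three absent sections are
stepped over in three iterations and only the pfns of the fourth
section are visited.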