Re: [PATCH v2 2/2] mm: fix initialization of struct page for holes in memory layout

From: Mike Rapoport
Date: Tue Jan 05 2021 - 03:24:55 EST


Hi,

On Mon, Jan 04, 2021 at 02:03:00PM -0500, Qian Cai wrote:
> On Wed, 2020-12-09 at 23:43 +0200, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> >
> > Interleave initialization of pages that correspond to holes with the
> > initialization of memory map, so that zone and node information will be
> > properly set on such pages.
> >
> > Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather
> > that check each PFN")
> > Reported-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> > Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
>
> Reverting this commit on the top of today's linux-next fixed a crash while
> reading /proc/kpagecount on a NUMA server.

Can you please post the entire dmesg?
Is it possible to get the pfn that triggered the crash?
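
If it helps to narrow down the pfn, something along these lines might do.
This is only a sketch: it assumes that /proc/kpagecount holds one 64-bit
counter per pfn at offset pfn * 8, and the helper names (read_entry, probe)
are made up for illustration. It logs each pfn before reading it, so the last
"reading pfn" line printed before the oops should point at the culprit.

```python
import os
import struct
import sys

ENTRY_SIZE = 8  # assumption: one 64-bit counter per pfn


def read_entry(fd, pfn):
    """Return the 64-bit counter stored for a single pfn."""
    data = os.pread(fd, ENTRY_SIZE, pfn * ENTRY_SIZE)
    if len(data) != ENTRY_SIZE:
        raise EOFError(f"no entry for pfn {pfn:#x}")
    return struct.unpack("=Q", data)[0]


def probe(path, start_pfn, end_pfn, log=sys.stderr):
    """Yield (pfn, count) for start_pfn <= pfn < end_pfn.

    Each pfn is logged *before* it is read, so that the log still names
    the pfn if the read itself makes the kernel oops.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        for pfn in range(start_pfn, end_pfn):
            print(f"reading pfn {pfn:#x}", file=log, flush=True)
            try:
                count = read_entry(fd, pfn)
            except (EOFError, OSError) as err:
                # Past the end of the file or rejected by the kernel: note it and go on.
                print(f"pfn {pfn:#x}: {err}", file=log, flush=True)
                continue
            yield pfn, count
    finally:
        os.close(fd)
```

Run as root over a range of interest, for example
`for pfn, count in probe("/proc/kpagecount", 0x100000, 0x100400): print(hex(pfn), count)`
(the range here is an arbitrary placeholder, not taken from the report).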

> [ 8858.006726][T99897] BUG: unable to handle page fault for address: fffffffffffffffe
> [ 8858.014814][T99897] #PF: supervisor read access in kernel mode
> [ 8858.020686][T99897] #PF: error_code(0x0000) - not-present page
> [ 8858.026557][T99897] PGD 1371417067 P4D 1371417067 PUD 1371419067 PMD 0
> [ 8858.033224][T99897] Oops: 0000 [#1] SMP KASAN NOPTI
> [ 8858.038710][T99897] CPU: 28 PID: 99897 Comm: proc01 Tainted: G O 5.11.0-rc1-next-20210104 #1
> [ 8858.048515][T99897] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 03/09/2018
> [ 8858.057794][T99897] RIP: 0010:kpagecount_read+0x1be/0x5e0
> PageSlab at include/linux/page-flags.h:342
> (inlined by) kpagecount_read at fs/proc/page.c:69
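
One possible reading of the fault address, not confirmed by anything in this
report: if the struct page was never initialized and still holds an all-ones
poison pattern, then the compound_head() lookup done by PageSlab() would see
bit 0 set, treat the page as a tail page and return the stored value minus
one, which is exactly fffffffffffffffe. The sketch below only models that
arithmetic; the function is a simplified stand-in, not the kernel code, and
the sample page address is arbitrary.

```python
MASK64 = (1 << 64) - 1  # model 64-bit unsigned arithmetic


def compound_head(page_addr, compound_head_field):
    """Simplified model: bit 0 set means 'tail page', head is field - 1."""
    if compound_head_field & 1:
        return (compound_head_field - 1) & MASK64
    return page_addr


# Assumption: an uninitialized struct page filled with 0xff bytes.
poisoned_field = MASK64
head = compound_head(0xFFFFEA0000000000, poisoned_field)
print(f"{head:016x}")
```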

--
Sincerely yours,
Mike.