RE: [External] [RFC PATCH v1 3/6] mm, zone_type: create ZONE_NVM and fill into GFP_ZONE_TABLE

From: Huaisheng HS1 Ye
Date: Wed May 09 2018 - 10:04:53 EST


> From: owner-linux-mm@xxxxxxxxx [mailto:owner-linux-mm@xxxxxxxxx] On Behalf Of Michal Hocko
>
> On Wed 09-05-18 04:22:10, Huaisheng HS1 Ye wrote:
> >
> > > On 05/07/2018 07:33 PM, Huaisheng HS1 Ye wrote:
> > > > diff --git a/mm/Kconfig b/mm/Kconfig
> > > > index c782e8f..5fe1f63 100644
> > > > --- a/mm/Kconfig
> > > > +++ b/mm/Kconfig
> > > > @@ -687,6 +687,22 @@ config ZONE_DEVICE
> > > >
> > > > +config ZONE_NVM
> > > > + bool "Manage NVDIMM (pmem) by memory management (EXPERIMENTAL)"
> > > > + depends on NUMA && X86_64
> > >
> > > Hi,
> > > I'm curious why this depends on NUMA. Couldn't it be useful in non-NUMA
> > > (i.e., UMA) configs?
> > >
> > I wrote these patches on a two-socket test platform with two DDRs and two NVDIMMs installed.
> > So each socket has one DDR and one NVDIMM. Here are the memory regions from memblock, which show the distribution.
> >
> > [ 0.000000] Zone ranges:
> > [ 0.000000] DMA [mem 0x0000000000001000-0x0000000000ffffff]
> > [ 0.000000] DMA32 [mem 0x0000000001000000-0x00000000ffffffff]
> > [ 0.000000] Normal [mem 0x0000000100000000-0x00000046bfffffff]
> > [ 0.000000] NVM [mem 0x0000000440000000-0x00000046bfffffff]
> > [ 0.000000] Device empty
> > [ 0.000000] Movable zone start for each node
> > [ 0.000000] Early memory node ranges
> > [ 0.000000] node 0: [mem 0x0000000000001000-0x000000000009ffff]
> > [ 0.000000] node 0: [mem 0x0000000000100000-0x00000000a69c2fff]
> > [ 0.000000] node 0: [mem 0x00000000a7654000-0x00000000a85eefff]
> > [ 0.000000] node 0: [mem 0x00000000ab399000-0x00000000af3f6fff]
> > [ 0.000000] node 0: [mem 0x00000000af429000-0x00000000af7fffff]
> > [ 0.000000] node 0: [mem 0x0000000100000000-0x000000043fffffff] Normal 0
> > [ 0.000000] node 0: [mem 0x0000000440000000-0x000000237fffffff] NVDIMM 0
> > [ 0.000000] node 1: [mem 0x0000002380000000-0x000000277fffffff] Normal 1
> > [ 0.000000] node 1: [mem 0x0000002780000000-0x00000046bfffffff] NVDIMM 1
> >
> > If we disable NUMA, the result is that the Normal and NVDIMM zones overlap each other.
> > The current mm treats all memory regions equally; it divides zones just by size, like 16M for DMA, 4G for DMA32, and everything above for Normal.
> > The spanned ranges of the zones cannot overlap.
>
> No, this is not correct. Zones can overlap.

Hi Michal,

Thanks for pointing it out.
But zone_sizes_init() determines arch_zone_lowest_possible_pfn and arch_zone_highest_possible_pfn from max_low_pfn, and free_area_init_nodes()/free_area_init_node() are then responsible for calculating the spanned size of each zone from the memblock memory regions.
So ZONE_DMA, ZONE_DMA32, and ZONE_NORMAL have separate address ranges. How can they overlap each other?
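
To make the chaining described above concrete, here is a rough user-space sketch. It is a simplification for illustration, not the kernel code: the enum, struct, and helper names (sketch_zone, pfn_range, chain_zone_limits, ranges_overlap) are made up, and it models only the architecture-wide limits, where each zone starts at the previous zone's upper limit. It does not model ZONE_MOVABLE, the per-node spans computed later, or the ZONE_NVM added by this patch.

```c
/* Hypothetical, simplified zone list; not the kernel's enum zone_type. */
enum sketch_zone { SK_DMA, SK_DMA32, SK_NORMAL, SK_NR_ZONES };

struct pfn_range {
	unsigned long lo;	/* first PFN in the range */
	unsigned long hi;	/* one past the last PFN */
};

/*
 * Chain the limits: each zone starts where the previous one ends and
 * runs up to its own maximum PFN. A zone whose maximum is not above
 * its start ends up empty (lo == hi).
 */
static void chain_zone_limits(unsigned long min_pfn,
			      const unsigned long max_zone_pfn[SK_NR_ZONES],
			      struct pfn_range out[SK_NR_ZONES])
{
	unsigned long lo = min_pfn;
	int i;

	for (i = 0; i < SK_NR_ZONES; i++) {
		unsigned long hi = max_zone_pfn[i] > lo ? max_zone_pfn[i] : lo;

		out[i].lo = lo;
		out[i].hi = hi;
		lo = hi;	/* the next zone begins at this zone's end */
	}
}

/* Two half-open ranges overlap if each starts before the other ends. */
static int ranges_overlap(struct pfn_range a, struct pfn_range b)
{
	return a.lo < b.hi && b.lo < a.hi;
}
```

Assuming 4K pages, the DMA, DMA32, and Normal limits in the log above correspond to maximum PFNs of 0x1000, 0x100000, and 0x46c0000, with the lowest PFN being 1. Feeding those to chain_zone_limits() gives three back-to-back ranges, and ranges_overlap() returns 0 for every pair, which is the property I am asking about.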

Sincerely,
Huaisheng Ye | 叶怀胜
Linux kernel | Lenovo