Re: [v2 RFC PATCH 0/9] Another Approach to Use PMEM as NUMA Node

From: Michal Hocko
Date: Wed Apr 17 2019 - 05:23:23 EST


On Tue 16-04-19 14:22:33, Dave Hansen wrote:
> On 4/16/19 12:19 PM, Yang Shi wrote:
> > would we prefer to try all the nodes in the fallback order to find the
> > first less contended one (i.e. DRAM0 -> PMEM0 -> DRAM1 -> PMEM1 -> Swap)?
>
> Once a page went to DRAM1, how would we tell that it originated in DRAM0
> and is following the DRAM0 path rather than the DRAM1 path?
>
> Memory on DRAM0's path would be:
>
> DRAM0 -> PMEM0 -> DRAM1 -> PMEM1 -> Swap
>
> Memory on DRAM1's path would be:
>
> DRAM1 -> PMEM1 -> DRAM0 -> PMEM0 -> Swap
>
> Keith Busch had a set of patches to let you specify the demotion order
> via sysfs for fun. The rules we came up with were:

I am not a fan of any sysfs "fun".

> 1. Pages keep no history of where they have been

makes sense

> 2. Each node can only demote to one other node

Not really, see my other email. I do not see any strong reason not to
use the full zonelist as the set of demotion targets.
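
To illustrate what demoting along the full zonelist could look like, here
is a toy userspace model, not kernel code. The node names follow the
fallback orders quoted above; the ZONELISTS table, the free-page counts and
pick_demotion_target() are made up for the example.

```python
# Toy model of "demote along the full zonelist". Nothing here is a kernel API;
# the table and the helper are hypothetical and exist only for illustration.
ZONELISTS = {
    "DRAM0": ["PMEM0", "DRAM1", "PMEM1"],
    "DRAM1": ["PMEM1", "DRAM0", "PMEM0"],
}


def pick_demotion_target(source, free_pages, zonelists=ZONELISTS):
    """Return the first node in the source's fallback order that has a free page.

    Returns None when no node in the list can take the page, in which case
    the caller would fall back to ordinary reclaim (e.g. swap).
    """
    for node in zonelists.get(source, []):
        if free_pages.get(node, 0) > 0:
            return node
    return None


# PMEM0 is full, so a page leaving DRAM0 lands on the next node in the list.
free = {"DRAM0": 0, "PMEM0": 0, "DRAM1": 3, "PMEM1": 8}
print(pick_demotion_target("DRAM0", free))
```

The point is only that the target is chosen per attempt from an ordered
list, rather than being a single fixed node per source.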

> 3. The demotion path can not have cycles

Yes. This could be achieved by an opportunistic GFP_NOWAIT allocation for
the migration target. That should prevent loops and the artificial
exhaustion of nodes quite naturally AFAICS. Maybe we will need some tricks
to raise the watermark, but I am not convinced something like that is
really necessary.
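
A rough sketch of the idea, again as a toy userspace model rather than
kernel code. The Node class, try_alloc_nowait() and demote_page() are
hypothetical names, and the model simplifies things: an allocation on the
target either succeeds immediately above the watermark or fails, and it
never triggers reclaim on the target itself.

```python
# Toy model of an opportunistic, non-blocking allocation for the migration
# target. All names are hypothetical; this is not how the kernel implements it.
class Node:
    def __init__(self, name, free, watermark):
        self.name = name
        self.free = free            # free pages on this node
        self.watermark = watermark  # pages kept in reserve


def try_alloc_nowait(node):
    """Take one page only if the node is above its watermark.

    The attempt never blocks and never reclaims on the target, so one
    demotion cannot trigger a further demotion.
    """
    if node.free > node.watermark:
        node.free -= 1
        return True
    return False


def demote_page(source, fallback):
    """Try each fallback node once, in order; return the target name or None."""
    for target in fallback:
        if try_alloc_nowait(target):
            source.free += 1  # the page left the source node
            return target.name
    return None  # nothing above watermark: fall back to ordinary reclaim


dram0 = Node("DRAM0", free=0, watermark=2)
pmem0 = Node("PMEM0", free=1, watermark=1)   # at watermark, allocation fails
dram1 = Node("DRAM1", free=5, watermark=2)
print(demote_page(dram0, [pmem0, dram1]))
```

Each demotion is a single bounded pass over the list that either places
the page or gives up, so in this model a page cannot bounce between nodes,
and a target that is already at its watermark is simply skipped.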

--
Michal Hocko
SUSE Labs