[LSF/MM ATTEND] memory reclaim with NUMA rebalancing

From: Aneesh Kumar K.V
Date: Thu Jan 31 2019 - 01:50:04 EST


Michal Hocko <mhocko@xxxxxxxxxx> writes:

> Hi,
> I would like to propose the following topic for the MM track. Different
> groups of people would like to use NVDIMMs as a low-cost, slower memory
> which is presented to the system as a NUMA node. We do have a NUMA API
> but it doesn't really fit the "balance the memory between nodes" needs.
> People would like to have hot pages in the regular RAM while cold pages
> might be at lower speed NUMA nodes. We do have NUMA balancing for the
> promotion path but there is nothing for the other direction. Can we
> start considering memory reclaim to move pages to more distant and idle
> NUMA nodes rather than reclaim them? There are certainly details that
> will get quite complicated but I guess it is time to start discussing
> this at least.

I would be interested in this topic too. I would like to
understand the API and how it can help exploit the different types of
devices we have on OpenCAPI.

IMHO there are a few proposals related to this which we could discuss together:

1. The HMAT series, which wants to expose these devices as NUMA nodes.
2. The patch series from Dave Hansen, which just uses pmem as a NUMA node.
3. The patch series from Fengguang Wu, which prevents default
   allocation from these NUMA nodes by excluding them from the zonelist.
4. The patch series from Jerome Glisse, which doesn't expose these as
   NUMA nodes.

IMHO (3) is suggesting that we really don't want them as NUMA nodes. But
since NUMA is the only interface we currently have to present them as
memory and to control allocation and migration, we are forcing
ourselves to use NUMA nodes and then excluding them from default allocation.
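
To make the "demote instead of reclaim" idea from the quoted proposal
a bit more concrete, here is a toy user-space model in Python. It is only a
sketch, not kernel code and not taken from any of the patch series above.
The class TieredMemory and its methods touch() and node_of() are
hypothetical names, and plain LRU order stands in for whatever hot/cold
tracking a real implementation would use. The assumption is one fast node
(regular RAM) and one slow node (e.g. an NVDIMM-backed node), each with a
fixed capacity in pages.

```python
from collections import OrderedDict


class TieredMemory:
    """Toy two-tier model: a fast node (RAM) and a slow node (e.g. NVDIMM).

    Hypothetical illustration only; plain LRU approximates page coldness.
    """

    def __init__(self, fast_capacity, slow_capacity):
        if fast_capacity < 1 or slow_capacity < 1:
            raise ValueError("both nodes need room for at least one page")
        self.fast_capacity = fast_capacity
        self.slow_capacity = slow_capacity
        # OrderedDict keeps LRU order: the first entry is the coldest page.
        self.fast = OrderedDict()
        self.slow = OrderedDict()
        self.evicted = []  # pages that were really reclaimed, in order

    def _demote(self, page):
        # Only when the slow node is full as well is a page truly reclaimed.
        if len(self.slow) >= self.slow_capacity:
            victim, _ = self.slow.popitem(last=False)
            self.evicted.append(victim)
        self.slow[page] = True

    def _make_room_in_fast(self):
        # "Reclaim" on the fast node: move the coldest page to the slow
        # node instead of dropping it.
        while len(self.fast) >= self.fast_capacity:
            page, _ = self.fast.popitem(last=False)
            self._demote(page)

    def touch(self, page):
        """Access a page: first touch or promotion places it on the fast node."""
        if page in self.fast:
            self.fast.move_to_end(page)  # now the most recently used page
            return
        self.slow.pop(page, None)  # promotion path if it lived on the slow node
        self._make_room_in_fast()
        self.fast[page] = True

    def node_of(self, page):
        """Return "fast", "slow", or None if the page is not resident."""
        if page in self.fast:
            return "fast"
        if page in self.slow:
            return "slow"
        return None


mem = TieredMemory(fast_capacity=2, slow_capacity=2)
for p in ["a", "b", "c"]:
    mem.touch(p)
# "a" was the coldest page when "c" arrived, so it was demoted, not evicted.
print(mem.node_of("a"), mem.evicted)
```

In this model a page is dropped only once both nodes are full, which is the
behaviour the proposal asks reclaim to approximate. How coldness is
measured and which node counts as "more distant" are exactly the details
that would need discussion.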

-aneesh