Re: [RFC PATCH v2 0/2] Node migration between memory tiers

From: Gregory Price
Date: Fri Dec 15 2023 - 12:43:19 EST


On Fri, Dec 15, 2023 at 01:02:59PM +0800, Huang, Ying wrote:
> <sthanneeru.opensrc@xxxxxxxxxx> writes:
>
> > =============
> > Version Notes:
> >
> > V2 : Changed interface to memtier_override from adistance_offset.
> > memtier_override was recommended by
> > 1. John Groves <john@xxxxxxxxxxxxxx>
> > 2. Ravi Shankar <ravis.opensrc@xxxxxxxxxx>
> > 3. Brice Goglin <Brice.Goglin@xxxxxxxx>
>
> It appears that you ignored my comments for V1 as follows ...
>
> https://lore.kernel.org/lkml/87o7f62vur.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> https://lore.kernel.org/lkml/87jzpt2ft5.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> https://lore.kernel.org/lkml/87a5qp2et0.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>

Not speaking for the group, just chiming in because I'd discussed it
with them.

"Memory Type" is a bit nebulous. Is a Micron Type-3 with performance X
and an SK Hynix Type-3 with performance Y a "Different Type", or are
they the "Same Type" given that they're both Type-3 devices backed by
some form of DDR? Is socket placement of those devices relevant for
determining "Type"? Is whether they are behind a switch relevant for
determining "Type"? "Type" is frustrating when everything we're talking
about managing is "Type-3" with different performance.

A concrete example:
To the system, a Multi-Headed Single Logical Device (MH-SLD) looks
exactly the same as a standard SLD. I may want to have some
combination of local memory expansion devices on the majority of my
expansion slots, but reserve 1 slot on each socket for a connection to
the MH-SLD. As of right now, there is no good way to differentiate the
devices in terms of "Type" - and even if you had that, the tiering
system would still lump them together.

Similarly, an initial run of switches may or may not allow enumeration
of the devices behind them (depends on the configuration), so you may
end up with a static numa node that "looks like" another SLD - despite
it being some definition of "GFAM". Does the number of hops matter in
determining "Type"?

So I really don't think "Type" is useful for determining tier placement.

As of right now, the system lumps DRAM nodes as one tier, and pretty
much everything else as "the other tier". To me, this patch set is an
initial pass meant to allow user control over tier composition while
the internal mechanism is sussed out and the environment develops.
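
For anyone who wants to check how their own system is currently lumping
nodes into tiers, here is a rough sketch in Python. It is not part of
the patch set. It assumes the existing upstream sysfs layout
(/sys/devices/virtual/memory_tiering/memory_tierN/nodelist); the helper
names parse_nodelist() and read_tiers() are made up for illustration,
and the proposed memtier_override interface is deliberately not touched
since its final form is still under discussion.

```python
import os
import re

# Assumed location of the memory tiering sysfs directory.
SYSFS_ROOT = "/sys/devices/virtual/memory_tiering"


def parse_nodelist(text):
    """Expand a kernel-style node list such as "0,2-3" into [0, 2, 3]."""
    nodes = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty nodelist
        if "-" in part:
            lo, hi = part.split("-", 1)
            nodes.extend(range(int(lo), int(hi) + 1))
        else:
            nodes.append(int(part))
    return nodes


def read_tiers(root=SYSFS_ROOT):
    """Return {tier_id: [node, ...]} for every memory_tierN under root.

    Returns an empty dict if the directory does not exist (for example,
    on a kernel built without memory tiering support).
    """
    tiers = {}
    if not os.path.isdir(root):
        return tiers
    for entry in sorted(os.listdir(root)):
        match = re.fullmatch(r"memory_tier(\d+)", entry)
        if not match:
            continue
        path = os.path.join(root, entry, "nodelist")
        try:
            with open(path) as f:
                tiers[int(match.group(1))] = parse_nodelist(f.read())
        except OSError:
            continue  # tier vanished or is unreadable; skip it
    return tiers


if __name__ == "__main__":
    for tier, nodes in sorted(read_tiers().items()):
        print(f"memory_tier{tier}: nodes {nodes}")
```

On a box where every CXL node lands in the same tier, all of those
nodes should show up in a single nodelist regardless of vendor, socket
placement, or whether a switch sits in the path.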

In general, a release valve that lets you redefine tiers is very welcome
for testing and validation of different setups while the industry evolves.

Just my two cents.

~Gregory

> --
> Best Regards,
> Huang, Ying
>