Re: [PATCH RESEND 1/4] memory tiering: add abstract distance calculation algorithms management

From: Alistair Popple
Date: Wed Jul 26 2023 - 23:45:21 EST



"Huang, Ying" <ying.huang@xxxxxxxxx> writes:

>>> The other way (suggested by this series) is to make dax/kmem call a
>>> notifier chain, then CXL CDAT or ACPI HMAT can identify the type of
>>> device and calculate the distance if the type is correct for them. I
>>> don't think that it's good to make dax/kmem know every possible
>>> type of memory device.
>>
>> Do we expect there to be lots of different types of memory devices
>> sharing a common dax/kmem driver though? Must admit I'm coming from a
>> GPU background where we'd expect each type of device to have its own
>> driver anyway so wasn't expecting different types of memory devices to
>> be handled by the same driver.
>
> Now, dax/kmem.c is used for
>
> - PMEM (Optane DCPMM, or AEP)
> - CXL.mem
> - HBM (attached to CPU)

Thanks a lot for the background! I will admit to having a fairly narrow
focus here.

>>> And, I don't think that we are forced to use the general notifier
>>> chain interface in all memory device drivers. If the memory device
>>> driver has a better understanding of the memory device, it can use
>>> another way to determine abstract distance. For example, a CXL memory device
>>> driver can identify abstract distance by itself. While other memory
>>> device drivers can use the general notifier chain interface at the
>>> same time.
>>
>> Whilst I think personally I would find that flexibility useful, I am
>> concerned it means every driver will just end up divining its own
>> distance rather than ensuring data in HMAT/CDAT/etc. is correct. That
>> would kind of defeat the purpose of it all then.
>
> But we have no way to enforce that either.

Enforce that HMAT/CDAT/etc. is correct? Agree we can't enforce it, but
we can influence it. If drivers can easily ignore the notifier chain and
do their own thing that's what will happen.

>>> While other memory device drivers can use the general notifier chain
>>> interface at the same time.

How would that work in practice though? The abstract distance, as far
as I can tell, doesn't have any meaning other than establishing
preferences for memory demotion order. Therefore every calculation is
relative to all the other calculations on the system. So if a driver
does its own thing, how does it choose a sensible distance? IMHO the
value here is in coordinating all that through a standard interface,
whether that is HMAT or something else.
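
To make the concern concrete, here is a rough userspace sketch of the
chain idea. It is not the code from this series: every name in it
(adist_register(), adist_calculate(), cdat_like_alg(), struct mem_dev)
and both baseline constants are made up for illustration. The point is
only that each registered algorithm scales its result against the same
DRAM baseline, so the numbers stay comparable across devices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes, loosely modelled on NOTIFY_DONE/NOTIFY_STOP. */
enum { ADIST_PASS = 0, ADIST_HANDLED = 1 };

/* Hypothetical shared baseline: DRAM gets distance 100 at 80ns latency. */
#define ADIST_DRAM       100
#define DRAM_LATENCY_NS   80

enum dev_kind { DEV_CXL, DEV_PMEM, DEV_UNKNOWN };

struct mem_dev {
	enum dev_kind kind;
	unsigned int read_latency_ns;	/* 0 means "no performance data" */
};

/* An algorithm either fills in *adist and claims the device, or passes. */
typedef int (*adist_alg_fn)(const struct mem_dev *dev, int *adist);

#define MAX_ALGS 8
static adist_alg_fn adist_chain[MAX_ALGS];
static size_t adist_chain_len;

/* Append an algorithm to the chain; callers run in registration order. */
void adist_register(adist_alg_fn fn)
{
	assert(adist_chain_len < MAX_ALGS);
	adist_chain[adist_chain_len++] = fn;
}

/*
 * Walk the chain until some algorithm claims the device. If nobody
 * does, the caller's default distance is returned unchanged.
 */
int adist_calculate(const struct mem_dev *dev, int default_adist)
{
	int adist = default_adist;
	size_t i;

	for (i = 0; i < adist_chain_len; i++)
		if (adist_chain[i](dev, &adist) == ADIST_HANDLED)
			break;
	return adist;
}

/*
 * A CDAT-like algorithm: only handles CXL devices that report a
 * latency, and scales the distance relative to the DRAM baseline.
 */
int cdat_like_alg(const struct mem_dev *dev, int *adist)
{
	if (dev->kind != DEV_CXL || dev->read_latency_ns == 0)
		return ADIST_PASS;
	*adist = ADIST_DRAM * (int)dev->read_latency_ns / DRAM_LATENCY_NS;
	return ADIST_HANDLED;
}
```

With cdat_like_alg() registered, a CXL device reporting 240ns gets
100 * 240 / 80 = 300, while a device nobody recognises keeps whatever
default the caller passed in. A driver that bypasses the chain has to
invent a number that still sorts sensibly against values like that 300,
which is exactly the coordination problem.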

- Alistair