Re: [PATCH v14 04/10] mm/demotion/dax/kmem: Set node's abstract distance to MEMTIER_DEFAULT_DAX_ADISTANCE

From: Aneesh Kumar K V
Date: Tue Aug 16 2022 - 03:56:17 EST


On 8/15/22 8:09 AM, Huang, Ying wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxx> writes:
>
>> By default, all nodes are assigned to the default memory tier, which
>> is the memory tier designated for nodes with DRAM.
>>
>> Set dax kmem device node's tier to slower memory tier by assigning
>> abstract distance to MEMTIER_DEFAULT_DAX_ADISTANCE. Low-level drivers
>> like papr_scm or ACPI NFIT can initialize memory device type to a
>> more accurate value based on device tree details or HMAT. If the
>> kernel doesn't find the memory type initialized, a default slower
>> memory type is assigned by the kmem driver.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
>> ---
>> drivers/dax/kmem.c | 42 +++++++++++++++--
>> include/linux/memory-tiers.h | 42 ++++++++++++++++-
>> mm/memory-tiers.c | 91 +++++++++++++++++++++++++++---------
>> 3 files changed, 149 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
>> index a37622060fff..d88814f1c414 100644
>> --- a/drivers/dax/kmem.c
>> +++ b/drivers/dax/kmem.c
>> @@ -11,9 +11,17 @@
>> #include <linux/fs.h>
>> #include <linux/mm.h>
>> #include <linux/mman.h>
>> +#include <linux/memory-tiers.h>
>> #include "dax-private.h"
>> #include "bus.h"
>>
>> +/*
>> + * Default abstract distance assigned to the NUMA node onlined
>> + * by DAX/kmem if the low level platform driver didn't initialize
>> + * one for this NUMA node.
>> + */
>> +#define MEMTIER_DEFAULT_DAX_ADISTANCE (MEMTIER_ADISTANCE_DRAM * 2)
>
> If my understanding is correct, this is targeting Optane DCPMM for
> now. The measured results in the following paper are:
>
> https://arxiv.org/pdf/2002.06018.pdf
>
> Section: 2.1 Read/Write Latencies
>
> "
> For read access, the latency of DCPMM was 400.1% higher than that of
> DRAM. For write access, it was 407.1% higher.
> "
>
> Section: 2.2 Read/Write Bandwidths
>
> "
> For read access, the throughput of DCPMM was 37.1% of DRAM. For write
> access, it was 7.8%
> "
>
> According to the above data, I think the MEMTIER_DEFAULT_DAX_ADISTANCE
> can be "5 * MEMTIER_ADISTANCE_DRAM".
>

If we map every 100% increase in latency to one memory tier, we essentially
end up with 4 memory tiers here. Each memory tier covers an abstract distance
range of 128, which makes the total adistance increase over
MEMTIER_ADISTANCE_DRAM 4 * 128 = 512. That puts MEMTIER_DEFAULT_DAX_ADISTANCE
at 1024, or MEMTIER_ADISTANCE_DRAM * 2.
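
To spell out the arithmetic, here is a rough sketch. It assumes a chunk size
of 128 and MEMTIER_ADISTANCE_DRAM at 512, which are the values implied by the
numbers in this mail rather than quoted from the patch. The helper
adistance_from_latency_pct() is hypothetical and only illustrates the
"one tier per 100% of extra latency" mapping; it is not something the patch
adds.

```c
/* Assumed values, inferred from the numbers in this mail. */
#define MEMTIER_CHUNK_SIZE	128
#define MEMTIER_ADISTANCE_DRAM	512

/*
 * Hypothetical helper, not part of the patch: move one memory tier
 * (one chunk of abstract distance) away from DRAM for every full
 * 100% of extra latency relative to DRAM.
 */
static int adistance_from_latency_pct(int extra_latency_pct)
{
	int tiers = extra_latency_pct / 100;

	return MEMTIER_ADISTANCE_DRAM + tiers * MEMTIER_CHUNK_SIZE;
}
```

With the DCPMM figures quoted above (400.1% for reads, 407.1% for writes,
passed in as 400 and 407), both land 4 tiers away from DRAM, which is the
same value as MEMTIER_ADISTANCE_DRAM * 2 under these assumptions.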

-aneesh