Re: [PATCH 0/2] Add a new scheme to support demotion on tiered memory system

From: Baolin Wang
Date: Tue Dec 21 2021 - 09:31:45 EST




On 12/21/2021 9:26 PM, SeongJae Park wrote:
Hi Baolin,

On Tue, 21 Dec 2021 17:18:02 +0800 Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> wrote:

Hi,

Now on tiered memory systems with different memory types, the reclaim path in
shrink_page_list() already supports demoting pages to a slow memory node instead
of discarding the pages. However, at that time the fast memory node's memory
watermark is already tense, which will increase the memory allocation latency
during page demotion. So a new method that demotes cold pages proactively from
user space will be more helpful.

We can rely on the DAMON in user space to help to monitor the cold memory on
fast memory node, and demote the cold pages to slow memory node proactively to
keep the fast memory node in a healthy state.

This patch set introduces a new scheme named DAMOS_DEMOTE to support this feature,
and works well from my testing. Any comments are welcome. Thanks.

I like the idea, thank you for these patches! If possible, could you share
some details about your tests?

Sure, sorry for not adding more information about my tests.

My machine has 64G DRAM + 256G AEP (persistent memory). Demotion must be enabled first with:
echo "true" > /sys/kernel/mm/numa/demotion_enabled

Then I wrote a simple test case, shown below, that mmaps some anonymous memory and then repeatedly reads and writes only the first half of the buffer, so that the other half becomes cold enough to be demoted.

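#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 50 * 1024 * 1024;
	size_t scan_len = len / 2;
	size_t i;
	/* volatile so the compiler cannot drop the read loop below */
	volatile unsigned long j = 0;
	unsigned long *p;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED) {
		printf("failed to get memory\n");
		return -1;
	}

	/* Touch the whole buffer once so every page is populated */
	for (i = 0; i < len / sizeof(*p); i++)
		p[i] = 0x55aa;

	/* Keep touching only the first half so the second half goes cold */
	do {
		for (i = 0; i < scan_len / sizeof(*p); i++)
			p[i] = 0x55aa;

		sleep(2);

		for (i = 0; i < scan_len / sizeof(*p); i++)
			j += p[i] >> 2;
	} while (1);

	/* Not reached: the loop above runs until the process is killed */
	munmap(p, len);
	return 0;
}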
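
To confirm from inside the test program which node the cold half ends up on, something like the helper below could be used. This is only a rough sketch and not part of the patch set: count_pages_on_node() is a hypothetical name, and it assumes the raw move_pages(2) syscall is available and permitted. Called with a NULL node list, the syscall only reports the current node of each page and does not move anything.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: count how many pages of [addr, addr + len) currently
 * sit on NUMA node `node`. Returns -1 if the query fails (for example when
 * the kernel lacks NUMA support or the syscall is not permitted).
 */
long count_pages_on_node(void *addr, size_t len, int node)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t nr_pages, i;
	void **pages;
	int *status;
	long count = -1;

	if (page_size <= 0 || len < (size_t)page_size)
		return -1;
	nr_pages = len / (size_t)page_size;

	pages = calloc(nr_pages, sizeof(*pages));
	status = calloc(nr_pages, sizeof(*status));
	if (!pages || !status)
		goto out;

	for (i = 0; i < nr_pages; i++)
		pages[i] = (char *)addr + i * (size_t)page_size;

	/* pid 0 is the calling process; nodes == NULL means "query only" */
	if (syscall(SYS_move_pages, 0, (unsigned long)nr_pages, pages,
		    NULL, status, 0) == 0) {
		count = 0;
		for (i = 0; i < nr_pages; i++)
			if (status[i] == node)
				count++;
	}
out:
	free(pages);
	free(status);
	return count;
}
```

In the test case above it could be called on the second half of the buffer, e.g. count_pages_on_node((char *)p + scan_len, len - scan_len, slow_node), where slow_node is whatever node number the AEP memory has on the machine (an assumption that must be checked locally, for instance with numactl -H).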

After setting the attrs/schemes/target_ids, start monitoring:
echo 100000 1000000 1000000 10 1000 > /sys/kernel/debug/damon/attrs
echo 4096 8192000 0 5 10 2000 5 1000 2097152 5000 0 0 0 0 0 3 2 1 > /sys/kernel/debug/damon/schemes
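
For readers not used to the debugfs format, the 18 numbers written to the schemes file can be labelled as in the sketch below. The field order is my understanding of the DAMON debugfs interface of that kernel generation (sizes, access counts, ages, action, quotas, quota weights, watermarks), so please check it against the DAMON usage document for your kernel. The struct and function names are hypothetical, and action value 5 is presumably the DAMOS_DEMOTE action added by this patch set.

```c
#include <stdio.h>

/* Hypothetical container for one line written to damon/schemes */
struct damos_fields {
	unsigned long min_sz, max_sz;			/* region size, bytes */
	unsigned int min_nr_accesses, max_nr_accesses;	/* per aggregation interval */
	unsigned int min_age, max_age;			/* in aggregation intervals */
	unsigned int action;				/* 5 here, presumably DAMOS_DEMOTE */
	unsigned long quota_ms, quota_sz;		/* time (ms) and size (bytes) quota */
	unsigned long quota_reset_interval;		/* ms */
	unsigned int weight_sz, weight_nr_accesses, weight_age;
	unsigned int wmark_metric;			/* 0 is assumed to mean "no watermark" */
	unsigned long wmark_interval, wmark_high, wmark_mid, wmark_low;
};

/* Returns the number of fields parsed; 18 means the line was complete */
int parse_damos_line(const char *line, struct damos_fields *f)
{
	return sscanf(line,
		      "%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u %u %lu %lu %lu %lu",
		      &f->min_sz, &f->max_sz,
		      &f->min_nr_accesses, &f->max_nr_accesses,
		      &f->min_age, &f->max_age, &f->action,
		      &f->quota_ms, &f->quota_sz, &f->quota_reset_interval,
		      &f->weight_sz, &f->weight_nr_accesses, &f->weight_age,
		      &f->wmark_metric, &f->wmark_interval,
		      &f->wmark_high, &f->wmark_mid, &f->wmark_low);
}

void print_damos_fields(const struct damos_fields *f)
{
	printf("size: %lu..%lu bytes\n", f->min_sz, f->max_sz);
	printf("nr_accesses: %u..%u\n", f->min_nr_accesses, f->max_nr_accesses);
	printf("age: %u..%u aggregation intervals\n", f->min_age, f->max_age);
	printf("action: %u\n", f->action);
	printf("quota: %lu ms, %lu bytes, reset every %lu ms\n",
	       f->quota_ms, f->quota_sz, f->quota_reset_interval);
	printf("weights: sz=%u nr_accesses=%u age=%u\n",
	       f->weight_sz, f->weight_nr_accesses, f->weight_age);
	printf("watermarks: metric=%u interval=%lu high=%lu mid=%lu low=%lu\n",
	       f->wmark_metric, f->wmark_interval,
	       f->wmark_high, f->wmark_mid, f->wmark_low);
}
```

Read this way, the line above targets regions of 4096 to 8192000 bytes with 0 to 5 accesses and an age of 10 to 2000 aggregation intervals. With the attrs line above (100 ms sampling, 1 s aggregation) that age range would be roughly 10 s to 2000 s, and the quota would be at most 1000 ms of work or 2097152 bytes (2 MiB) per 5000 ms.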

After a while, you can check the demotion statistics with the command below; the output shows that the demote scheme was applied and some cold pages were demoted to the slow memory (AEP) node.

cat /proc/vmstat | grep "demote"
pgdemote_direct 6881
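
If you want to track the counters from a program instead of grep, a minimal sketch could look like the following. sum_pgdemote() is a hypothetical helper, not something from the patches; it assumes the usual "name value" line format of /proc/vmstat and simply adds up every counter whose name starts with "pgdemote_".

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: sum all pgdemote_* counters found in a stream that
 * uses the "name value" per-line format of /proc/vmstat.
 */
long long sum_pgdemote(FILE *fp)
{
	char name[64];
	long long value, total = 0;

	while (fscanf(fp, "%63s %lld", name, &value) == 2) {
		if (strncmp(name, "pgdemote_", 9) == 0)
			total += value;
	}
	return total;
}
```

Calling it on fopen("/proc/vmstat", "r") once before starting the scheme and once afterwards, then taking the difference, gives the number of pages demoted in between.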