Re: [PATCH v2 0/2] Add a new scheme to support demotion on tiered memory system

From: Baolin Wang
Date: Thu Dec 23 2021 - 01:35:11 EST




On 12/23/2021 11:22 AM, Huang, Ying wrote:
Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:

On 12/23/2021 9:07 AM, Huang, Ying wrote:
Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:

Hi,

Now on a tiered memory system with different memory types, the reclaim path in
shrink_page_list() already supports demoting pages to a slow memory node instead
of discarding them. However, by that time the fast memory node's watermark is
already under pressure, so page demotion increases memory allocation latency.
A new method that demotes cold pages proactively from user space would be more
helpful.

We can rely on DAMON in user space to help monitor the cold memory on the
fast memory node, and demote the cold pages to the slow memory node proactively
to keep the fast memory node in a healthy state.

This patch set introduces a new scheme named DAMOS_DEMOTE to support this feature,
which works well in my testing. Any comments are welcome. Thanks.
As a performance optimization patch, it's better to provide some test
results.

Actually this is a functional patch, which adds a new scheme for
DAMON. I think it is too early to measure the performance for a
real workload, and more work needs to be done before DAMON can be used
on a tiered memory system (like supporting a promotion scheme later).

I don't think you provide any new functionality except the performance
influence.

Fair enough. I mean for DAMON.

And I think proactive demotion itself can show some performance benefit
already. Just like we can find the performance benefit in the proactive
reclaim patchset as below.

https://lore.kernel.org/lkml/20211019150731.16699-1-sj@xxxxxxxxxx/

Yes, I think so too. But I am afraid I cannot get any obvious performance benefit with the current linux-next branch on a tiered memory system, since the promotion patches are not there yet (yes, I can backport them into my local branch to test). Meanwhile, I may need more tuning of the demote scheme for a real workload (such as tuning min-size, max-size, min-acc, max-acc, min-age and max-age to get better performance). For now I have only taken a small step by adding demotion support to DAMON, so I do not expect an obvious performance gain yet (more research is needed). But as with proactive reclaim, I think this is the right direction for DAMON.

Anyway, other people may also be curious about the benefit, so I will do some measurements with the DAMON demote scheme on MySQL to show the performance results. Or do you have any other measurement suggestions?
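To make the tuning knobs mentioned in this thread (min-size, max-size, min-acc, max-acc, min-age, max-age) more concrete, here is a minimal sketch of how a scheme could decide whether a monitored region counts as cold. The struct and function names (region_stat, scheme_bounds, region_matches) are hypothetical and only for illustration; they are not the kernel's DAMON data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one monitored region (not a kernel type). */
struct region_stat {
    size_t size;              /* region size in bytes */
    unsigned int nr_accesses; /* observed access frequency */
    unsigned int age;         /* how long the pattern has been stable */
};

/* Hypothetical scheme bounds, mirroring the min/max knobs above. */
struct scheme_bounds {
    size_t min_size, max_size;
    unsigned int min_acc, max_acc;
    unsigned int min_age, max_age;
};

/* A region matches only if all three properties fall inside the bounds. */
static bool region_matches(const struct region_stat *r,
                           const struct scheme_bounds *s)
{
    return r->size >= s->min_size && r->size <= s->max_size &&
           r->nr_accesses >= s->min_acc && r->nr_accesses <= s->max_acc &&
           r->age >= s->min_age && r->age <= s->max_age;
}
```

With bounds such as "at least one page, zero accesses, age of at least 10", only regions that have stayed idle for a while would be picked for demotion; widening max-acc or lowering min-age makes the scheme more aggressive.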

Another question is why we shouldn't do this in user space? With DAMON,
it's possible to export cold memory regions information to the user
space, then we can use move_pages() to migrate them from DRAM to PMEM.
What's the problem of that?
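For reference, the user-space approach suggested here could look roughly like the sketch below: given a cold region reported by the monitoring side, build the list of page addresses and pass it to move_pages(). The helper names (pages_in_range, demote_region) and the slow_node argument are assumptions for illustration, and the raw syscall is used only to avoid a libnuma dependency. Moving pages of another process also requires sufficient privileges.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Same value as MPOL_MF_MOVE in <linux/mempolicy.h>. */
#ifndef MPOL_MF_MOVE
#define MPOL_MF_MOVE (1 << 1)
#endif

/* Number of pages touched by the byte range [start, start + len). */
static size_t pages_in_range(uintptr_t start, size_t len, size_t page_size)
{
    if (len == 0)
        return 0;
    uintptr_t first = start & ~(uintptr_t)(page_size - 1);
    uintptr_t last = (start + len - 1) & ~(uintptr_t)(page_size - 1);
    return (size_t)((last - first) / page_size) + 1;
}

/*
 * Ask the kernel to migrate every page of a cold region of process `pid`
 * to `slow_node` (hypothetical: the node id of the slow memory tier).
 * Returns the number of pages reported on slow_node afterwards, or -1 on error.
 */
static long demote_region(pid_t pid, uintptr_t start, size_t len, int slow_node)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t n = pages_in_range(start, len, page_size);
    if (n == 0)
        return 0;

    void **pages = calloc(n, sizeof(*pages));
    int *nodes = calloc(n, sizeof(*nodes));
    int *status = calloc(n, sizeof(*status));
    long moved = -1;

    if (pages && nodes && status) {
        uintptr_t first = start & ~(uintptr_t)(page_size - 1);
        for (size_t i = 0; i < n; i++) {
            pages[i] = (void *)(first + i * page_size);
            nodes[i] = slow_node;
        }
        /* status[i] receives the resulting node id or a negative errno. */
        if (syscall(SYS_move_pages, pid, (unsigned long)n, pages, nodes,
                    status, MPOL_MF_MOVE) >= 0) {
            moved = 0;
            for (size_t i = 0; i < n; i++)
                if (status[i] == slow_node)
                    moved++;
        }
    }
    free(pages);
    free(nodes);
    free(status);
    return moved;
}
```

A user-space daemon would call demote_region() once per cold region it learns about, which is exactly the policy loop that a kernel-side scheme would otherwise run on the user's behalf.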

IMO this is the purpose of introducing schemes for DAMON, and you can
find more in Documentation/admin-guide/mm/damon/usage.rst.

"
Schemes
-------

For usual DAMON-based data access aware memory management optimizations, users
would simply want the system to apply a memory management action to a memory
region of a specific access pattern. DAMON receives such formalized operation
schemes from the user and applies those to the target processes.
"

For proactive reclaim, we don't have a user space ABI to reclaim a page of
a process from memory to disk. So it appears necessary to add a kernel
module to do that.

But for proactive demotion, we already have a user space ABI
(move_pages()) to demote a page of a process from DRAM to PMEM. What
prevents you from doing all of this in user space?

And I found there are MADV_XXX schemes too, for which user space ABIs
are already available. TBH, I don't know why we need these given that
user space ABIs already exist. Maybe this is a question for SeongJae too.

From my understanding, schemes simplify the design for user space, so that users do not have to implement their own strategy on top of the monitoring results; more details are in patch [1]. SeongJae may have more input on the purpose.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=1f366e421c8f69583ed37b56d86e3747331869c3