Re: [RFC PATCH 0/4] DAMON based 2-tier memory management for CXL memory

From: SeongJae Park
Date: Tue Jan 16 2024 - 17:14:06 EST


Hello,

On Mon, 15 Jan 2024 13:52:48 +0900 Honggyu Kim <honggyu.kim@xxxxxx> wrote:

> There was an RFC IDEA "DAMOS-based Tiered-Memory Management" previously
> posted at [1].
>
> It says there is no implementation of the demote/promote DAMOS action
> are made. This RFC is about its implementation for physical address
> space.
>
[...]
> Evaluation Results
> ==================
>
[...]
> In summary of both results, our evaluation shows that "DAMON 2-tier"
> memory management reduces the performance slowdown compared to the
> "default" memory policy from 15~17% to 4~5% when the system runs with
> high memory pressure on its fast tier DRAM nodes.
>
> The similar evaluation was done in another machine that has 256GB of
> local DRAM and 96GB of CXL memory. The performance slowdown is reduced
> from 20~24% for "default" to 5~7% for "DAMON 2-tier".
>
> Having these DAMOS_DEMOTE and DAMOS_PROMOTE actions can make 2-tier
> memory systems run more efficiently under high memory pressures.


Thank you so much for these great patches and the nice test results above. I
believe the test setup and results make sense, and merging a revised version of
this patchset would provide real benefits to the users.

At a high level, I think it might be better to separate the DAMON-internal
changes from the DAMON-external changes.

For the DAMON-internal changes, I have no big concerns other than trivial
coding style level comments.

For the DAMON-external changes that implement demote_pages() and
promote_pages(), I'm unsure if the implementation is reusing appropriate
functions, and if those are placed in the right source file. Especially, I'm
unsure if vmscan.c is the right place for promotion code. Also I don't know if
there is a good agreement on the promotion/demotion target node decision. That
is probably because I'm not that familiar with the areas and the files, but I
feel it might also be because our discussions on the promotion and the demotion
operations still have room to mature. Because I'm not very familiar with that
part, I'd like to hear others' comments, too.

To this end, I feel the problem could be made simpler. This patchset is trying
to provide two sophisticated operations, while I think a simpler approach might
be possible. My humble simpler idea is adding a DAMOS operation for moving
pages to a given node (like the sys_move_phy_pages RFC[1]), instead of the
promote/demote operations, because general page migration can handle multiple
cases including promotion and demotion, in my humble assumption. In more
detail, users could decide which node is appropriate for promotion or demotion
and use the new DAMOS action to do both. Users would be required to decide
which nodes are the proper promotion/demotion targets, but that decision
wouldn't be that hard in my opinion.

For this, 'struct damos' would need to be updated for such argument-dependent
actions, similar to how 'struct damos_filter' has a union.
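
A rough, untested user-space sketch of that idea could look like the below.
Every name in it (damos_sketch, SKETCH_MIGRATE, SKETCH_NO_NODE, target_nid,
sketch_target_node()) is hypothetical and made up only for illustration; none
of them are the actual 'struct damos' fields or existing DAMOS actions. It
only shows how one node-targeted action with a union-held argument could
express both promotion and demotion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock-up for illustration; not the kernel's definitions. */
#define SKETCH_NO_NODE	(-1)

enum damos_action_sketch {
	SKETCH_PAGEOUT,		/* takes no argument */
	SKETCH_MIGRATE,		/* hypothetical: move pages to target_nid */
};

struct damos_sketch {
	enum damos_action_sketch action;
	/* Action-dependent argument; which member is valid depends on action. */
	union {
		int target_nid;	/* valid only for SKETCH_MIGRATE */
	};
};

/* Return the destination node of a scheme, or SKETCH_NO_NODE if none. */
static int sketch_target_node(const struct damos_sketch *s)
{
	assert(s != NULL);
	if (s->action != SKETCH_MIGRATE)
		return SKETCH_NO_NODE;
	return s->target_nid;
}
```

With this shape, a "promotion" scheme would be a SKETCH_MIGRATE scheme for hot
regions with target_nid set to a fast (e.g., DRAM) node, and a "demotion"
scheme would be the same action for cold regions with target_nid set to a slow
(e.g., CXL) node. The node numbers are the user's decision, as described above.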

In the future, we could extend the operation to promotion and demotion after
the discussion around them has matured, if required. And assuming DAMON is
extended for originating-CPU-aware access monitoring, the new DAMOS action
would also cover more use cases such as general NUMA node balancing (extending
DAMON for CPU-aware monitoring would be required), and some complex
configurations that have both CPU affinity and tiered memory. I also think it
may fit well with my RFC idea[2] for tiered memory management.

Looking forward to opinions from you and others. I admit I may be missing many
things, and I'm more than happy to be enlightened.

[1] https://lwn.net/Articles/944007/
[2] https://lore.kernel.org/damon/20231112195602.61525-1-sj@xxxxxxxxxx/


Thanks,
SJ

>
> Signed-off-by: Honggyu Kim <honggyu.kim@xxxxxx>
> Signed-off-by: Hyeongtak Ji <hyeongtak.ji@xxxxxx>
> Signed-off-by: Rakie Kim <rakie.kim@xxxxxx>
>
> [1] https://lore.kernel.org/damon/20231112195602.61525-1-sj@xxxxxxxxxx
> [2] https://github.com/skhynix/hmsdk
> [3] https://github.com/redis/redis/tree/7.0.0
> [4] https://github.com/brianfrankcooper/YCSB/tree/0.17.0
> [5] https://dl.acm.org/doi/10.1145/3503222.3507731
> [6] https://dl.acm.org/doi/10.1145/3582016.3582063
>
> Honggyu Kim (2):
> mm/vmscan: refactor reclaim_pages with reclaim_or_migrate_folios
> mm/damon: introduce DAMOS_DEMOTE action for demotion
>
> Hyeongtak Ji (2):
> mm/memory-tiers: add next_promotion_node to find promotion target
> mm/damon: introduce DAMOS_PROMOTE action for promotion
>
> include/linux/damon.h | 4 +
> include/linux/memory-tiers.h | 11 ++
> include/linux/migrate_mode.h | 1 +
> include/linux/vm_event_item.h | 1 +
> include/trace/events/migrate.h | 3 +-
> mm/damon/paddr.c | 46 ++++++-
> mm/damon/sysfs-schemes.c | 2 +
> mm/internal.h | 2 +
> mm/memory-tiers.c | 43 ++++++
> mm/vmscan.c | 231 +++++++++++++++++++++++++++++++--
> mm/vmstat.c | 1 +
> 11 files changed, 330 insertions(+), 15 deletions(-)
>
>
> base-commit: 0dd3ee31125508cd67f7e7172247f05b7fd1753a
> --
> 2.34.1