Re: [RFC 0/4] Introduce unbalance proactive reclaim

From: Huan Yang
Date: Sun Nov 12 2023 - 21:21:46 EST



On 2023/11/10 20:24, Michal Hocko wrote:
On Fri 10-11-23 11:48:49, Huan Yang wrote:
[...]
Also, when the application enters the foreground, startup may be
slower. Traces show a lot of block I/O in that case (usually 1000+ I/Os
and 200+ ms of I/O time). We usually observe very little block I/O
caused by zram refaults, and zram (read: 1698.39MB/s, write:
995.109MB/s) is usually much faster than random disk I/O (read:
48.1907MB/s, write: 49.1654MB/s). These numbers come from zram-perf,
which I modified slightly to test UFS.
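
To put the quoted throughput figures side by side, here is a small sketch that computes how much faster zram was than UFS in that test. Only the four MB/s values come from the measurements above; the function and variable names are made up for illustration.

```python
# Throughput figures quoted above, in MB/s.
zram = {"read": 1698.39, "write": 995.109}
ufs = {"read": 48.1907, "write": 49.1654}


def speedup(fast, slow):
    """Return how many times faster `fast` is than `slow`, per operation."""
    return {op: fast[op] / slow[op] for op in fast}


ratios = speedup(zram, ufs)
for op, ratio in ratios.items():
    print(f"{op}: zram is about {ratio:.1f}x faster than UFS")
```

With these numbers the ratio is roughly 35x for reads and 20x for writes.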

Therefore, if the proactive reclamation encounters many file pages,
the application may become slow when it is opened.
OK, this is interesting information. From the above it seems that
storage based IO refaults are an order of magnitude more expensive than
swap (zram in this case). That means that the memory reclaim should
_in general_ prefer anonymous memory reclaim over refaulted page cache,
right? Or is there any reason why "frozen" applications are any
different in this case?
A frozen application is one whose process is no longer active, so once
its private anonymous pages are swapped out, they will not be refaulted
until the application becomes active again.

On the contrary, page caches are usually shared. Even if the application that
first read the file is no longer active, other processes may still read the file.
Therefore, it is not reasonable to use the proactive reclamation interface to
reclaim page caches without considering memory pressure.

Given the different reclaim costs of anonymous pages and page cache,
we arrived at the idea of unbalanced reclaim described above.

Our traditional interface to control the anon vs. file balance has been
swappiness. It is not the best interface and it has its flaws but
have you experimented with the global swappiness to express that
preference? What were your observations? Please note that the behavior
We have tested this and found that in no version of the code does
swappiness take strict priority.

This means that even if we set swappiness to 0 or 200, we cannot
achieve unbalanced reclaim unless certain conditions are met during
the reclaim process. Under certain conditions, we may still reclaim
file pages by mistake, and since we usually trigger proactive reclaim
while there is sufficient memory (before LMKD triggers), this causes
more block IO.
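
As an illustration of how other conditions can override swappiness, here is a simplified toy model loosely based on the early exits in get_scan_count() in mm/vmscan.c. It is an assumption-laden sketch, not the kernel code: the field names are hypothetical, and the exact checks differ between kernel versions.

```python
from dataclasses import dataclass


@dataclass
class ReclaimState:
    # All field names are hypothetical; they only loosely mirror kernel state.
    swappiness: int                # 0..200
    can_swap: bool = True          # swap space (e.g. zram) is available
    cgroup_reclaim: bool = True    # reclaim targets a memcg, not the whole system
    priority: int = 12             # 0 means reclaim is close to OOM
    file_is_tiny: bool = False     # almost no file pages are left
    cache_trim_mode: bool = False  # plenty of inactive page cache is available


def scan_balance(s: ReclaimState) -> str:
    """Return which LRU lists get scanned: 'file', 'anon', 'equal' or 'fract'."""
    if not s.can_swap:
        return "file"    # nowhere to put anon pages
    if s.cgroup_reclaim and s.swappiness == 0:
        return "file"    # memcg reclaim with swapping disabled
    if s.priority == 0 and s.swappiness > 0:
        return "equal"   # near OOM: no balancing at all
    if s.file_is_tiny:
        return "anon"    # forced anon scan, even with swappiness=0
    if s.cache_trim_mode:
        return "file"    # forced file scan, even with swappiness=200
    return "fract"       # only here does swappiness weight the split


# swappiness=200 does not prevent file-only reclaim in this model.
print(scan_balance(ReclaimState(swappiness=200, cache_trim_mode=True)))
```

In this model swappiness only matters on the last branch; every earlier branch picks a direction without consulting its value beyond zero versus non-zero.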

This RFC provides flags that take the highest priority when setting
the reclaim tendency. Currently, they can only be set through the
proactive reclaim interface.
might be really different with different kernel versions so I would
really stress that testing with the current Linus (or akpm) tree is
necessary.
OK, thank you for the reminder.

Anyway, the more I think about that the more I am convinced that
explicit anon/file extension for the memory.reclaim interface is just a
wrong way to address a more fundamental underlying problem. That is, the
default reclaim choice over anon vs file preference should consider the
cost of the refaulting IO. This is more a property of the underlying
storage than a global characteristic. In other words, say you have
multiple storage devices, one that is network based with high latency and
another that is a fast local SSD. Reclaiming a page backed by the slower
storage is going to be more expensive to refault than the one backed by
the fast storage. So even page cache pages are not really all the same.
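
The idea of weighting reclaim by refault cost can be sketched as a toy model. Everything here is hypothetical: the Page type, the backend names and the cost values are invented for illustration and are neither measurements nor anything the kernel implements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    name: str
    backing: str  # key into REFAULT_COST_MS


# Hypothetical per-backend refault costs in milliseconds (made-up values).
REFAULT_COST_MS = {"zram": 0.05, "local_ssd": 0.2, "network": 5.0}


def reclaim_order(pages):
    """Sort pages so that the cheapest ones to refault are reclaimed first."""
    return sorted(pages, key=lambda p: REFAULT_COST_MS[p.backing])


pages = [
    Page("libfoo.so text", "network"),
    Page("app heap", "zram"),
    Page("database file", "local_ssd"),
]
for page in reclaim_order(pages):
    print(page.name, "->", page.backing)
```

Under these assumed costs, the page backed by the network storage is the last candidate, which matches the point that page cache pages are not all equally cheap to drop.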

It is quite likely that an IO cost aspect is not really easy to integrate
into the memory reclaim but it seems to me this is a better direction to
focus on for a long term solution. Our existing refaulting
infrastructure should help in that respect. Also MGLRU could fit
that purpose better than the traditional LRU based reclaim as the higher
generations could be used for more expensive pages.

Yes, your insights are very informative.

However, until the reclaim algorithm reaches that point, I think it is
reasonable to let the proactive reclaim interface express different reclaim
tendencies. This gives the policy layer more flexibility.
For example, on mobile phones, we can weigh the combined impact of refault
IO overhead and LMKD kills when choosing a reclaim tendency for the
proactive reclaim interface.

--
Thanks,
Huan Yang