Re: [RFC 0/4] Introduce unbalance proactive reclaim

From: Huan Yang
Date: Tue Nov 14 2023 - 07:37:34 EST



On 2023/11/14 18:04, Michal Hocko wrote:
> On Mon 13-11-23 09:54:55, Huan Yang wrote:
>> On 2023/11/10 20:32, Michal Hocko wrote:
>>> On Fri 10-11-23 14:21:17, Huan Yang wrote:
>>> [...]
>>> BTW: how do you know the number of pages to be reclaimed proactively in
>>> memcg proactive reclaiming based solution?
>> One point here is that we are not sure how long the frozen application
>> will stay frozen before it is opened again; it could be 10 minutes, an
>> hour, or even days. So we need to predict and try, gradually reclaiming
>> anonymous pages in proportion, preferably based on the LRU algorithm.
>> For example, if the application has been frozen for 10 minutes, reclaim
>> 5% of its anonymous pages; after 30 minutes, 25%; after an hour, 75%;
>> after a day, 100%. It is even more complicated than that, as it requires
>> adding a mechanism for predicting failure penalties.
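
As a rough illustration of the kind of user-space policy described above
(not part of this RFC): the sketch below assumes a hypothetical per-app
memcg at /sys/fs/cgroup/app_frozen, reads the "anon" counter from
memory.stat, and writes a byte count to the existing cgroup v2
memory.reclaim file. The time/percentage table is invented for the
example. Note that plain memory.reclaim cannot be told to touch only
anonymous pages, which is the gap this series tries to fill.

/*
 * Illustrative sketch only, not part of the RFC. The cgroup path and the
 * time/percentage table are invented; memory.stat ("anon") and
 * memory.reclaim are the existing cgroup v2 files.
 */
#include <stdio.h>
#include <string.h>

#define APP_CG "/sys/fs/cgroup/app_frozen"	/* hypothetical per-app memcg */

/* Read the "anon" counter (bytes) from memory.stat. */
static long long anon_bytes(void)
{
	char key[64];
	long long val;
	FILE *f = fopen(APP_CG "/memory.stat", "r");

	if (!f)
		return -1;
	while (fscanf(f, "%63s %lld", key, &val) == 2) {
		if (!strcmp(key, "anon")) {
			fclose(f);
			return val;
		}
	}
	fclose(f);
	return -1;
}

/* Example policy: minutes spent frozen -> percent of anon to reclaim. */
static int frozen_pct(long minutes)
{
	if (minutes >= 24 * 60)
		return 100;
	if (minutes >= 60)
		return 75;
	if (minutes >= 30)
		return 25;
	if (minutes >= 10)
		return 5;
	return 0;
}

int main(void)
{
	long minutes = 30;	/* e.g. how long the app has been frozen */
	long long target = anon_bytes() * frozen_pct(minutes) / 100;
	FILE *f;

	if (target <= 0)
		return 1;
	f = fopen(APP_CG "/memory.reclaim", "w");
	if (!f)
		return 1;
	/* Best effort: the kernel may reclaim less than requested. */
	fprintf(f, "%lld", target);
	return fclose(f) ? 1 : 0;
}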
> Why would you make your reclaiming decisions based on time rather than
> the actual memory demand? I can see how proactive reclaim could make
> head room for unexpected memory pressure, but applying more pressure
> just because of inactivity sounds rather dubious to me TBH. Why can't
> you simply wait for the external memory pressure (e.g. from kswapd) to
> deal with that based on the demand?
>> Because the current kswapd and direct memory reclamation are passive,
>> watermark-based forms of reclaim, and when they are triggered, the
>> smoothness of phone applications cannot be guaranteed.
> OK, so you are worried about latencies on memory usage spikes.

>> (We often observe that when the above reclamation is triggered, there
>> is a delay in application startup, usually accompanied by block I/O,
>> and some concurrency issues caused by lock design.)
> Does that mean you do not have enough head room for kswapd to keep up with
Yes, but if we set the high watermark a little higher, the power
consumption becomes very high; we usually observe kswapd running
frequently. Even when we set a low kswapd watermark, kswapd CPU usage
can still be high in some extreme scenarios (for example, when starting
a large application that needs to acquire a large amount of memory in a
short period of time). However, we will not discuss this in detail here;
the reasons are quite complex, and we have not yet sorted out a complete
understanding of them.
> the memory demand? It is really hard to discuss this without some actual
> numbers or more specifics.
>> To ensure the smoothness of application startup, we have a module in
>> Android called LMKD (formerly known as lowmemorykiller). Based on a
>> certain algorithm, LMKD detects whether application startup may be
>> delayed and proactively kills inactive applications (for example, based
>> on factors such as refault I/O and swap usage).
>>
>> However, this behavior may cause the applications we want to protect to
>> be killed, so users have to wait for them to restart when they are
>> reopened, which hurts the user experience (for example, if the user
>> wants to reopen the application interface they were working in, or
>> re-enter the order page they were viewing).
> This suggests that your LMKD doesn't pick up the right victim to kill.
> And I suspect this is a fundamental problem of those pro-active oom
Yes, but our current LMKD configuration is already very conservative,
which can cause lag in some scenarios; we will not analyze the reasons
in detail here.
> killer solutions.

>> Therefore, the above proactive reclamation interface is designed to
>> compress memory types with minimal cost for upper-layer applications,
>> based on reasonable strategies, in order to avoid triggering LMKD or
>> memory reclamation as much as possible, even if the reclaim is not
>> balanced.
> This would suggest that MADV_PAGEOUT is really what you are looking for.
Yes, I agree, especially as it avoids reclaiming shared anonymous pages.

However, I did some shallow research and found that MADV_PAGEOUT does not
reclaim pages with mapcount != 1. Our applications are usually composed of
multiple processes, and some anonymous pages are shared among them. When
the application is frozen, memory that is only shared among the processes
within the application should be released, but MADV_PAGEOUT seems
unsuitable for this scenario. (If I misunderstood anything, please correct
me.)
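
For reference, a minimal illustrative example of the interface being
discussed: madvise(MADV_PAGEOUT) (Linux 5.4+) hints the kernel to reclaim
a range, and since 5.10 a policy daemon can issue the same hint for
another process with process_madvise(). As noted above, pages still
mapped by more than one process are skipped, so the shared anonymous
pages of a multi-process application are left in place.

/*
 * Illustrative only: hint reclaim of an anonymous range from inside the
 * process with madvise(MADV_PAGEOUT). Pages still mapped by another
 * process (mapcount != 1, as discussed above) are skipped by the kernel.
 */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21		/* asm-generic value, Linux 5.4+ */
#endif

int main(void)
{
	size_t len = 64 << 20;	/* 64 MiB of private anonymous memory */
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return 1;
	memset(buf, 0x5a, len);	/* fault the pages in */

	/* Ask the kernel to reclaim (swap out) this now-cold range. */
	if (madvise(buf, len, MADV_PAGEOUT))
		perror("madvise(MADV_PAGEOUT)");

	munmap(buf, len);
	return 0;
}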

In addition, I still suspect that this approach would consume a lot of
resources on the policy (strategy) side, but it is worth studying.

Thanks.
> If you really aim at compressing a specific type of memory then tweaking
> reclaim to achieve that sounds like a shortcut, because a madvise-based
> solution is more involved. But that is not a solid justification for
> adding a new interface.
Yes, but this RFC just adds an additional configuration option to the
proactive reclaim interface, and in the reclaim path it gives priority to
requests that carry a reclaim tendency. Since that check is wrapped in
unlikely(), it should not have much impact on the existing path.
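
To illustrate that last point (this is not the actual patch; all names
below are invented): the extra tendency handling can sit behind an
unlikely() annotation, so the existing balanced scan remains the
predicted fast path and the rare case stays out of the hot path.

/* Illustrative only -- the general unlikely() pattern, not the RFC code. */
#include <stdio.h>

#define unlikely(x)	__builtin_expect(!!(x), 0)	/* as in the kernel */

enum reclaim_tendency { RECLAIM_BALANCED, RECLAIM_ANON_ONLY };	/* invented */

/* Decide how many pages of one LRU list to scan. */
static unsigned long scan_amount(unsigned long nr, enum reclaim_tendency t,
				 int lru_is_anon)
{
	if (unlikely(t != RECLAIM_BALANCED))
		return lru_is_anon ? nr : 0;	/* rare: honour the tendency */

	return nr;				/* common: unchanged behaviour */
}

int main(void)
{
	printf("%lu %lu\n", scan_amount(32, RECLAIM_BALANCED, 1),
	       scan_amount(32, RECLAIM_ANON_ONLY, 0));
	return 0;
}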

--
Thanks,
Huan Yang