Re: [RFC 0/4] Introduce unbalance proactive reclaim

From: Huang, Ying
Date: Mon Nov 13 2023 - 03:07:31 EST


Huan Yang <link@xxxxxxxx> writes:

> 在 2023/11/13 14:10, Huang, Ying 写道:
>> Huan Yang <link@xxxxxxxx> writes:
>>
>>> 在 2023/11/10 20:24, Michal Hocko 写道:
>>>> On Fri 10-11-23 11:48:49, Huan Yang wrote:
>>>> [...]
>>>>> Also, When the application enters the foreground, the startup speed
>>>>> may be slower. Also trace show that here are a lot of block I/O.
>>>>> (usually 1000+ IO count and 200+ms IO Time) We usually observe very
>>>>> little block I/O caused by zram refault.(read: 1698.39MB/s, write:
>>>>> 995.109MB/s), usually, it is faster than random disk reads.(read:
>>>>> 48.1907MB/s write: 49.1654MB/s). This test by zram-perf and I change a
>>>>> little to test UFS.
>>>>>
>>>>> Therefore, if the proactive reclamation encounters many file pages,
>>>>> the application may become slow when it is opened.
>>>> OK, this is an interesting information. From the above it seems that
>>>> storage based IO refaults are order of magnitude more expensive than
>>>> swap (zram in this case). That means that the memory reclaim should
>>>> _in general_ prefer anonymous memory reclaim over refaulted page cache,
>>>> right? Or is there any reason why "frozen" applications are any
>>>> different in this case?
>>> Frozen applications mean that the application process is no longer active,
>>> so once its private anonymous page data is swapped out, the anonymous
>>> pages will not be refaulted until the application becomes active again.
>>>
>>> On the contrary, page caches are usually shared. Even if the
>>> application that
>>> first read the file is no longer active, other processes may still
>>> read the file.
>>> Therefore, it is not reasonable to use the proactive reclamation
>>> interface to
>>> reclaim page caches without considering memory pressure.
>> No. Not all page caches are shared. For example, the page caches used
>> for use-once streaming IO. And, they should be reclaimed firstly.
> Yes, but this part is done very well in MGLRU and does not require our
> intervention.
> Moreover, the reclaim speed of clean files is very fast, but compared to it,
> the reclaim speed of anonymous pages is a bit slower.
>>
>> So, your solution may work good for your specific use cases, but it's
> Yes, this approach is not universal.
>> not a general solution. Per my understanding, you want to reclaim only
>> private pages to avoid impact the performance of other applications.
>> Privately mapped anonymous pages is easy to be identified (And I suggest
>> that you can find a way to avoid reclaim shared mapped anonymous pages).
> Yes, it is not good to reclaim shared anonymous pages, and it needs to be
> identified. In the future, we will consider how to filter them.
> Thanks.
>> There's some heuristics to identify use-once page caches in reclaiming
>> code. Why doesn't it work for your situation?
> As mentioned above, the default reclaim algorithm is suitable for recycling
> file pages, but we do not need to intervene in it.
> Direct reclaim or kswapd of these use-once file pages is very fast and will
> not cause lag or other effects.
> Our overall goal is to actively and reasonably compress unused anonymous
> pages based on certain strategies, in order to increase available memory to
> a certain extent, avoid lag, and prevent applications from being killed.
> Therefore, using the proactive reclaim interface, combined with LRU
> algorithm
> and reclaim tendencies, is a good way to achieve our goal.

If so, why can't you just use the proactive reclaim with some large
enough swappiness? That will reclaim use-once page caches and compress
anonymous pages. So, more applications can be kept in memory before
passive reclaiming or killing background applications?

--
Best Regards,
Huang, Ying