Re: [PATCH mm-unstable] mm: multi-gen LRU: don't spin during memcg release

From: Yu Zhao
Date: Mon Aug 14 2023 - 12:00:42 EST


On Mon, Aug 14, 2023 at 9:16 AM T.J. Mercier <tjmercier@xxxxxxxxxx> wrote:
>
> When a memcg is in the process of being released, mem_cgroup_tryget will
> fail because its reference count has already reached 0. This can happen
> during reclaim if the memcg has already been offlined, and we reclaim
> all remaining pages attributed to the offlined memcg. shrink_many
> attempts to skip the empty memcg in this case, and continue reclaiming
> from the remaining memcgs in the old generation. If there is only one
> memcg remaining, or if all remaining memcgs are in the process of being
> released, then shrink_many will spin until all memcgs have finished
> being released. The release occurs through a workqueue, so it can take
> a while before kswapd is able to make any further progress.
>
> This fix results in reductions in kswapd activity and direct reclaim in
> a test where 28 apps (working set size > total memory) are repeatedly
> launched in a random sequence:
>
>                                    A      B   delta  ratio(%)
> allocstall_movable              5962   3539   -2423    -40.64
> allocstall_normal               2661   2417    -244     -9.17
> kswapd_high_wmark_hit_quickly  53152   7594  -45558    -85.71
> pageoutrun                     57365  11750  -45615    -79.52
>
> Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: T.J. Mercier <tjmercier@xxxxxxxxxx>

Acked-by: Yu Zhao <yuzhao@xxxxxxxxxx>
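
For anyone following along, the spin described in the changelog can be
illustrated with a small user-space toy model. This is only a sketch under
stated assumptions, not the kernel code and not the patch itself: struct
toy_memcg, toy_tryget(), toy_put(), toy_walk() and the release_after
parameter are all hypothetical names invented here. The model assumes a
dying memcg (refcount 0) stays linked in the old generation until deferred
release work runs, which is represented by a pass count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg; NOT the kernel's struct mem_cgroup. */
struct toy_memcg {
	int refcount;	/* 0 means release has begun, so tryget must fail */
	bool on_list;	/* still linked in the (toy) old generation */
};

/* Models the idea behind mem_cgroup_tryget: fail once the count hit 0. */
static bool toy_tryget(struct toy_memcg *m)
{
	if (m->refcount == 0)
		return false;
	m->refcount++;
	return true;
}

static void toy_put(struct toy_memcg *m)
{
	m->refcount--;
}

/*
 * Walk the toy old generation until nothing is linked, skipping every
 * memcg whose tryget fails. release_after is an assumption of this model:
 * dying memcgs are unlinked only once that many passes have been made,
 * standing in for release work that runs later from a workqueue.
 * Returns the number of passes the walker needed.
 */
static int toy_walk(struct toy_memcg *v, size_t n, int release_after)
{
	int passes = 0;

	assert(release_after >= 1);

	for (;;) {
		size_t linked = 0;

		passes++;
		for (size_t i = 0; i < n; i++) {
			if (!v[i].on_list)
				continue;
			if (!toy_tryget(&v[i])) {
				/* Dying memcg: skipped, but still linked. */
				if (passes >= release_after)
					v[i].on_list = false;
				else
					linked++;
				continue;
			}
			/* Live memcg: pretend reclaim moved it off the list. */
			toy_put(&v[i]);
			v[i].on_list = false;
		}
		if (linked == 0)
			return passes;
	}
}
```

In this model, calling toy_walk() on a list holding only dying memcgs makes
exactly release_after passes, all but the last doing no useful work, while
the same list with a prompt release (release_after of 1) finishes in a
single pass. That wasted walking is the toy analogue of kswapd failing to
make progress until the release has finished.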