Re: [PATCH] mm: ratelimit stat flush from workingset shrinker

From: Yu Zhao
Date: Thu Dec 28 2023 - 03:02:40 EST


On Thu, Dec 28, 2023 at 12:31 AM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
>
> One of our internal workloads regressed on a newer upstream kernel

Not really internal -- it's Postgres 14 + sysbench OLTP.

> and on
> further investigation, it seems like the cause is the always-synchronous
> rstat flush in count_shadow_nodes() added by commit f82e6bf9bb9b
> ("mm: memcg: use rstat for non-hierarchical stats"). On further
> inspection it seems like we don't really need accurate stats in this
> function as it was already approximating the amount of appropriate
> shadow entries to keep for maintaining the refault information. Since
> there is already a 2 sec periodic rstat flush, we don't need exact stats
> here. Let's ratelimit the rstat flush in this code path.
>
> Fixes: f82e6bf9bb9b ("mm: memcg: use rstat for non-hierarchical stats")
> Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
> ---
> mm/workingset.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/workingset.c b/mm/workingset.c
> index 2a2a34234df9..226012974328 100644
> --- a/mm/workingset.c
> +++ b/mm/workingset.c
> @@ -680,7 +680,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
>  		struct lruvec *lruvec;
>  		int i;
>
> -		mem_cgroup_flush_stats(sc->memcg);
> +		mem_cgroup_flush_stats_ratelimited(sc->memcg);
>  		lruvec = mem_cgroup_lruvec(sc->memcg, NODE_DATA(sc->nid));
>  		for (pages = 0, i = 0; i < NR_LRU_LISTS; i++)
>  			pages += lruvec_page_state_local(lruvec,

LGTM.
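
For readers following along, here is a rough userspace sketch of the idea behind a ratelimited flush: skip the expensive synchronous flush unless the periodic flusher appears to have fallen behind. This is not the kernel code. The struct, the function names, and the "two periods late" threshold are assumptions made for illustration; only the 2 sec period comes from the commit message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2 sec periodic flush, as mentioned in the commit message. */
#define FLUSH_PERIOD_MS 2000

/* Hypothetical stand-in for per-cgroup stats; not a kernel structure. */
struct stats {
	long pending;            /* updates not yet folded into 'flushed' */
	long flushed;            /* value that readers see */
	uint64_t last_flush_ms;  /* time of the most recent flush */
	unsigned int flush_count;
};

/* Unconditional (expensive) flush: fold pending updates in. */
static void stats_flush(struct stats *s, uint64_t now_ms)
{
	s->flushed += s->pending;
	s->pending = 0;
	s->last_flush_ms = now_ms;
	s->flush_count++;
}

/*
 * Ratelimited flush: only flush when the last flush is more than two
 * periods old (threshold assumed for this sketch), i.e. the periodic
 * flusher looks late. Returns true if a flush actually happened.
 */
static bool stats_flush_ratelimited(struct stats *s, uint64_t now_ms)
{
	if (now_ms - s->last_flush_ms <= 2 * FLUSH_PERIOD_MS)
		return false;
	stats_flush(s, now_ms);
	return true;
}
```

A caller that only needs an approximation, like the shadow node count here, would call stats_flush_ratelimited() and then read s->flushed, accepting that the value may be a few seconds stale.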