Re: [PATCH] mm: vmpressure: don't count userspace-induced reclaim as memory pressure

From: Andrew Morton
Date: Wed Jun 22 2022 - 20:16:59 EST


On Thu, 23 Jun 2022 00:05:30 +0000 Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:

> Commit e22c6ed90aa9 ("mm: memcontrol: don't count limit-setting reclaim
> as memory pressure") made sure that memory reclaim that is induced by
> userspace (limit-setting, proactive reclaim, ...) is not counted as
> memory pressure for the purposes of psi.
>
> Instead of counting psi inside try_to_free_mem_cgroup_pages(), callers
> from try_charge() and reclaim_high() wrap the call to
> try_to_free_mem_cgroup_pages() with psi handlers.
>
> However, vmpressure is still counted in these cases where reclaim is
> directly induced by userspace. This patch makes sure vmpressure is not
> counted in those operations, in the same way as psi. Since vmpressure
> calls need to happen deeper within the reclaim path, the same approach
> could not be followed. Hence, a new "controlled" flag is added to struct
> scan_control to flag a reclaim operation that is controlled by
> userspace. This flag is set by limit-setting and proactive reclaim
> operations, and is used to count vmpressure correctly.
>
> To prevent future divergence of psi and vmpressure, commit e22c6ed90aa9
> ("mm: memcontrol: don't count limit-setting reclaim as memory pressure")
> is effectively reverted and the same flag is used to control psi as
> well.
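
For readers following along, the idea in the description above can be modeled in a few lines of userspace C. This is only a sketch, not kernel code: the MODEL_* option bits, struct pressure_model and model_reclaim() are hypothetical names invented for illustration, and the bit values are assumptions. It shows a single "controlled" bit gating both vmpressure and psi accounting, which is the divergence-prevention argument made in the last quoted paragraph.

```c
#include <stdbool.h>

/*
 * Hypothetical option bits, loosely named after the MEMCG_RECLAIM_* flags
 * in the patch. The values are assumptions made for this model only.
 */
#define MODEL_RECLAIM_MAY_SWAP   (1u << 0)
#define MODEL_RECLAIM_CONTROLLED (1u << 1)

/* Counters standing in for vmpressure events and psi memstall sections. */
struct pressure_model {
	unsigned long vmpressure_events;
	unsigned long psi_stalls;
};

/*
 * Model of one reclaim pass. Reclaim that userspace asked for
 * (limit-setting, proactive reclaim) carries the CONTROLLED bit and is
 * not accounted as pressure; everything else is accounted in both places.
 */
static unsigned long model_reclaim(struct pressure_model *pm,
				   unsigned long nr_pages,
				   unsigned int reclaim_options)
{
	bool controlled = (reclaim_options & MODEL_RECLAIM_CONTROLLED) != 0;

	if (!controlled) {
		pm->vmpressure_events++;
		pm->psi_stalls++;
	}

	/* Pretend every requested page was reclaimed. */
	return nr_pages;
}
```

Because both counters hang off the same test, a caller that sets the bit cannot end up suppressing one form of accounting while still triggering the other.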

I'll await reviewer input on this, but I can always do trivia!

> @@ -3502,6 +3497,8 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
> static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
> {
> int nr_retries = MAX_RECLAIM_RETRIES;
> + unsigned int reclaim_options = MEMCG_RECLAIM_CONTROLLED |
> + MEMCG_RECLAIM_MAY_SWAP;

If it doesn't fit, it's nicer to do

	unsigned int reclaim_options;
	...

	reclaim_options = MEMCG_RECLAIM_CONTROLLED | MEMCG_RECLAIM_MAY_SWAP;

(several places)

> @@ -3751,6 +3757,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> .may_writepage = !laptop_mode,
> .may_unmap = 1,
> .may_swap = 1,
> + .controlled = 0,
> };

Let's just skip all these initializations to zero and let the compiler
take care of it.
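
The reason this is safe: in C, members left out of a designated initializer are initialized as if the object had static storage duration, so an omitted bitfield ends up 0. A minimal userspace sketch follows, assuming a cut-down stand-in for struct scan_control; the struct name, its layout and the helper functions are hypothetical, and only the field names come from the hunks quoted here.

```c
#include <stdbool.h>

/*
 * Hypothetical, cut-down stand-in for struct scan_control. Field names
 * follow the quoted hunks; the layout is an assumption for this example.
 */
struct scan_control_model {
	unsigned long nr_to_reclaim;
	unsigned int may_unmap:1;
	unsigned int may_swap:1;
	unsigned int controlled:1;
};

/* Build an initializer that deliberately leaves .controlled out. */
static struct scan_control_model make_sc(void)
{
	struct scan_control_model sc = {
		.nr_to_reclaim = 32,
		.may_unmap = 1,
		.may_swap = 1,
		/* .controlled is omitted, so the language rules set it to 0 */
	};

	return sc;
}

/* True when the omitted member came out as zero. */
static bool controlled_defaults_to_zero(void)
{
	return make_sc().controlled == 0;
}
```

So an explicit ".controlled = 0" in each initializer adds lines without changing the resulting object.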

> @@ -4095,6 +4112,7 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx)
> .gfp_mask = GFP_KERNEL,
> .order = order,
> .may_unmap = 1,
> + .controlled = 0,
> };
>
> set_task_reclaim_state(current, &sc.reclaim_state);
> @@ -4555,6 +4573,7 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
> .may_unmap = 1,
> .may_swap = 1,
> .hibernation_mode = 1,
> + .controlled = 0,
> };
> struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
> unsigned long nr_reclaimed;
> @@ -4707,6 +4726,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
> .may_unmap = !!(node_reclaim_mode & RECLAIM_UNMAP),
> .may_swap = 1,
> .reclaim_idx = gfp_zone(gfp_mask),
> + .controlled = 0,
> };
> unsigned long pflags;