Re: [RFC PATCH v1 0/2] Ignore non-LRU-based reclaim in memcg reclaim

From: Dave Chinner
Date: Thu Feb 02 2023 - 19:01:20 EST


On Thu, Feb 02, 2023 at 11:32:27PM +0000, Yosry Ahmed wrote:
> Reclaimed pages through other means than LRU-based reclaim are tracked
> through reclaim_state in struct scan_control, which is stashed in
> current task_struct. These pages are added to the number of reclaimed
> pages through LRUs. For memcg reclaim, these pages generally cannot be
> linked to the memcg under reclaim and can cause an overestimated count
> of reclaimed pages. This short series tries to address that.

Can you explain why memcg-specific reclaim is calling shrinkers that
are not marked with SHRINKER_MEMCG_AWARE?

i.e. only objects that are directly associated with memcg-aware
shrinkers should be accounted to the memcg, right? If a cache is
global (e.g. the xfs buffer cache) then its shrinker isn't marked
with SHRINKER_MEMCG_AWARE and so should only be called for root
memcg (i.e. global) reclaim contexts.

So if you are having accounting problems caused by memcg-specific
reclaim on global caches freeing non-memcg-accounted memory, isn't
the problem the way the shrinkers are being called?
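
As an illustration of the gating rule described above, here is a
minimal userspace sketch. It is not kernel code: the model_* names
and the flag value are hypothetical, and only the rule itself
(memcg-aware shrinkers may run for any memcg, global ones only for
root reclaim) is taken from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for SHRINKER_MEMCG_AWARE; the value is assumed. */
#define MODEL_SHRINKER_MEMCG_AWARE (1u << 0)

/* Hypothetical model of a shrinker: just a name and its flags. */
struct model_shrinker {
	const char *name;
	unsigned int flags;
};

/* Hypothetical model of the memcg that reclaim is targeting. */
struct model_memcg {
	bool is_root;
};

/*
 * Decide whether a shrinker should be invoked for a given reclaim target.
 * Memcg-aware shrinkers are eligible for every memcg; shrinkers for
 * global caches are eligible only when the target is the root memcg.
 */
static bool model_should_call_shrinker(const struct model_shrinker *s,
				       const struct model_memcg *target)
{
	assert(s != NULL && target != NULL);

	if (s->flags & MODEL_SHRINKER_MEMCG_AWARE)
		return true;

	return target->is_root;
}
```

Under this model a global cache is never shrunk on behalf of a
non-root memcg, so it cannot inflate that memcg's reclaimed count.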

> Patch 1 is just refactoring updating reclaim_state into a helper
> function, and renames reclaimed_slab to just reclaimed, with a comment
> describing its true purpose.
>
> Patch 2 ignores pages reclaimed outside of LRU reclaim in memcg reclaim.
>
> The original draft was a little bit different. It also kept track of
> uncharged objcg pages, and reported them only in memcg reclaim and only
> if the uncharged memcg is in the subtree of the memcg under reclaim.
> This was an attempt to make reporting of memcg reclaim even more
> accurate, but was dropped due to questionable complexity vs benefit
> tradeoff. It can be revived if there is interest.
>
> Yosry Ahmed (2):
> mm: vmscan: refactor updating reclaimed pages in reclaim_state
> mm: vmscan: ignore non-LRU-based reclaim in memcg reclaim
>
> fs/inode.c | 3 +--

Inodes and inode mapping pages are directly charged to the memcg
that allocated them and the shrinker is correctly marked as
SHRINKER_MEMCG_AWARE. Freeing the pages attached to the inode will
account them correctly to the related memcg, regardless of which
memcg is triggering the reclaim. Hence I'm not sure that skipping
the accounting of the reclaimed memory is even correct in this case;
I think the code should still be accounting for all pages that
belong to the memcg being scanned that are reclaimed, not ignoring
them altogether...
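
The accounting suggested above could be sketched roughly as follows.
This is a userspace model under stated assumptions, not the patch or
existing kernel code: struct acct_memcg and acct_add_reclaimed() are
hypothetical names, and ownership is modelled as a plain pointer
comparison against the memcg being scanned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a memcg; only its identity matters here. */
struct acct_memcg {
	const char *name;
};

/*
 * Return the updated reclaimed-page total for the memcg being scanned.
 * Pages freed outside LRU reclaim are counted only when they were
 * charged to that memcg; pages owned by any other memcg are left out
 * of the total rather than dropping all such pages unconditionally.
 */
static unsigned long acct_add_reclaimed(const struct acct_memcg *scanned,
					const struct acct_memcg *owner,
					unsigned long nr_pages,
					unsigned long reclaimed)
{
	assert(scanned != NULL && owner != NULL);

	if (owner == scanned)
		reclaimed += nr_pages;

	return reclaimed;
}
```

With this shape, pages freed from an inode charged to the scanned
memcg still count towards its reclaim progress, while pages charged
elsewhere do not.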

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx