Re: [PATCH] memcg: add per-memcg total kernel memory stat

From: Shakeel Butt
Date: Tue Feb 01 2022 - 15:26:29 EST


On Tue, Feb 1, 2022 at 12:08 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>
> Currently memcg stats show several types of kernel memory:
> kernel stack, page tables, sock, vmalloc, and slab.
> However, there are other allocations with __GFP_ACCOUNT
> (or supersets such as GFP_KERNEL_ACCOUNT) that are not accounted
> in any of those stats. A few examples are:
> - various kvm allocations (e.g. allocated pages to create vcpus)
> - io_uring
> - tmp_page in pipes during pipe_write()
> - bpf ringbuffers
> - unix sockets
>
> Keeping track of the total kernel memory is essential for the ease of
> migration from cgroup v1 to v2 as there are large discrepancies between
> v1's kmem.usage_in_bytes and the sum of the available kernel memory stats
> in v2. Adding separate memcg stats for all __GFP_ACCOUNT kernel
> allocations is an impractical maintenance burden as there are a lot of those
> all over the kernel code, with more use cases likely to show up in the
> future.
>
> Therefore, add a "kernel" memcg stat that is analogous to the kmem
> page counter, with added benefits such as using the rstat infrastructure,
> which aggregates stats more efficiently. Additionally, this provides a
> lighter alternative in case the legacy kmem is deprecated in the future.
>
> Signed-off-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>

Thanks Yosry. Just to emphasize further, in our gradual migration to
v2 (exposing v2 interfaces in v1 and removing v1-only interfaces), the
difference between kernel memory from v1 and v2 is very prominent for
some workloads. This patch will definitely ease the v2 migration.

Acked-by: Shakeel Butt <shakeelb@xxxxxxxxxx>