Re: [PATCH v3 13/35] lib: add allocation tagging support for memory allocation profiling

From: Vlastimil Babka
Date: Fri Feb 16 2024 - 03:58:15 EST


On 2/12/24 22:38, Suren Baghdasaryan wrote:
> Introduce CONFIG_MEM_ALLOC_PROFILING which provides definitions to easily
> instrument memory allocators. It registers an "alloc_tags" codetag type
> with /proc/allocinfo interface to output allocation tag information when
> the feature is enabled.
> CONFIG_MEM_ALLOC_PROFILING_DEBUG is provided for debugging the memory
> allocation profiling instrumentation.
> Memory allocation profiling can be enabled or disabled at runtime using
> /proc/sys/vm/mem_profiling sysctl when CONFIG_MEM_ALLOC_PROFILING_DEBUG=n.
> CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT enables memory allocation
> profiling by default.
>
> Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> Co-developed-by: Kent Overstreet <kent.overstreet@xxxxxxxxx>
> Signed-off-by: Kent Overstreet <kent.overstreet@xxxxxxxxx>
> ---
> Documentation/admin-guide/sysctl/vm.rst | 16 +++
> Documentation/filesystems/proc.rst | 28 +++++
> include/asm-generic/codetag.lds.h | 14 +++
> include/asm-generic/vmlinux.lds.h | 3 +
> include/linux/alloc_tag.h | 133 ++++++++++++++++++++
> include/linux/sched.h | 24 ++++
> lib/Kconfig.debug | 25 ++++
> lib/Makefile | 2 +
> lib/alloc_tag.c | 158 ++++++++++++++++++++++++
> scripts/module.lds.S | 7 ++
> 10 files changed, 410 insertions(+)
> create mode 100644 include/asm-generic/codetag.lds.h
> create mode 100644 include/linux/alloc_tag.h
> create mode 100644 lib/alloc_tag.c
>
> diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> index c59889de122b..a214719492ea 100644
> --- a/Documentation/admin-guide/sysctl/vm.rst
> +++ b/Documentation/admin-guide/sysctl/vm.rst
> @@ -43,6 +43,7 @@ Currently, these files are in /proc/sys/vm:
> - legacy_va_layout
> - lowmem_reserve_ratio
> - max_map_count
> +- mem_profiling (only if CONFIG_MEM_ALLOC_PROFILING=y)
> - memory_failure_early_kill
> - memory_failure_recovery
> - min_free_kbytes
> @@ -425,6 +426,21 @@ e.g., up to one or two maps per allocation.
> The default value is 65530.
>
>
> +mem_profiling
> +==============
> +
> +Enable memory profiling (when CONFIG_MEM_ALLOC_PROFILING=y)
> +
> +1: Enable memory profiling.
> +
> +0: Disabld memory profiling.

Disable

..

> +allocinfo
> +~~~~~~~
> +
> +Provides information about memory allocations at all locations in the code
> +base. Each allocation in the code is identified by its source file, line
> +number, module and the function calling the allocation. The number of bytes
> +allocated at each location is reported.

See, it even says "number of bytes" :)
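
To make the difference concrete, here is a minimal userspace sketch of what a consumer has to do if sizes are printed with unit suffixes, as in the example output quoted below, rather than as plain bytes. The helper name size_to_bytes() is hypothetical and not part of the patch. It assumes only the B/KiB/MiB/GiB suffixes, and the result is necessarily approximate because the printed value was already rounded.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not from the patch: convert a size string such as
 * "153MiB" or "640KiB" into an approximate byte count.
 * Returns 0 on success, -1 on a malformed value or unknown suffix.
 */
static int size_to_bytes(const char *s, unsigned long long *bytes)
{
	char *end;
	double val = strtod(s, &end);
	unsigned long long mult;

	/* Reject input with no leading number, or a negative one. */
	if (end == s || val < 0)
		return -1;

	/* Map the suffix to a multiplier; no suffix means plain bytes. */
	if (*end == '\0' || strcmp(end, "B") == 0)
		mult = 1;
	else if (strcmp(end, "KiB") == 0)
		mult = 1ULL << 10;
	else if (strcmp(end, "MiB") == 0)
		mult = 1ULL << 20;
	else if (strcmp(end, "GiB") == 0)
		mult = 1ULL << 30;
	else
		return -1;

	/* Fractional inputs (e.g. "6.08MiB") lose precision here. */
	*bytes = (unsigned long long)(val * (double)mult);
	return 0;
}
```

With plain byte counts none of this is needed; a single strtoull() call would do and no precision is lost.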

> +
> +Example output.
> +
> +::
> +
> + > cat /proc/allocinfo
> +
> + 153MiB mm/slub.c:1826 module:slub func:alloc_slab_page

Is "module" meant in the usual kernel module sense? In that case, IIRC it's
more common to annotate things as e.g. [xfs] when it's really a module, and
with nothing when it's built in, such as slub. Is that "slub" simply derived
from "mm/slub.c"? Then it's just redundant?
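
To illustrate the convention I mean, here is a minimal userspace sketch. The struct tag_info and format_tag() names are hypothetical and are not taken from the patch. It assumes the module name is available as a string that is NULL for built-in code. The name is printed in brackets only for a real module, and nothing is printed otherwise.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a code tag; not the struct from the patch. */
struct tag_info {
	const char *filename;
	int lineno;
	const char *function;
	const char *modname;	/* NULL when the code is built in */
};

/*
 * Format one output line into buf. " [modname]" is appended only when
 * the tag belongs to a loadable module. Returns the snprintf()-style
 * length, which may exceed len if the output was truncated.
 */
static int format_tag(char *buf, size_t len, const struct tag_info *t)
{
	int n = snprintf(buf, len, "%s:%d func:%s",
			 t->filename, t->lineno, t->function);

	/* Stop on error or truncation; there is no room for a suffix. */
	if (n < 0 || (size_t)n >= len)
		return n;

	if (t->modname)
		n += snprintf(buf + n, len - n, " [%s]", t->modname);

	return n;
}
```

With this, a tag in mm/slub.c would print as "mm/slub.c:1826 func:alloc_slab_page", while one in xfs built as a module would print as "fs/xfs/kmem.c:20 func:kmem_alloc [xfs]".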

> + 6.08MiB mm/slab_common.c:950 module:slab_common func:_kmalloc_order
> + 5.09MiB mm/memcontrol.c:2814 module:memcontrol func:alloc_slab_obj_exts
> + 4.54MiB mm/page_alloc.c:5777 module:page_alloc func:alloc_pages_exact
> + 1.32MiB include/asm-generic/pgalloc.h:63 module:pgtable func:__pte_alloc_one
> + 1.16MiB fs/xfs/xfs_log_priv.h:700 module:xfs func:xlog_kvmalloc
> + 1.00MiB mm/swap_cgroup.c:48 module:swap_cgroup func:swap_cgroup_prepare
> + 734KiB fs/xfs/kmem.c:20 module:xfs func:kmem_alloc
> + 640KiB kernel/rcu/tree.c:3184 module:tree func:fill_page_cache_func
> + 640KiB drivers/char/virtio_console.c:452 module:virtio_console func:alloc_buf
> + ...
> +
> +
> meminfo

..

> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 0be2d00c3696..78d258ca508f 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -972,6 +972,31 @@ config CODE_TAGGING
> bool
> select KALLSYMS
>
> +config MEM_ALLOC_PROFILING
> + bool "Enable memory allocation profiling"
> + default n
> + depends on PROC_FS
> + depends on !DEBUG_FORCE_WEAK_PER_CPU
> + select CODE_TAGGING
> + help
> + Track allocation source code and record total allocation size
> + initiated at that code location. The mechanism can be used to track
> + memory leaks with a low performance and memory impact.
> +
> +config MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
> + bool "Enable memory allocation profiling by default"
> + default y

I'd go with default n, as that's what I'd select for a general distro.

> + depends on MEM_ALLOC_PROFILING
> +
> +config MEM_ALLOC_PROFILING_DEBUG
> + bool "Memory allocation profiler debugging"
> + default n
> + depends on MEM_ALLOC_PROFILING
> + select MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
> + help
> + Adds warnings with helpful error messages for memory allocation
> + profiling.
> +