Re: [PATCH v7 0/4] page_owner: print stacks and their outstanding allocations

From: Andrew Morton
Date: Thu Feb 08 2024 - 19:28:25 EST

On Fri, 9 Feb 2024 00:45:35 +0100 Oscar Salvador <osalvador@xxxxxxx> wrote:

> page_owner is a great debug functionality tool that lets us know
> about all pages that have been allocated/freed and their specific
> stacktrace.
> This comes very handy when debugging memory leaks, since with
> some scripting we can see the outstanding allocations, which might point
> to a memory leak.
> In my experience, that is one of the most useful cases, but it can get
> really tedious to screen through all pages and try to reconstruct the
> stack <-> allocated/freed relationship, becoming most of the time a
> daunting and slow process when we have tons of allocation/free operations.
> This patchset aims to ease that by adding a new functionality into
> page_owner.
> This functionality creates a new read-only file called "page_owner_stacks",

The full path would be appreciated.

> which prints out all the stacks followed by their outstanding number
> of allocations (being that the times the stacktrace has allocated
> but not freed yet).
> This gives us a clear and a quick overview of stacks <-> allocated/free.
> We take advantage of the new refcount_f field that stack_record struct
> gained, and increment/decrement the stack refcount on every
> __set_page_owner() (alloc operation) and __reset_page_owner (free operation)
> call.
> Unfortunately, we cannot use the new stackdepot api
> STACK_DEPOT_FLAG_{GET,PUT} because it does not fulfill page_owner needs,
> meaning we would have to special case things, at which point
> makes more sense for page_owner to do its own {dec,inc}rementing
> of the stacks.
> E.g: Using STACK_DEPOT_FLAG_PUT, once the refcount reaches 0,
> such stack gets evicted, so page_owner would lose information.
> This patch also creates a new file called 'page_owner_threshold'.
> By writing a value to it, the stacks which refcount is below such
> value will be filtered out.
> In order to better exercise the path in stack_depot_get_next_stack(),
> I artificially filled the buckets with more than one stack, making sure
> I was getting all of then when reading from it.
> On a side note, stack_depot_get_next_stack() could be somehow reconstructed
> to be in page_owner code, but we would have to move stack_table
> into the header, so page_owner can access it.
> I can do that if that's preferred, so stackdepot.c would not get "poluted".
> A PoC can be found below:
> # cat /sys/kernel/debug/page_owner_stacks > page_owner_full_stacks.txt

Oh, there it is. I wonder why we didn't use /sys/kernel/mm/

Would a new /sys/kernel/debug/page_owner_something/ be a good idea? We
might add more things later. Then it can be
/sys/kernel/debug/page_owner/ would be nice, but it's too late for

> Oscar Salvador (4):
> lib/stackdepot: Move stack_record struct definition into the header
> mm,page_owner: Implement the tracking of the stacks count
> mm,page_owner: Display all stacks and their count
> mm,page_owner: Filter out stacks by a threshold
> include/linux/stackdepot.h | 72 ++++++++++++++++++++
> lib/stackdepot.c | 97 ++++++++++++++------------
> mm/page_owner.c | 136 +++++++++++++++++++++++++++++++++++++
> 3 files changed, 262 insertions(+), 43 deletions(-)