[PATCH 00/15] stackdepot: allow evicting stack traces

From: andrey . konovalov
Date: Tue Aug 29 2023 - 13:12:43 EST


From: Andrey Konovalov <andreyknvl@xxxxxxxxxx>

Currently, the stack depot grows indefinitely until it reaches its
capacity. Once that happens, the stack depot stops saving new stack
traces.

This creates a problem for using the stack depot in in-field testing and
in production.

For such uses, an ideal stack trace storage should:

1. Allow saving fresh stack traces on systems with a large uptime while
limiting the amount of memory used to store the traces;
2. Have a low performance impact.

Implementing #1 in the stack depot is impossible with the current
keep-forever approach. This series aims to address that. Issue #2 is
left to be addressed in a future series.

This series changes the stack depot implementation to allow evicting
unneeded stack traces from the stack depot. The users of the stack depot
can do that via a new stack_depot_evict API.

Internal changes to the stack depot code include:

1. Storing stack traces in 32-frame-sized slots (vs precisely-sized slots
in the current implementation);
2. Keeping available slots in a freelist (vs keeping an offset to the next
free slot);
3. Using a read/write lock for synchronization (vs a lock-free approach
combined with a spinlock).

This series also integrates the eviction functionality in the tag-based
KASAN modes. (I will investigate integrating it into the Generic mode as
well in the following iterations of this series.)

Despite wasting some space on rounding up the size of each stack record
to 32 frames, with this change, the tag-based KASAN modes end up
consuming ~5% less memory in stack depot during boot (with the default
stack ring size of 32k entries). The reason for this is the eviction of
irrelevant stack traces from the stack depot, which frees up space for
other stack traces.

For other tools that heavily rely on the stack depot, like Generic KASAN
and KMSAN, this change leads to the stack depot capacity being reached
sooner than before. However, as these tools are mainly used in fuzzing
scenarios where the kernel is frequently rebooted, this outcome should
be acceptable.

There is no measurable boot time performance impact of these changes for
KASAN on x86-64. I haven't run any tests on arm64 (the stack depot
without performance optimizations is not suitable for the intended use
of those modes anyway), but I expect a similar result: obtaining and
copying stack trace frames when saving them into the stack depot is
what takes the most time.

This series does not yet provide a way to configure the maximum size of
the stack depot externally (e.g. via a command-line parameter). This will
either be added in the following iterations of this series (if the used
approach gets approval) or will be added together with the performance
improvement changes.

Andrey Konovalov (15):
stackdepot: check disabled flag when fetching
stackdepot: simplify __stack_depot_save
stackdepot: drop valid bit from handles
stackdepot: add depot_fetch_stack helper
stackdepot: use fixed-sized slots for stack records
stackdepot: fix and clean-up atomic annotations
stackdepot: rework helpers for depot_alloc_stack
stackdepot: rename next_pool_required to new_pool_required
stackdepot: store next pool pointer in new_pool
stackdepot: store free stack records in a freelist
stackdepot: use read/write lock
stackdepot: add refcount for records
stackdepot: add backwards links to hash table buckets
stackdepot: allow users to evict stack traces
kasan: use stack_depot_evict for tag-based modes

include/linux/stackdepot.h | 11 ++
lib/stackdepot.c | 361 ++++++++++++++++++++++++-------------
mm/kasan/tags.c | 7 +-
3 files changed, 249 insertions(+), 130 deletions(-)

--
2.25.1