[PATCH v2 0/9] tracing/filters: filtering event fields with a cpumask

From: Valentin Schneider
Date: Fri Jul 07 2023 - 13:23:12 EST


Hi folks,

In the context of CPU isolation / NOHZ_FULL interference investigation, we now
have the ipi_send_cpu and ipi_send_cpumask events. However, enabling these
events without any filtering can yield pretty massive traces with a lot of
uninteresting or irrelevant data (e.g. everything targeting housekeeping CPUs).

This series adds event filtering via a user-provided cpumask. Filtering is
supported for cpumask fields (e.g. ipi_send_cpumask's cpumask), for scalar
fields holding a CPU number (e.g. sched_wakeup's target_cpu), and for the
common CPU field (the CPU the event is emitted from).

With this, it becomes fairly easy to trace events both happening on and
targeting CPUs of interest, e.g.:

trace-cmd record -e 'sched_switch' -f "CPU & CPUS{$ISOLATED_CPUS}" \
                 -e 'sched_wakeup' -f "target_cpu & CPUS{$ISOLATED_CPUS}" \
                 -e 'ipi_send_cpu' -f "cpu & CPUS{$ISOLATED_CPUS}" \
                 -e 'ipi_send_cpumask' -f "cpumask & CPUS{$ISOLATED_CPUS}" \
                 hackbench

The CPUS{} syntax is a bit crude, but it seems to work well enough without
requiring an overhaul of the predicate parsing logic.

Cheers,
Valentin

Revisions
=========

v1 -> v2
++++++++

Context for the changes:
https://lore.kernel.org/lkml/20230705181256.3539027-1-vschneid@xxxxxxxxxx/

o Added check for NULL filter_pred in free_predicate()
o Changed filter_type and op checks to be switch cases.

o Switched from strncpy() to strscpy() (Steven)
o Changed from "MASK{}" to "CPUS{}"

This is slightly more explicit IMO, and leaves MASK{} available if/when we
decide to add bitmask filtering for other events.

o Optimised cpumask vs scalar filtering

Rather than doing full-fledged cpumask operations via cpumask_of(scalar), I'm
using cpumask_nth(1, mask) to check whether the mask's weight is one: there is
no need to compute the mask's full weight, we only need to know whether more
than one bit is set.
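
As a sketch of that check (the helper name is mine, not the series'):

  /* Requires <linux/cpumask.h>. A mask has a weight of exactly one iff
   * it is non-empty and has no second set bit. cpumask_nth(1, mask)
   * returns >= nr_cpu_ids when no second bit exists, so this stops at
   * the second set bit instead of walking the whole mask the way
   * cpumask_weight() would.
   */
  static inline bool mask_weight_is_one(const struct cpumask *mask)
  {
          return cpumask_first(mask) < nr_cpu_ids &&
                 cpumask_nth(1, mask) >= nr_cpu_ids;
  }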

o Added scalar vs scalar optimisation (Steven)

Steven pointed out that when the user-provided cpumask has a weight of one, we
can use cheaper scalar-based filter functions.

I *may* have gone a bit overboard here, but since the mask is stable for the
lifetime of the filter, it felt silly to re-check its weight on every
invocation of the filter function.
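
Roughly, the shape of it (illustrative names, not the series' actual
structures):

  /* Requires <linux/cpumask.h>. */
  struct cpu_pred {
          const struct cpumask *mask;  /* user-provided mask */
          unsigned int cpu;            /* cached CPU, valid if single_cpu */
          bool single_cpu;
  };

  /* Run once at filter-parse time: the mask never changes afterwards,
   * so the weight check is paid a single time.
   */
  static void cpu_pred_setup(struct cpu_pred *pred, const struct cpumask *mask)
  {
          pred->mask = mask;
          pred->cpu = cpumask_first(mask);
          pred->single_cpu = pred->cpu < nr_cpu_ids &&
                             cpumask_nth(1, mask) >= nr_cpu_ids;
  }

  /* Fast path, run per event with the scalar field's value. */
  static bool cpu_pred_match(const struct cpu_pred *pred, unsigned int cpu)
  {
          if (pred->single_cpu)
                  return cpu == pred->cpu;  /* plain scalar compare */
          return cpumask_test_cpu(cpu, pred->mask);
  }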

Valentin Schneider (9):
  tracing/filters: Dynamically allocate filter_pred.regex
  tracing/filters: Enable filtering a cpumask field by another cpumask
  tracing/filters: Enable filtering a scalar field by a cpumask
  tracing/filters: Enable filtering the CPU common field by a cpumask
  tracing/filters: Optimise cpumask vs cpumask filtering when user mask
    is a single CPU
  tracing/filters: Optimise scalar vs cpumask filtering when the user
    mask is a single CPU
  tracing/filters: Optimise CPU vs cpumask filtering when the user mask
    is a single CPU
  tracing/filters: Further optimise scalar vs cpumask comparison
  tracing/filters: Document cpumask filtering

Documentation/trace/events.rst | 14 ++
include/linux/trace_events.h | 1 +
kernel/trace/trace_events_filter.c | 302 ++++++++++++++++++++++++++---
3 files changed, 290 insertions(+), 27 deletions(-)

--
2.31.1