Re: [PATCH V2 3/5] perf mem: Clean up perf_mem_events__name()

From: Leo Yan
Date: Sat Dec 09 2023 - 00:48:25 EST


On Thu, Dec 07, 2023 at 11:23:36AM -0800, kan.liang@xxxxxxxxxxxxxxx wrote:
> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>
> Introduce a generic perf_mem_events__name(). Remove the ARCH-specific
> one.

Now memory event naming is arch dependent, this is because every arch
has different memory PMU names, and it appends specific configurations
(e.g. aarch64 appends 'ts_enable=1,...,min_latency=%u', and x86_64
appends 'ldlat=%u', etc). This is not friendly for extension, e.g. it's
impossible for users to specify any extra attributes for memory events.

This patch tries to consolidate in a central place for generating memory
event names, its approach is to add more flags to meet special usage
cases, which means the code gets more complex (and more difficult for
later's maintenance).

I think we need to distinguish the event naming into two levels: in
util/mem-events.c, we can consider it as a common layer, and we maintain
common options in it, e.g. 'latency', 'all-user', 'all-kernel',
'timestamp', 'physical_address', etc. In every arch's mem-events.c
file, it converts the common options to arch specific configurations.

In the end, this can avoid to add more and more flags into the
structure perf_mem_event.

Anyway, I also have a question for this patch itself, please see below
inline comment.

[...]

> diff --git a/tools/perf/arch/arm64/util/mem-events.c b/tools/perf/arch/arm64/util/mem-events.c
> index 2602e8688727..eb2ef84f0fc8 100644
> --- a/tools/perf/arch/arm64/util/mem-events.c
> +++ b/tools/perf/arch/arm64/util/mem-events.c
> @@ -2,28 +2,10 @@
> #include "map_symbol.h"
> #include "mem-events.h"
>
> -#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
> +#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }
>
> struct perf_mem_event perf_mem_events_arm[PERF_MEM_EVENTS__MAX] = {
> - E("spe-load", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/", "arm_spe_0"),
> - E("spe-store", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/", "arm_spe_0"),
> - E("spe-ldst", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/", "arm_spe_0"),
> + E("spe-load", "%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/", "arm_spe_0", true, 0),
> + E("spe-store", "%s/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/", "arm_spe_0", false, 0),
> + E("spe-ldst", "%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/", "arm_spe_0", true, 0),

Arm SPE is AUX event, should set '1' to the field '.aux_event'.

I am a bit suspect if we really need the field '.aux_event', the
'.aux_event' field is only used for generating event string.

So in the below function perf_pmu__mem_events_name(), I prefer to call
an arch specific function, e.g. arch_memory_event_name(event_id, cfg),
the parameter 'event_id' passes memory event ID for load, store, and
load-store, and the parameter 'cfg' which is a pointer to the common
memory options (as mentioned above for latency, all-user or all-kernel
mode, timestamp tracing, etc).

In the end, the common layer just passes the common options to low
level arch's layer and get a event string for recording.

Thanks,
Leo