Re: [PATCH 2/2] perf arm-spe: Parse more SPE fields and store source

From: Leo Yan
Date: Fri Feb 11 2022 - 23:19:52 EST


Hi German, Ali,

On Fri, Feb 11, 2022 at 04:31:40PM +0000, German Gomez wrote:
> Hi Ali,
>
> On 28/01/2022 21:02, Ali Saidi wrote:
> > Hi German,
> >
> > On 28/01/2022 19:20, German Gomez wrote:
> >> Hi Ali,
> >>
> >> [...]
> >>> };
> >>>
> >>> enum arm_spe_op_type {
> >>> ARM_SPE_LD = 1 << 0,
> >>> ARM_SPE_ST = 1 << 1,
> >>> + ARM_SPE_LDST_EXCL = 1 << 2,
> >>> + ARM_SPE_LDST_ATOMIC = 1 << 3,
> >>> + ARM_SPE_LDST_ACQREL = 1 << 4,
>
> Wondering if we can store this in perf_sample->flags. The values are
> defined in "util/event.h" (PERF_IP_*). Maybe we can extend it to allow
> doing "sample->flags = PERF_LDST_FLAG_LD | PERF_LDST_FLAG_ATOMIC" and
> such.
>
> @Leo do you think that could work?

Let's step back a bit and divide the decoding flow into two parts:
backend and frontend.

For the backend part, we decode the SPE hardware trace data and
generate the SPE record in the file
util/arm-spe-decoder/arm-spe-decoder.c. As we want to support
complete operation types, we can extend arm_spe_op_type as below:

enum arm_spe_op_type {
/* First level operation type */
ARM_SPE_OP_OTHER = 1 << 0,
ARM_SPE_OP_LDST = 1 << 1,
ARM_SPE_OP_BRANCH_ERET = 1 << 2,

/* Second level operation type for OTHER */
ARM_SPE_OP_SVE_OTHER = 1 << 16,
ARM_SPE_OP_SVE_FP = 1 << 17,
ARM_SPE_OP_SVE_PRED = 1 << 18,

/* Second level operation type for LDST */
ARM_SPE_OP_LD = 1 << 16,
ARM_SPE_OP_ST = 1 << 17,
ARM_SPE_OP_ATOMIC = 1 << 18,
ARM_SPE_OP_EXCL = 1 << 19,
ARM_SPE_OP_AR = 1 << 20,
ARM_SPE_OP_SIMD_FP = 1 << 21,
ARM_SPE_OP_GP_REG = 1 << 22,
ARM_SPE_OP_UNSPEC_REG = 1 << 23,
ARM_SPE_OP_NV_SYSREG = 1 << 24,
ARM_SPE_OP_SVE_PRED = 1 << 25,
ARM_SPE_OP_SVE_SG = 1 << 26,

/* Second level operation type for BRANCH_ERET */
ARM_SPE_OP_BR_COND = 1 << 16,
ARM_SPE_OP_BR_INDIRECT = 1 << 17,
};

IIUC, Ali suggested to directly reuse packet format to express
operation type and don't need to redefine the operation types like
above structure arm_spe_op_type. I personally bias to convert the raw
packet format to more readable format, a benefit is this would give
us more readable code.

For the frontend, we need to convert Arm SPE record to samples.
We can explore two fields: sample::flags and sample::data_src, for
load/store related operations, I perfer we can fill more complete
info in field sample::data_src and extend the definitions for
perf_mem_data_src; for branch operations, we can fill sample::flags.

So I am just wandering if we can set the field
sample::data_src::mem_lock for atomic operations, like:

data_src.mem_op = PERF_MEM_OP_LOAD;
data_src.mem_lock = PERF_MEM_LOCK_ATOMIC;

The field "mem_lock" is only two bits, we can consider to extend the
structure with an extra filed "mem_lock_ext" if it cannot meet our
requirement.

> >>> + ARM_SPE_BR = 1 << 5,
> >>> + ARM_SPE_BR_COND = 1 << 6,
> >>> + ARM_SPE_BR_IND = 1 << 7,
>
> Seems like we can store BR_COND in the existing "branch-miss" event
> (--itrace=b) with:
>
>   sample->flags = PERF_IP_FLAG_BRANCH;
>   sample->flags |= PERF_IP_FLAG_CONDITIONAL;
> and/or
>   sample->flags |= PERF_IP_FLAG_INDIRECT;
>
> PERF_IP_FLAG_INDIRECT doesn't exist yet but we can probably add it.

Yes, for branch samples, this makes sense for me.

Thanks,
Leo