Re: [PATCH 2/2] perf: Expand perf_branch_entry.type

From: Anshuman Khandual
Date: Thu Feb 03 2022 - 23:55:39 EST




On 2/2/22 5:27 PM, Mark Rutland wrote:
> On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote:
>> Current perf_branch_entry.type is a 4 bits field just enough to accommodate
>> 16 generic branch types. This is insufficient to accommodate platforms like
>> arm64 which has much more branch types.
>
> It would be good to mention BRBE specifically here, along with specific values
> and a rought intro, e.g.
>
> | The Arm Branch Record Buffer Extension (BRBE) distinguishes $N types of
> | branch/exception/return: <rough summary here>. There's not enough space to
> | describe these all in perf_branch_entry.type, as this is a 4-bit field.
>
> That way reviewers (and anyone looking at the patch in future) have a lot more
> rationale to work with. A rough summary of the distinct branch types would be
> *really* helpful.

Sure, will add a summary describing the need for an ABI extension as suggested.

>
>> Lets just expands this field into a 6 bits one, which can now hold 64 generic
>> branch types.
>
> Is it safe (ABI-wise) to extend a bit-field like this? Does that break any
> combination of old/new userspace and old/new kernel? I'm not sure how bit
> fields are managed w.r.t. endianness, but normally extending a field would
> break BE, so this seems suspicious.

Probably. I guess we would need some more inputs/suggestions from others
regarding any potential issues and possible workarounds.

>
> I suspect we might need to allocate a *separate* field for new values, and
> possibly reserve a value in the existing field to say "go look at the new
> field".

In that case there might be another level of indirection.

>
> Do you have any rationale for 64 values specifically? e.g. is that mostly for
> future extensibility? How many will we need for Arm's BRBE?

Yeah that is mostly for future extensibility. BRBE's current requirement
will be well within 32 types itself. 19 to be specific.

>
> Do those types fall into a hierarchy, that we could split across separate
> fields?
I will take a look and get back on this. But as mentioned before it will
cause additional level of indirection for a look up.

>
>> This also adds more generic branch types.
>
> This feels like ti should be in a separate/subsequent patch. If nothing else
> that aids bisectability if changing the size of the field breaks anything.

Makes sense, will split them.

>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
>> Cc: Mark Rutland <mark.rutland@xxxxxxx>
>> Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
>> Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
>> Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
>> Cc: Will Deacon <will@xxxxxxxxxx>
>> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>> Cc: linux-perf-users@xxxxxxxxxxxxxxx
>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@xxxxxxx>
>> ---
>> include/uapi/linux/perf_event.h | 10 ++++++++--
>> tools/include/uapi/linux/perf_event.h | 10 ++++++++--
>> tools/perf/util/branch.c | 8 +++++++-
>> tools/perf/util/branch.h | 4 ++--
>> 4 files changed, 25 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index b91d0f575d0c..361fdc6b87a0 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -256,6 +256,12 @@ enum {
>> PERF_BR_FIQ = 13, /* fiq */
>> PERF_BR_DEBUG_HALT = 14, /* debug halt */
>> PERF_BR_DEBUG_EXIT = 15, /* debug exit */
>> + PERF_BR_DEBUG_INST = 16, /* instruciton debug */
>> + PERF_BR_DEBUG_DATA = 17, /* data debug */
>
> This is really unclear. What is "instruction debug" vs "data debug" ?
>
> Are there meant for breakpoint/watchpoint exceptions? HW breakpoints vs BRK
> instructions?

Unfortunately the spec does not expand much in detail but will try
and find some more clarity.

>
>> + PERF_BR_FAULT_ALGN = 18, /* alignment fault */
>> + PERF_BR_FAULT_DATA = 19, /* data fault */
>> + PERF_BR_FAULT_INST = 20, /* instruction fault */
>
> There are many other potential faults a CPU could take; are these specifically
> what Arm's BRBE provides?

Right, these are what BRBE captures for now.

>
>> + PERF_BR_SERROR = 21, /* system error */
>
> This is really arm-specific; IIUC the closest thing on x86 is an MCE.

But ('unhandled' ?) system error can be a generic control flow change.

>
>> PERF_BR_MAX,
>> };
>>
>> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
>> in_tx:1, /* in transaction */
>> abort:1, /* transaction abort */
>> cycles:16, /* cycle count to last branch */
>> - type:4, /* branch type */
>> - reserved:40;
>> + type:6, /* branch type */
>
> As above, is this a safe-change ABI-wise?

If the bit fields here cannot be expanded without breaking ABI, then
there is a fundamental problem. Only remaining option will be to add
new fields (with new width value) which could accommodate these new
required branch types.

>
> Thanks,
> Mark.
>
>> + reserved:38;
>> };
>>
>> union perf_sample_weight {
>> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
>> index 1882054e8684..9a82b8aaed93 100644
>> --- a/tools/include/uapi/linux/perf_event.h
>> +++ b/tools/include/uapi/linux/perf_event.h
>> @@ -256,6 +256,12 @@ enum {
>> PERF_BR_FIQ = 13, /* fiq */
>> PERF_BR_DEBUG_HALT = 14, /* debug halt */
>> PERF_BR_DEBUG_EXIT = 15, /* debug exit */
>> + PERF_BR_DEBUG_INST = 16, /* instruciton debug */
>> + PERF_BR_DEBUG_DATA = 17, /* data debug */
>> + PERF_BR_FAULT_ALGN = 18, /* alignment fault */
>> + PERF_BR_FAULT_DATA = 19, /* data fault */
>> + PERF_BR_FAULT_INST = 20, /* instruction fault */
>> + PERF_BR_SERROR = 21, /* system error */
>> PERF_BR_MAX,
>> };
>>
>> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
>> in_tx:1, /* in transaction */
>> abort:1, /* transaction abort */
>> cycles:16, /* cycle count to last branch */
>> - type:4, /* branch type */
>> - reserved:40;
>> + type:6, /* branch type */
>> + reserved:38;
>> };
>>
>> union perf_sample_weight {
>> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
>> index 74e5e67b1779..1e216ea2e2a8 100644
>> --- a/tools/perf/util/branch.c
>> +++ b/tools/perf/util/branch.c
>> @@ -54,7 +54,13 @@ const char *branch_type_name(int type)
>> "IRQ",
>> "FIQ",
>> "DEBUG_HALT",
>> - "DEBUG_EXIT"
>> + "DEBUG_EXIT",
>> + "DEBUG_INST",
>> + "DEBUG_DATA",
>> + "FAULT_ALGN",
>> + "FAULT_DATA",
>> + "FAULT_INST",
>> + "SERROR"
>> };
>>
>> if (type >= 0 && type < PERF_BR_MAX)
>> diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
>> index 17b2ccc61094..875d99abdc36 100644
>> --- a/tools/perf/util/branch.h
>> +++ b/tools/perf/util/branch.h
>> @@ -23,8 +23,8 @@ struct branch_flags {
>> u64 in_tx:1;
>> u64 abort:1;
>> u64 cycles:16;
>> - u64 type:4;
>> - u64 reserved:40;
>> + u64 type:6;
>> + u64 reserved:38;
>> };
>> };
>> };
>> --
>> 2.25.1