Re: [PATCH v3 0/2] perf: arm-spe: Decode SPE source and use for perf c2c

From: German Gomez
Date: Tue Mar 22 2022 - 08:06:56 EST


Hi Ali, thank you for your patches.

On 18/03/2022 19:59, Ali Saidi wrote:
> When synthesizing data from SPE, augment the type with source information
> for Arm Neoverse cores so we can detect situations like cache line contention
> and transfers on Arm platforms.
>
> This change enables the expected behavior of perf c2c on a system with SPE where
> lines that are shared among multiple cores show up in perf c2c output.
>
> These changes switch to use mem_lvl_num to encode the level information instead
> of mem_lvl which is being deprecated, but I haven't found other users of
> mem_lvl_num.
>
> Changes in v3:
> * Assume there are only three levels of cache hierarchy
> * Split the mem_lvl_num and HITM changes in c2c into two separate patches
>
> Ali Saidi (3):
> perf arm-spe: Use SPE data source for neoverse cores
> perf mem: Support mem_lvl_num in c2c command
> perf mem: Support HITM for when mem_lvl_num is any
>
> .../util/arm-spe-decoder/arm-spe-decoder.c | 1 +
> .../util/arm-spe-decoder/arm-spe-decoder.h | 12 ++
> tools/perf/util/arm-spe.c | 109 +++++++++++++++---
> tools/perf/util/mem-events.c | 20 +++-
> 4 files changed, 124 insertions(+), 18 deletions(-)
>

I tested on a Neoverse N1 system using the commands below, and the output
looks either unchanged or improved compared to before the patches. For example:

| $ perf mem record -e spe-ldst -a -- sleep 4
| $ perf mem report
|
| 1.39%             1  1263          L3 miss                   [k] 0xffffb9a34bda2088
| 0.58%             1  529           L1 miss                   [k] 0xffffb9a34bd3be7c
| 0.34%             1  310           N/A                       [k] 0xffffb9a34baf4d28
| 0.34%             1  309           N/A                       [k] 0xffffb9a34bb82844

... became:

| 1.39%             1  1263          RAM hit                   [k] 0xffffb9a34bda2088
| 0.58%             1  529           L2 hit                    [k] 0xffffb9a34bd3be7c
| 0.34%             1  310           L1 hit                    [k] 0xffffb9a34baf4d28
| 0.34%             1  309           L1 hit                    [k] 0xffffb9a34bb82844
                                                                      
Also, some L3 misses are now labeled as "Any cache hit" with the Snoop
bit set. For example:
                                                                      
| 0.37%             1  332           L3 miss                   [.] 0x0000aaaadf70a700    N/A

... became:

| 0.37%             1  332           Any cache hit             [.] 0x0000aaaadf70a700    HitM
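
For anyone reading along, here is a rough sketch of how I understand the
new labels to come about. It is not the code from the patches: the
sketch_* names and the enum of source kinds are made up for illustration
and are not the SPE packet encoding. Only union perf_mem_data_src and the
PERF_MEM_* constants come from the uapi header <linux/perf_event.h>.

```c
#include <linux/perf_event.h>
#include <stdbool.h>

/* Hypothetical source kinds for illustration; NOT the SPE packet encoding. */
enum sketch_source {
	SKETCH_SRC_L1,
	SKETCH_SRC_L2,
	SKETCH_SRC_PEER_CACHE,	/* line supplied by another core's cache */
	SKETCH_SRC_DRAM,
};

/* Describe a load sample via mem_lvl_num instead of the older mem_lvl. */
static union perf_mem_data_src sketch_synth_data_src(enum sketch_source src)
{
	union perf_mem_data_src d = { .val = 0 };

	d.mem_op = PERF_MEM_OP_LOAD;
	d.mem_snoop = PERF_MEM_SNOOP_NONE;

	switch (src) {
	case SKETCH_SRC_L1:
		d.mem_lvl_num = PERF_MEM_LVLNUM_L1;
		break;
	case SKETCH_SRC_L2:
		d.mem_lvl_num = PERF_MEM_LVLNUM_L2;
		break;
	case SKETCH_SRC_PEER_CACHE:
		/* Exact level unknown, so report "any cache" and flag HITM. */
		d.mem_lvl_num = PERF_MEM_LVLNUM_ANY_CACHE;
		d.mem_snoop = PERF_MEM_SNOOP_HITM;
		break;
	case SKETCH_SRC_DRAM:
		d.mem_lvl_num = PERF_MEM_LVLNUM_RAM;
		break;
	}

	return d;
}

/* A c2c-style consumer only needs the snoop bits to spot contended lines. */
static bool sketch_is_hitm(union perf_mem_data_src d)
{
	return (d.mem_snoop & PERF_MEM_SNOOP_HITM) != 0;
}
```

In this sketch the peer-cache case is the one that would correspond to the
"Any cache hit" / HitM line above, while the L1, L2 and DRAM cases would
correspond to the plain "L1 hit", "L2 hit" and "RAM hit" labels.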

Tested-by: German Gomez <german.gomez@xxxxxxx>
Reviewed-by: German Gomez <german.gomez@xxxxxxx>

Thanks,
German

(I didn't run on a non-Neoverse system, but it doesn't look like any
behaviour is changed for those.)