Re: [PATCH V3 01/23] perf/x86: Support outputting XMM registers

From: Liang, Kan
Date: Mon Mar 25 2019 - 16:36:03 EST




On 3/23/2019 5:56 AM, Peter Zijlstra wrote:
On Fri, Mar 22, 2019 at 10:22:50AM -0700, Andi Kleen wrote:
diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/asm/perf_regs.h
index f3329cabce5c..b33995313d17 100644
--- a/arch/x86/include/uapi/asm/perf_regs.h
+++ b/arch/x86/include/uapi/asm/perf_regs.h
@@ -28,7 +28,29 @@ enum perf_event_x86_regs {
PERF_REG_X86_R14,
PERF_REG_X86_R15,
- PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
- PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,

So this changes UAPI visible symbols... did we think about that?

Should be fine. Old programs won't use the new bits,
and it just uses not yet used bits.

Old programs (that used the above symbols) will now fail to compile.
Even if they won't use the new bits, that seems like a bad thing.


Yes, other programs which use the PERF_REG_GPR_X86_32/64_MAX symbols should be broken.
I think the new name PERF_REG_GPR_X86_32/64_MAX are more accurate. So I will keep both names in V4, and add comments for the old names.

/*
* These names are deprecated, please use new names as below to instead.
* PERF_REG_GPR_X86_32_MAX
* PERF_REG_GPR_X86_64_MAX
*/
PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,


+ /* These all need two bits set because they are 128bit */
+ PERF_REG_X86_XMM0 = 32,
+ PERF_REG_X86_XMM1 = 34,
+ PERF_REG_X86_XMM2 = 36,
+ PERF_REG_X86_XMM3 = 38,
+ PERF_REG_X86_XMM4 = 40,
+ PERF_REG_X86_XMM5 = 42,
+ PERF_REG_X86_XMM6 = 44,
+ PERF_REG_X86_XMM7 = 46,
+ PERF_REG_X86_XMM8 = 48,
+ PERF_REG_X86_XMM9 = 50,
+ PERF_REG_X86_XMM10 = 52,
+ PERF_REG_X86_XMM11 = 54,
+ PERF_REG_X86_XMM12 = 56,
+ PERF_REG_X86_XMM13 = 58,
+ PERF_REG_X86_XMM14 = 60,
+ PERF_REG_X86_XMM15 = 62,
+
+ /* This does not include the XMMX registers */
+ PERF_REG_GPR_X86_32_MAX = PERF_REG_X86_GS + 1,
+ PERF_REG_GPR_X86_64_MAX = PERF_REG_X86_R15 + 1,
+
+ /* All registers include the XMMX registers */
+ PERF_REG_X86_MAX = PERF_REG_X86_XMM15 + 2,
};
#endif /* _ASM_X86_PERF_REGS_H */

Also, what happens if we run a 32bit kernel or 32bit compat task?

Then the register dump will report PERF_SAMPLE_REGS_ABI_32, should we
then still interpret the XMM registers as 2x64bit?

Yes XMM registers are 128bit in 32bit mode too.


Are they still at the same offset?

Yes.

I think that is broken.. perf_prepare_sample() does:

size += hweight(mask) * sizeof(u64);

It does size += hweight64(mask) * sizeof(u64);


And since 32bits will not have r8-r15 set, the XMM registers will shift
forward no?


I tried a 32bits kernel, but I didn't observe any issue.

The index of XMM registers always start from 32. That's hard coded.

To double check, I also dumped the mask value in perf_prepare_sample().
With command "perf record -e cycles:p -IXMM0,IXMM1 sleep 1", the mask
value is 0xf00000000, hweight64(mask) returns 4. That is expected.

Is there anything I missed?

Do we need additional PERF_SAMPLE_REGS_ABI_* definitions for this?

I don't think so.

because....?


I didn't observe any broken on 32bit. I think we don't need ABI to distinguish the XMM registers.

Thanks,
Kan