Re: [PATCH V13 - RESEND 02/10] arm64/perf: Add BRBE registers and fields

From: Mark Rutland
Date: Mon Jul 31 2023 - 05:06:51 EST


On Mon, Jul 31, 2023 at 08:03:21AM +0530, Anshuman Khandual wrote:
>
>
> On 7/28/23 22:22, James Clark wrote:
> >
> >
> > On 28/07/2023 17:20, Will Deacon wrote:
> >> On Tue, Jul 11, 2023 at 01:54:47PM +0530, Anshuman Khandual wrote:
> >>> This adds BRBE related register definitions and various other related field
> >>> macros there in. These will be used subsequently in a BRBE driver which is
> >>> being added later on.
> >>>
> >>> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> >>> Cc: Will Deacon <will@xxxxxxxxxx>
> >>> Cc: Marc Zyngier <maz@xxxxxxxxxx>
> >>> Cc: Mark Rutland <mark.rutland@xxxxxxx>
> >>> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> >>> Cc: linux-kernel@xxxxxxxxxxxxxxx
> >>> Tested-by: James Clark <james.clark@xxxxxxx>
> >>> Reviewed-by: Mark Brown <broonie@xxxxxxxxxx>
> >>> Signed-off-by: Anshuman Khandual <anshuman.khandual@xxxxxxx>
> >>> ---
> >>> arch/arm64/include/asm/sysreg.h | 103 +++++++++++++++++++++
> >>> arch/arm64/tools/sysreg | 158 ++++++++++++++++++++++++++++++++
> >>> 2 files changed, 261 insertions(+)
> >>>
> >>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> >>> index b481935e9314..f95e30c13c8b 100644
> >>> --- a/arch/arm64/include/asm/sysreg.h
> >>> +++ b/arch/arm64/include/asm/sysreg.h
> >>> @@ -163,6 +163,109 @@
> >>> #define SYS_DBGDTRTX_EL0 sys_reg(2, 3, 0, 5, 0)
> >>> #define SYS_DBGVCR32_EL2 sys_reg(2, 4, 0, 7, 0)
> >>>
> >>> +#define __SYS_BRBINFO(n) sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
> >>> +#define __SYS_BRBSRC(n) sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
> >>> +#define __SYS_BRBTGT(n) sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))
> >>
> >> It's that time on a Friday but... aren't these macros busted? I think you
> >> need brackets before adding the offset, otherwise wouldn't, for example,
> >> target registers 0-15 all access info register 0 and __SYS_BRBTGT(16) would
> >> then start accessing source register 0?
> >>
> >> I'm surprised that the compiler doesn't warn about this, but even more
> >> surprised that you managed to test this.
> >>
> >> Please tell me I'm wrong!
> >>
> >> Will
> >
> > No I think you are right, it is wrong. Luckily there is already an
> > extraneous bracket so you you can fix it by moving one a place down:
> >
> > sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 2))
> >
> > It's interesting because the test [1] is doing quite a bit and looking
> > at the branch info, and that src and targets match up to function names.
> > I also manually looked at the branch buffers and didn't see anything
> > obviously wrong like things that looked like branch infos in the source
> > or target fields. Will have to take another look to see if it would be
> > possible for the test to catch this.
> >
> > James
> >
> > [1]:
> > https://gitlab.arm.com/linux-arm/linux-jc/-/commit/3a7ddce70c2daadb63fcc511de0a89055ca48b32
>
> ((((n) & 0x10)) >> 2 + 2) ---> ((((n) & 0x10) >> 2) + 2)
>
> The additional brackets are useful in explicitly telling the compiler but
> what it the compiler is just doing the right thing implicitly i.e computing
> the shifting operation before doing the offset addition.

No; that is not correct. In c, '+' has higher precedence than '>>'.

For 'a >> b + c' the compiler *must* treat that as 'a >> (b + c)', and not as
'(a >> b) + c'

That's trivial to test:

| [mark@gravadlaks:~]% cat shiftadd.c
| #include <stdio.h>
|
| unsigned long logshiftadd(unsigned long a,
| unsigned long b,
| unsigned long c)
| {
| printf("%ld >> %ld + %ld is %ld\n",
| a, b, c, a >> b + c);
| }
|
| int main(int argc, char *argv)
| {
| logshiftadd(0, 0, 0);
| logshiftadd(0, 0, 1);
| logshiftadd(0, 0, 2);
| printf("\n");
| logshiftadd(1024, 0, 0);
| logshiftadd(1024, 0, 1);
| logshiftadd(1024, 0, 2);
| printf("\n");
| logshiftadd(1024, 2, 0);
| logshiftadd(1024, 2, 1);
| logshiftadd(1024, 2, 2);
|
| return 0;
| }
| [mark@gravadlaks:~]% gcc shiftadd.c -o shiftadd
| [mark@gravadlaks:~]% ./shiftadd
| 0 >> 0 + 0 is 0
| 0 >> 0 + 1 is 0
| 0 >> 0 + 2 is 0
|
| 1024 >> 0 + 0 is 1024
| 1024 >> 0 + 1 is 512
| 1024 >> 0 + 2 is 256
|
| 1024 >> 2 + 0 is 256
| 1024 >> 2 + 1 is 128
| 1024 >> 2 + 2 is 64

> During testing, all > those captured branch records looked alright.

I think we clearly need better testing here.

Thanks,
Mark.