Re: [PATCH] perf/arm-cmn: Fix DTC reset

From: Robin Murphy
Date: Tue Apr 25 2023 - 11:49:59 EST


On 14/04/2023 3:19 pm, Robin Murphy wrote:
> On 2023-04-06 22:25, Geoff Blake wrote:
>> Ran this patch on an AWS C6g.metal and unfortunately still see the
>> spurious IRQs trigger quickly (within 10 tries) when using the following
>> flow:
>>
>> perf stat -a -e arm_cmn_0/event=0x5,type=0x5/ -- sleep 600
>> kexec -e
>>
>> Adding in the simple shutdown routine, I have run over 100 of the
>> above cycles and the spurious IRQs haven't triggered.  I think we still
>> need both for now.

> There is no "need both" - if this patch doesn't work to reset the PMU as intended then we still need a better patch that does. After yet more trying, I still cannot reproduce your results, but I do suspect this patch isn't as good as it initially seemed.
>
> I got my hands on a C6g.metal instance, and I'm building the mainline version of arm-cmn.c from my cmn-dev branch (including the two other pending fixes that I've sent recently) against the 5.15.0-1031-aws kernel that it came with, as a standalone module with a trivial makefile. Even running "stress -m 60" in the background, as the most effective thing I've found so far, that hnf_pocq_reqs_recvd event takes well over 8 minutes to overflow, so I have failed to achieve the necessary timing to kexec at just the right point for the residual interconnect traffic to add up and overflow the event during the handful of seconds that the kexec takes. For completeness, I have managed to run the perf stat/kexec, then run stress for 10 minutes under the new kernel, *then* finally load the module to achieve the right conditions, but that's so utterly contrived and long-winded that I don't really have the patience to do it more than the twice that I already did.
>
> What I can do instantly and reliably is reproduce equivalent conditions with my (now even more stripped-down) remove hack[1] and a simple rmmod/insmod (with a few seconds in between for good measure), leading to demonstrable latent overflows on all 4 DTCs every time. The existing code does seem to manage to reset DTC0 such that its interrupt (IRQ 27) does not fire, consistent with what I've observed on other machines, while I see the secondary DTCs (IRQs 28, 29 and 30) each fire 100000 times spuriously and get disabled. With this patch on top[2], that consistently does not happen over 100 unload/reload cycles.
>
> Given that you say the same write to clear DTC_CTL does seem to work when issued a few seconds earlier from the shutdown hook, I have been wary of some kind of weird timing issue all along, but the fact that I was getting such consistent behaviour even on C6g seemed to point away from that :/

> The closest I've got so far is by leaving this even more involved test loop (with real PMU programming in between) running overnight:
>
> for i in {1..10000}; do sudo insmod arm-cmn.ko && sudo perf stat -e arm_cmn_0/eventid=5,type=5/ sleep 1 && sudo rmmod arm-cmn && sleep 4; done
>
> and now coming back to find /proc/interrupts saying this:
>
>  27:          1          0          0...
>  28:          1          0          0...
>  29:          2          0          0...
>  30:          1          0          0...
>
> I've quite often seen a single IRQ firing earlier than expected (not necessarily spuriously), so I still need to check what's up with that - it may be that writing to the counters doesn't always take either. However, the single extra incidence of IRQ 29, which has happened at some point after I went home, is more of a smoking gun:
>
> [84581.790043] WARNING: CPU: 0 PID: 0 at /home/ubuntu/arm-cmn.c:1828 arm_cmn_handle_irq+0x148/0x1cc [arm_cmn]
>
> So something still snuck through reset, but it *was* at least visible and clearable by the time the IRQ was enabled. Interestingly, the other warning for !dtc->cycles did not fire at the same time, even though the hack normally overflows PMCCNTR before PMEVCNTR(0). I'll keep digging...

I realise I neglected to follow up on this - where I got to was adding an extra read back of CMN_DTC_CTL after the write, and tweaking my remove hack to generate overflows even more reliably for good measure; that then ran for ~56,000 test cycles (until my time on the instance ran out) without any stray interrupts at all. However, I wasn't keen on posting that as a v2 without any better justification than it being an "add random things to change the timing a bit" bodge. Since we're in the merge window now, I'll see if I can get a better answer by -rc1.

Thanks,
Robin.