Re: [PATCH] arm64: dts: mediatek: mt8195: Set DSU PMU status to fail

From: Macpaul Lin
Date: Thu Aug 24 2023 - 07:59:39 EST


On 8/11/23 06:12, Nícolas F. R. A. Prado wrote:
> On Thu, Jul 20, 2023 at 04:07:51PM -0400, Nícolas F. R. A. Prado wrote:
> > The DSU PMU allows monitoring performance events in the DSU cluster,
> > which is done by configuring and reading back values from the DSU PMU
> > system registers. However, for write-access to be allowed by ELs lower
> > than EL3, the EL3 firmware needs to update the setting on the ACTLR3_EL3
> > register, as it is disallowed by default.
> >
> > That configuration is not done on the firmware used by the MT8195 SoC,
> > as a consequence, booting a MT8195-based machine like
> > mt8195-cherry-tomato-r2 with CONFIG_ARM_DSU_PMU enabled hangs the kernel
> > just as it writes to the CLUSTERPMOVSCLR_EL1 register, since the
> > instruction faults to EL3, and BL31 apparently just re-runs the
> > instruction over and over.
> >
> > Mark the DSU PMU node in the Devicetree with status "fail", as the
> > machine doesn't have a suitable firmware to make use of it from the
> > kernel, and allowing its driver to probe would hang the kernel.
> >
> > Fixes: 37f2582883be ("arm64: dts: Add mediatek SoC mt8195 and evaluation board")
> > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@xxxxxxxxxxxxx>
>
> Hi Matthias,
>
> gentle ping on this patch, as it's not possible to boot MT8195 Chromebooks with
> the mainline defconfig without this fix.

I've been hitting this issue ever since CONFIG_ARM_DSU_PMU was enabled by this commit:
075ed7b9e408 ("arm64: configs: Enable all PMUs provided by Arm")

I'm working on the mt8195-demo board, and I believe it has the same issue as mt8195-cherry-tomato-r2. Here's the log:

[ 22.996825] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 22.997609] rcu: (detected by 1, t=5254 jiffies, g=-603, q=3023 ncpus=8)
[ 22.998468] rcu: All QSes seen, last rcu_preempt kthread activity 5252 (4294898045-4294892793), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 23.000036] rcu: rcu_preempt kthread timer wakeup didn't happen for 5251 jiffies! g-603 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200
[ 23.001462] rcu: Possible timer handling issue on cpu=2 timer-softirq=47
[ 23.002319] rcu: rcu_preempt kthread starved for 5252 jiffies! g-603 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=2
[ 23.003625] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 23.004776] rcu: RCU grace-period kthread stack dump:
[ 23.005414] task:rcu_preempt state:R stack:0 pid:15 ppid:2 flags:0x00000008
[ 23.006474] Call trace:
[ 23.006788] __switch_to+0xe4/0x15c
[ 23.007240] __schedule+0x2bc/0xaa0
[ 23.007685] schedule+0x5c/0xc4
[ 23.008087] schedule_timeout+0x80/0xf4
[ 23.008578] rcu_gp_fqs_loop+0x124/0x3d4
[ 23.009081] rcu_gp_kthread+0x124/0x160
[ 23.009571] kthread+0x118/0x11c
[ 23.009985] ret_from_fork+0x10/0x20

I have a workaround that enables the DSU PMU in firmware (trusted-firmware-a), which solves this hang.
However, I don't think this is the correct place in trusted-firmware-a to put the code that enables the DSU PMU.

--- a/include/arch/aarch64/el3_common_macros.S
+++ b/include/arch/aarch64/el3_common_macros.S
@@ -39,6 +39,19 @@
 	msr	sctlr_el3, x0
 	isb
 
+	/* enable DSU PMU */
+	mov	x1, #(1 << 12)
+	mrs	x0, actlr_el3
+	orr	x0, x0, x1
+	msr	actlr_el3, x0
+	isb
+
+	mov	x1, #(1 << 12)
+	mrs	x0, actlr_el2
+	orr	x0, x0, x1
+	msr	actlr_el2, x0
+	isb
+
 #ifdef IMAGE_BL31
 	/* ---------------------------------------------------------------------
 	 * Initialise the per-cpu cache pointer to the CPU.

If I move this code out of the common EL3 entry code and into platform-dependent files instead, it just does not work.

It should be possible to fix this in the firmware's platform-dependent files, but I'm not familiar with how actlr_el3 and actlr_el2 should be accessed. Otherwise, the DSU PMU node in the dts should be disabled. Any ideas are welcome.
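
For discussion, here is the same read-modify-write as in my workaround, sketched in C. This is only an illustration and not existing trusted-firmware-a code: the macro name, the function names, and the RUNNING_AT_EL3 guard are all hypothetical, and I'm assuming bit 12 is the right enable bit on both registers because that is what my assembly workaround sets.

```c
#include <stdint.h>

/* Bit 12 of ACTLR_EL3 / ACTLR_EL2, the same bit the assembly workaround sets.
 * The macro name is made up for this sketch. */
#define ACTLR_DSU_PMU_EN	(UINT64_C(1) << 12)

/* Pure helper: return the given ACTLR value with the DSU PMU enable bit set.
 * All other bits are preserved. */
static inline uint64_t actlr_with_dsu_pmu(uint64_t actlr)
{
	return actlr | ACTLR_DSU_PMU_EN;
}

/* RUNNING_AT_EL3 is a hypothetical guard: the register accesses below are
 * only valid when executing at EL3 on an AArch64 core. */
#if defined(__aarch64__) && defined(RUNNING_AT_EL3)
static inline void enable_dsu_pmu_access(void)
{
	uint64_t v;

	/* Allow lower ELs to access the DSU PMU registers (EL3 control). */
	__asm__ volatile("mrs %0, actlr_el3" : "=r"(v));
	__asm__ volatile("msr actlr_el3, %0" : : "r"(actlr_with_dsu_pmu(v)));
	__asm__ volatile("isb" : : : "memory");

	/* Same bit in the EL2 control register. */
	__asm__ volatile("mrs %0, actlr_el2" : "=r"(v));
	__asm__ volatile("msr actlr_el2, %0" : : "r"(actlr_with_dsu_pmu(v)));
	__asm__ volatile("isb" : : : "memory");
}
#endif
```

Something like enable_dsu_pmu_access() is what I would expect to call from platform code, but as I said above, my attempts to do that outside the common EL3 entry path did not work.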

Thanks
Macpaul Lin