Re: [PATCH RFT] arm64: dts: qcom: sc7280: Add missing LMH interrupts

From: Doug Anderson
Date: Fri Aug 25 2023 - 16:18:33 EST


Hi,

On Fri, Aug 11, 2023 at 1:58 PM Konrad Dybcio <konrad.dybcio@xxxxxxxxxx> wrote:
>
> Hook up the interrupts that signal the Limits Management Hardware has
> started some sort of throttling action.
>
> Fixes: 7dbd121a2c58 ("arm64: dts: qcom: sc7280: Add cpufreq hw node")
> Signed-off-by: Konrad Dybcio <konrad.dybcio@xxxxxxxxxx>
> ---
> test case:
>
> - hammer the CPUs (like compile the Linux kernel)
> - watch -n1 "cat /proc/interrupts | grep dcvsh"
> - the numbers go up up up up -> good

I'm not doing much on sc7280 these days, but I did try putting your
patch on a sc7280-hoglin (AKA a CRD). I tried to stress the system out
a bunch (ran 8 instances of "while true; do true; done" and opened
something to activate the GPU). I didn't see any LMH interrupts fire.
Of course, with ChromeOS firmware LMH is _supposed_ to be mostly
disabled, so maybe that's right? Our policy was always to have Linux
do as much of the throttling as possible and only use LMH as a last
resort.

I assume I don't need any specific config option turned on?

I know that on other Qualcomm boards I see LMH nodes in the device
tree, which we don't have in sc7280. Like "qcom,sdm845-lmh". Is that
important? I haven't been following what's been going on with LMH in
Linux since we try not to use it.

For giggles, I also tried putting the patch on a sc7280-villager
device to see if it had different thermals. I even put my jacket over
it to try to keep it warm. I saw the sensors go up to 109C on the
medium cores and still no LMH interrupts. Oh, and then the device shut
itself down. I guess something about thermal throttling in Linux must
be disabled but then it still handles the critical state? :( That's
concerning...

I put the same kernel on a trogdor device and that did normal Linux
throttling OK. So something is definitely wonky with sc7280... I dug
enough to find that if I used "step_wise" instead of "power_allocator"
that it works OK, so I guess something is wonky about the config of
power_allocator on sc7280. In any case, it's not affected by your
patch and I've already probably spent too much time on it. :-P

-Doug