Re: [PATCH] arm64: dts: qcom: sdm845: Fix wild reboot during Antutu test

From: Daniel Lezcano
Date: Tue Jan 16 2024 - 10:38:50 EST


On 16/01/2024 15:03, Luca Weiss wrote:
> On Tue Jan 16, 2024 at 1:51 PM CET, Daniel Lezcano wrote:
>> On 16/01/2024 13:37, Luca Weiss wrote:
>>> On Tue Jan 16, 2024 at 12:59 PM CET, Daniel Lezcano wrote:
>>>> Running an Antutu benchmark makes the board do a hard reboot.
>>>>
>>>> Cause: it appears the gpu-bottom and gpu-top temperature sensors are
>>>> showing too high temperatures, above 115°C.
>>>>
>>>> Out-of-tree configurations show the GPU thermal zone is configured to
>>>> be mitigated at 85°C with devfreq.
>>>>
>>>> Add the DT snippet to enable the thermal mitigation on the
>>>> sdm845-based board.
>>>>
>>>> Fixes: c79800103eb18 ("arm64: dts: sdm845: Add gpu and gmu device nodes")
>>>> Cc: Amit Pundir <amit.pundir@xxxxxxxxxx>
>>>> Signed-off-by: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>

>>> A part of this is already included with this patch:
>>> https://lore.kernel.org/linux-arm-msm/20240102-topic-gpu_cooling-v1-4-fda30c57e353@xxxxxxxxxx/
>>>
>>> Maybe rebase on top of that one and add the 85degC trip point or
>>> something?

>> Actually, I think the patch is wrong.

> I recommend telling Konrad in that patch then, not me :)

It's good that Konrad is in the recipient list :)

>> The cooling effect does not operate on the 'hot' trip point type, as it
>> is treated like a critical trip point. The governor is not invoked, so
>> no mitigation happens. The 'hot' trip point type results in a
>> notification being sent to userspace, giving it a last chance to do
>> something before 'critical' is reached, where the system is shut down.
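
To illustrate the difference, here is a minimal sketch of a thermal zone
that uses a 'passive' trip bound to the GPU through a cooling map, so that
the governor is invoked at 85°C. It is not the submitted patch: the labels
(gpu, tsens0, gpu_top_alert0), the sensor index, the polling delays and the
hysteresis are assumptions, and the gpu and tsens0 nodes are assumed to be
defined elsewhere (e.g. in the SoC dtsi).

```dts
/*
 * Illustrative sketch only, not the submitted patch.
 * Labels, sensor index, polling delays and hysteresis are assumptions.
 */
#include <dt-bindings/thermal/thermal.h>

&gpu {
	/* Needed so the GPU can be referenced as a cooling device. */
	#cooling-cells = <2>;
};

/ {
	thermal-zones {
		gpu-top-thermal {
			polling-delay-passive = <250>;	/* ms, while mitigating */
			polling-delay = <1000>;		/* ms, otherwise */
			thermal-sensors = <&tsens0 11>;	/* assumed sensor index */

			trips {
				/* 'passive': the governor runs and throttles the GPU. */
				gpu_top_alert0: trip-point0 {
					temperature = <85000>;	/* millicelsius */
					hysteresis = <1000>;
					type = "passive";
				};
			};

			cooling-maps {
				map0 {
					trip = <&gpu_top_alert0>;
					cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
				};
			};
		};
	};
};
```

With type = "hot" on the same trip, the cooling map would have no effect,
since no governor runs for that trip type.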

>> I suggest reverting it and picking the one I proposed.

> It hasn't been applied yet, so it can be fixed in v2 there.

The patch was submitted without testing, AFAICT, so it is preferable to pick the one I sent, which was tested by Amit and me.



--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog