Re: Re: [PATCH] arm64: dts: qcom: sdm845: Fix wild reboot during Antutu test

From: Bjorn Andersson
Date: Sun Jan 28 2024 - 13:00:39 EST


On Tue, Jan 16, 2024 at 04:38:33PM +0100, Daniel Lezcano wrote:
> On 16/01/2024 15:03, Luca Weiss wrote:
> > On Tue Jan 16, 2024 at 1:51 PM CET, Daniel Lezcano wrote:
> > > On 16/01/2024 13:37, Luca Weiss wrote:
> > > > On Tue Jan 16, 2024 at 12:59 PM CET, Daniel Lezcano wrote:
> > > > > Running an Antutu benchmark makes the board do a hard reboot.
> > > > >
> > > > > Cause: it appears the gpu-bottom and gpu-top temperature sensors are showing
> > > > > too high temperatures, above 115°C.
> > > > >
> > > > > Out-of-tree configurations show the gpu thermal zone is configured to
> > > > > be mitigated at 85°C with devfreq.
> > > > >
> > > > > Add the DT snippet to enable the thermal mitigation on the sdm845
> > > > > based board.
> > > > >
> > > > > Fixes: c79800103eb18 ("arm64: dts: sdm845: Add gpu and gmu device nodes")
> > > > > Cc: Amit Pundir <amit.pundir@xxxxxxxxxx>
> > > > > Signed-off-by: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
> > > >
> > > > A part of this is already included with this patch:
> > > > https://lore.kernel.org/linux-arm-msm/20240102-topic-gpu_cooling-v1-4-fda30c57e353@xxxxxxxxxx/
> > > >
> > > > Maybe rebase on top of that one and add the 85degC trip point or
> > > > something?
> > >
> > > Actually, I think the patch is wrong.
> >
> > I recommend telling Konrad in that patch then, not me :)
>
> That's good Konrad is in the recipient list :)
>
> > > The cooling effect does not operate on the 'hot' trip point type, as it is
> > > considered a critical trip point. The governor is not invoked, so no
> > > mitigation happens. The 'hot' trip point type results in sending a
> > > notification to userspace, giving it a last chance to do something before
> > > 'critical' is reached and the system is shut down.
> > >
> > > I suggest reverting it and picking the one I proposed.
> >
> > It hasn't been applied yet so it can be fixed in v2 there.
>
> The patch was submitted without testing AFAICT. So it is preferable to pick
> the one I sent which was tested by Amit and me.
>

I would have loved to have had that feedback in the thread of the patch
that is wrong!

Due to my lack of understanding of this detail, and with only positive
reviews, I merged said series. Please fix your patch and rebase it on top
of linux-next.

Thanks,
Bjorn