Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588

From: Alexey Charkov
Date: Mon Jan 22 2024 - 02:37:25 EST


On Mon, Jan 22, 2024 at 10:22 AM Dragan Simic <dsimic@xxxxxxxxxxx> wrote:
>
> On 2024-01-22 07:03, Alexey Charkov wrote:
> > On Mon, Jan 22, 2024 at 8:55 AM Dragan Simic <dsimic@xxxxxxxxxxx>
> > wrote:
> >> On 2024-01-21 19:56, Alexey Charkov wrote:
> >> > On Thu, Jan 18, 2024 at 10:48 PM Dragan Simic <dsimic@xxxxxxxxxxx> wrote:
> >> >> On 2024-01-08 14:41, Alexey Charkov wrote:
> >> >> > On Sun, Jan 7, 2024 at 2:54 AM Dragan Simic <dsimic@xxxxxxxxxxx> wrote:
> >> >> >> On 2024-01-06 23:23, Alexey Charkov wrote:
> >> >> >> > Include thermal zones information in device tree for rk3588 variants
> >> >> >> > and enable the built-in thermal sensing ADC on RADXA Rock 5B
> >> >> >> >
> >> >> >> > Signed-off-by: Alexey Charkov <alchark@xxxxxxxxx>
> >> >> >> > ---
> >> >> >> > + trips {
> >> >> >> > + threshold: trip-point-0 {
> >> >> >>
> >> >> >> It should be better to name it cpu_alert0 instead, because that's what
> >> >> >> other newer dtsi files already use.
> >> >> >
> >> >> > Reflecting on your comments here and below, I'm thinking that maybe it
> >> >> > would be better to define only the critical trip point for the SoC
> >> >> > overall, and then have alerts along with the respective cooling maps
> >> >> > separately for A76-0,1, A76-2,3, A55-0,1,2,3? After all, given that we
> >> >> > have more granular temperature measurement here than in previous RK
> >> >> > chipsets it might be better to only throttle the "offending" cores,
> >> >> > not the full package.
> >> >> >
> >> >> > What do you think?
> >> >> >
> >> >> > Downstream DT doesn't follow this approach though, so maybe there's
> >> >> > something I'm missing here.
> >> >>
> >> >> I agree, it's better to fully utilize the higher measurement
> >> >> granularity
> >> >> made possible by having multiple temperature sensors available.
> >> >>
> >> >> I also agree that we should have only the critical trip defined for
> >> >> the
> >> >> package-level temperature sensor. Let's have the separate temperature
> >> >> measurements for the CPU (sub)clusters do the thermal throttling, and
> >> >> let's keep the package-level measurement for the critical shutdowns
> >> >> only. IIRC, some MediaTek SoC dtsi already does exactly that.
> >> >>
> >> >> Of course, there are no reasons not to have the critical trips defined
> >> >> for the CPU (sub)clusters as well.
> >> >
> >> > I think I'll also add a board-specific active cooling mechanism on the
> >> > package level in the next iteration, given that Rock 5B has a PWM fan
> >> > defined as a cooling device. That will go in the separate patch that
> >> > updates rk3588-rock-5b.dts (your feedback to v2 of this patch is also
> >> > duly noted, thank you!)
> >>
> >> Great, thanks. Sure, making use of the Rock 5B's support for
> >> attaching
> >> a PWM-controlled cooling fan is the way to go.
> >>
> >> Just to reiterate a bit, any "active" trip points belong to the board
> >> dts file(s), because having a cooling fan is a board-specific feature.
> >> As a note, you may also want to have a look at the RockPro64 dts(i)
> >> files, for example; the RockPro64 also comes with a cooling fan
> >> connector and the associated PWM fan control logic.
> >
> > Thanks for the pointer! There is also a helpful doc within devicetree
> > bindings descriptions, although it sits under hwmon which was a bit
> > confusing to me. I've already tested it locally (by adding to the
> > board dts), and it spins up and down quite nicely, and even modulates
> > the fan speed swiftly when the load changes - yay!
>
> Nice! Also, isn't it like magic? :) To me, turning LEDs on/off and
> controlling fans acts as some kind of a "bridge" between the virtual
> and the real world. :)

Oh yes! I also keep admiring how one can add just a couple of lines of
text here and there that's not even real code, and the whole kernel
machinery starts crunching numbers, analyzing temperatures, running
PID loops, etc etc so that I could enjoy the satisfying whistle of a
small fan when I type `make -j8` :-D

> As a suggestion, it would be good to test with a couple of different
> fans, to make sure that the PWM values work well for more that one fan
> model. The Rock 5B requires a 5 V fan, if I'm not mistaken?

It is 5V, yes. I only have one fan to try though, and I simply relied
on the PWM values that are already defined in the upstream
rk3588-rock-5b.dts. They don't look ideal for my particular fan,
because the lowest non-zero cooling state currently uses a PWM value
of 95, which doesn't always make it spin up. But in the end it doesn't
seem to matter that much, because that tiny fan needs to spin at full
255 whenever all eight cores are loaded (and even then it can only
balance the temperature at around 60.5С), and when the load is lighter
(such as during various ./configure runs) it just switches off
completely as the temperature goes down to 46C even with the fan not
spinning.

I don't currently use the GPU/NPU/VPU though - maybe those would
produce more moderate load which could benefit from spinning the fan
at medium speeds.

Best regards,
Alexey