Re: [PATCH v2] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588

From: Alexey Charkov
Date: Tue Jan 23 2024 - 14:48:25 EST


On Mon, Jan 22, 2024 at 4:04 AM Daniel Lezcano
<daniel.lezcano@xxxxxxxxxx> wrote:
>
>
> Hi Alexey,
>
>
> On 21/01/2024 20:57, Alexey Charkov wrote:
> > On Fri, Jan 19, 2024 at 8:21 PM Daniel Lezcano
> > <daniel.lezcano@xxxxxxxxxx> wrote:
> > Hello Daniel,
> >
> > Thanks a lot for your review and comments! Please see some reflections below.
> >
> >> On 09/01/2024 20:19, Alexey Charkov wrote:
> >>> Include thermal zones information in device tree for rk3588 variants
> >>> and enable the built-in thermal sensing ADC on RADXA Rock 5B
> >>>
> >>> Signed-off-by: Alexey Charkov <alchark@xxxxxxxxx>
> >>> ---
> >>> Changes in v2:
> >>> - Dropped redundant comments
> >>> - Included all CPU cores in cooling maps
> >>> - Split cooling maps into more granular ones utilizing TSADC
> >>> channels 1-3 which measure temperature by separate CPU clusters
> >>> instead of channel 0 which measures the center of the SoC die
> >>> ---
> >>> .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 +
> >>> arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 151 ++++++++++++++++++
> >>> 2 files changed, 155 insertions(+)
> >>>
> >>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> >>> index a5a104131403..f9d540000de3 100644
> >>> --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> >>> +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> >>> @@ -772,3 +772,7 @@ &usb_host1_ehci {
> >>> &usb_host1_ohci {
> >>> status = "okay";
> >>> };
> >>> +
> >>> +&tsadc {
> >>> + status = "okay";
> >>> +};
> >>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> >>> index 8aa0499f9b03..8d54998d0ecc 100644
> >>> --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> >>> +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> >>> @@ -10,6 +10,7 @@
> >>> #include <dt-bindings/reset/rockchip,rk3588-cru.h>
> >>> #include <dt-bindings/phy/phy.h>
> >>> #include <dt-bindings/ata/ahci.h>
> >>> +#include <dt-bindings/thermal/thermal.h>
> >>>
> >>> / {
> >>> compatible = "rockchip,rk3588";
> >>> @@ -2112,6 +2113,156 @@ tsadc: tsadc@fec00000 {
> >>> status = "disabled";
> >>> };
> >>>
> >>> + thermal_zones: thermal-zones {
> >>> + /* sensor near the center of the whole chip */
> >>> + soc_thermal: soc-thermal {
> >>> + polling-delay-passive = <20>;
> >>
> >> There is no mitigation set for this thermal zone. It is pointless to
> >> specify a passive polling.
> >
> > Indeed, it makes sense to me. There seems to be a catch though in that
> > the driver calls the generic thermal_of_zone_register during the
> > initial probe, which expects both of those polling delays to be
> > present in the device tree, otherwise it simply refuses to add the
> > respective thermal zone, see drivers/thermal/thermal_of.c:502
>
> Usually:
>
> polling-delay-passive = <0>;
> polling-delay = <0>;
>
> cf:
>
> git grep "polling-delay = <0>" arch/arm64/boot/dts

For some reason when I have both polling-delay-passive and
polling-delay set to 0, the active cooling map I have in my board DT
(using a PWM controlled fan) behaves weirdly.

I use the following fragment in my board DTS:

+&package_thermal {
+ trips {
+ package_fan: package-fan {
+ temperature = <55000>;
+ hysteresis = <2000>;
+ type = "active";
+ };
+ };
+
+ cooling-maps {
+ map-fan {
+ trip = <&package_fan>;
+ cooling-device = <&fan THERMAL_NO_LIMIT
THERMAL_NO_LIMIT>;
+ };
+ };
+};

If I add polling-delay = <1000>; at the top, the fan speeds up and
down dynamically as the package temperature swings around 55C. If I
remove that (having set polling-delay = <0>; in rk3588s.dtsi), the fan
speeds up to the midpoint cooling state once the package temperature
approaches 55C, and then it just stays there forever: it doesn't speed
up above the midpoint even as the temperature climbs above 70C, nor
does it spin down as it falls back to around 45C.

Is that the expected behavior for when the polling is disabled?

I haven't yet studied in detail if passive cooling kicks in correctly
with polling disabled, but this behavior with active cooling left me
quite confused - any pointers would be much appreciated.

Thanks a lot,
Alexey