Re: [PATCH 1/2] thermal: core: add initial support for cold and critical_cold trip point

From: Daniel Lezcano
Date: Wed Dec 13 2023 - 11:00:34 EST


On 13/12/2023 15:56, Christian Marangi wrote:
On Wed, Dec 13, 2023 at 03:43:54PM +0100, Rafael J. Wysocki wrote:
On Wed, Dec 13, 2023 at 3:20 PM Christian Marangi <ansuelsmth@xxxxxxxxx> wrote:

On Wed, Dec 13, 2023 at 03:12:41PM +0100, Daniel Lezcano wrote:
On 12/12/2023 23:13, Christian Marangi wrote:
Add initial support for cold and critical_cold trip point. Many if not
all hwmon and thermal device have normally trip point for hot
temperature and for cold temperature.

Till now only hot temperature were supported. Add support for also cold
temperature to permit complete definition of cold trip point in DT.

Thermal driver may use these additional trip point to correctly set
interrupt for cold temperature values and react based on that with
various measure like enabling attached heater, forcing higher voltage
and other specialaized peripherals.

For hwmon drivers this is needed as currently there is a problem with
setting the full operating range of the device for thermal devices
defined with hwmon. To better describe the problem, the following
example is needed:

In the scenario of a simple hwmon with an active trip point declared
and a cooling device attached, the hwmon subsystem currently set the
min and max trip point based on the single active trip point.
Thermal subsystem parse all the trip points and calculate the lowest and
the highest trip point and calls the .set_trip of hwmon to setup the
trip points.

The fact that we currently don't have a way to declare the cold/min
temperature values, makes the thermal subsystem to set the low value as
-INT_MAX.
For hwmon drivers that doesn't use clamp_value and actually reject
invalid values for the trip point, this results in the hwmon settings to
be rejected.

To permit to pass the correct range of trip point, permit to set in DT
also cold and critical_cold trip point.

Thermal driver may also define .cold and .critical_cold to act on these
trip point tripped and apply the required measure.

Agree with the feature but we need to clarify the semantic of the trip
points first. What actions do we expect for them in order to have like a
mirror reflection of the existing hot trip points.

What action do you expect with:

- 'cold' ?

- 'critical_cold' ?



This is more of a sensible topic but I think it's the thermal driver
that needs to implement these. As said in the commit description,
examples are setting higher voltage from the attached regulator,
enabling some hardware heater.

So how is it different from an active trip point? There are heating
rather than cooling devices associated with it, but other than this??


From what I read from documentation, active trip point are used to
trigger cooling-device. Cold (and crit_cold) are to setup trip point to
the device. The device will normally trigger an interrupt (or even
internally with the correct register set autonomously apply some measure
to handle the problem)

Actually what specifies an active cooling device is it requires energy in order to operate. More precisely, the goal of an active cooling device is too move the heat from one place to another place. So the system, instead of relying on the natural convection thermal transfer, will force this transfer. So the "active" means external system + energy.

In theory it's possible to have passive trip point for cold condition
but still we lack any definition of the lower spectrum of the trip
point.

Yes, absolutely :) And that is why I think we should clarify that to conform to the general semantic of the thermal management. If we define things in the thermal framework but having a different meaning in the thermal management vocabulary. That will be really odd and look amateur work :)

In the lower spectrum, an external warming device where we use energy to provide some heat is active. But if we use some kind of software solution (like what suggested before), we indeed use energy, but the solution is internal to the system, so I do believe we can consider it as passive.

IMO, we should see, especially on mobile, passive trip point for too hot, and active trip point for too cold. That would not surprising as the former has too much energy generated and the latter not enough energy.

(BTW, as a side note, active or passive trip points do not really make sense to me. It is the mitigation devices which are active or passive).

For the systems which do not have a dedicated warming up hardware, we should implement a "warming device" as a passive one (which is a different story from your proposal I agree).


Maybe with critical cold bigger measure can be applied. Currently for
critical trip point we shutdown the system (if the critical ops is not
declared) but with critical_cold condition I think it won't work... I
expect a system in -40°C would just lock up/glitch so rebooting in that
condition won't change a thing...

Anyway yes we can define a shutdown by default for that but IMHO it
wouldn't make much sense.

So why do you want to add it at all?

Again it's really to fill a hole we have from a long time... One example
is the qcom tsens driver that have trip point for cold and crit_cold.
Those in theory can be set in DT with the trip point but we lack any
definition for them. (using passive trip point would be confusing IMHO)

Another example is an Aquantia PHY that have register for the cold and
critical_cold trip point.

Currently defining a critical trip point for the negative temp results
in the system shutdown.

Yes, and the more I think about it, the more I'm inclined to have:

* trip (active|passive) + hot|cold
* trip cold (meaning "really too cold")
* trip hot (meaning "really too hot")
* trip critical (meaning "I'm about to collapse")

We may have also active devices with multiple level of warm up, so it will need to be managed by a governor like the step wise.


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog