Re: [PATCH] thermal/core: Don't update trip points inside the hysteresis range

From: Rafael J. Wysocki
Date: Tue Aug 22 2023 - 06:58:33 EST


On Tue, Aug 22, 2023 at 12:25 AM Nícolas F. R. A. Prado
<nfraprado@xxxxxxxxxxxxx> wrote:
>
> On Mon, Aug 21, 2023 at 11:10:27PM +0200, Rafael J. Wysocki wrote:
> > On Mon, Jul 3, 2023 at 7:15 PM Nícolas F. R. A. Prado
> > <nfraprado@xxxxxxxxxxxxx> wrote:
> > >
> > > When searching for the trip points that need to be set, the nearest trip
> > > point's temperature is used for the high trip, while the nearest trip
> > > point's temperature minus the hysteresis is used for the low trip. The
> > > issue with this logic is that when the current temperature is inside a
> > > trip point's hysteresis range, both high and low trips will come from
> > > the same trip point. As a consequence instability can still occur like
> > > this:
> > > * the temperature rises slightly and enters the hysteresis range of a
> > >   trip point
> > > * polling happens and updates the trip points to the hysteresis range
> > > * the temperature falls slightly, exiting the hysteresis range, crossing
> > >   the trip point and triggering an IRQ, the trip points are updated
> > > * repeat
> > >
> > > So even though the current hysteresis implementation prevents
> > > instability from happening due to IRQs triggering on the same
> > > temperature value, both ways, it doesn't prevent it from happening due
> > > to an IRQ on one way and polling on the other.
> > >
> > > To properly implement a hysteresis behavior, when inside the hysteresis
> > > range, don't update the trip points. This way, the previously set trip
> > > points will stay in effect, which will in a way remember the previous
> > > state (if the temperature signal came from above or below the range) and
> > > therefore have the right trip point already set. The exception is if
> > > there was no previous trip point set, in which case a previous state
> > > doesn't exist, and so it's sensible to allow the hysteresis range as
> > > trip points.
> > >
> > > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@xxxxxxxxxxxxx>
> > >
> > > ---
> > >
> > > drivers/thermal/thermal_trip.c | 21 +++++++++++++++++++--
> > > 1 file changed, 19 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c
> > > index 907f3a4d7bc8..c386ac5d8bad 100644
> > > --- a/drivers/thermal/thermal_trip.c
> > > +++ b/drivers/thermal/thermal_trip.c
> > > @@ -57,6 +57,7 @@ void __thermal_zone_set_trips(struct thermal_zone_device *tz)
> > >  {
> > >  	struct thermal_trip trip;
> > >  	int low = -INT_MAX, high = INT_MAX;
> > > +	int low_trip_id = -1, high_trip_id = -2;
> > >  	int i, ret;
> > >
> > >  	lockdep_assert_held(&tz->lock);
> > > @@ -73,18 +74,34 @@ void __thermal_zone_set_trips(struct thermal_zone_device *tz)
> > >
> > >  		trip_low = trip.temperature - trip.hysteresis;
> > >
> > > -		if (trip_low < tz->temperature && trip_low > low)
> > > +		if (trip_low < tz->temperature && trip_low > low) {
> > >  			low = trip_low;
> > > +			low_trip_id = i;
> > > +		}
> > >
> >
> > I think I get the idea, but wouldn't a similar effect be achieved by
> > adding an "else" here?
>
> No. That would only fix the problem in one direction, namely, when the
> temperature entered the hysteresis range from above. But when the temperature
> entered the range from below, we'd need to check the high threshold first to
> achieve the same result.
>
> The way I've implemented it here is the simplest I could think of that works
> for both directions.
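
To make the direction dependence concrete, here is a minimal userspace model
of the selection loop. It is not kernel code: all model_* names are made up
for illustration, and use_else is only one reading of the "else" suggestion
above, namely turning the second check into an else branch of the first.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel structures, for illustration only. */
struct model_trip {
	int temperature;
	int hysteresis;
};

struct model_window {
	int low;
	int high;
};

/*
 * Pick the low/high thresholds around "temp" the way the loop in
 * __thermal_zone_set_trips() does. With use_else set, the high check is
 * skipped for any trip that has just supplied the low threshold.
 */
static struct model_window
model_pick_window(const struct model_trip *trips, int num_trips, int temp,
		  bool use_else)
{
	struct model_window w = { .low = -INT_MAX, .high = INT_MAX };
	int i;

	for (i = 0; i < num_trips; i++) {
		int trip_low = trips[i].temperature - trips[i].hysteresis;

		if (trip_low < temp && trip_low > w.low) {
			w.low = trip_low;
			if (use_else)
				continue;
		}

		if (trips[i].temperature > temp &&
		    trips[i].temperature < w.high)
			w.high = trips[i].temperature;
	}

	return w;
}
```

With trips at 50000 and 80000 (hysteresis 5000 each) and a temperature of
47000, the unmodified loop yields low = 45000 and high = 50000, both from the
first trip. With use_else set it yields low = 45000 and high = 80000. That
matches the window already in place when the temperature came from above, but
it drops the 50000 threshold that was programmed when it came from below.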

Well, what about the replacement patch below (untested)?
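
In case it helps with checking the logic before running it on hardware, the
patch below can be approximated in userspace roughly as follows. This is a
sketch only: struct sketch_zone, struct sketch_trip and sketch_set_trips()
are hypothetical stand-ins, the bool return value is added purely for
illustration, and -INT_MAX/INT_MAX are assumed to be the initial values of
prev_low_trip/prev_high_trip.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not the kernel structures. */
struct sketch_trip {
	int temperature;
	int hysteresis;
};

struct sketch_zone {
	int temperature;
	int prev_low_trip;	/* assumed to start at -INT_MAX */
	int prev_high_trip;	/* assumed to start at INT_MAX */
};

/* Returns true if the stored low/high window was changed. */
static bool sketch_set_trips(struct sketch_zone *tz,
			     const struct sketch_trip *trips, int num_trips)
{
	int low = -INT_MAX, high = INT_MAX;
	bool same_trip = false;
	int i;

	for (i = 0; i < num_trips; i++) {
		bool low_set = false;
		int trip_low = trips[i].temperature - trips[i].hysteresis;

		if (trip_low < tz->temperature && trip_low > low) {
			low = trip_low;
			low_set = true;
			same_trip = false;
		}

		if (trips[i].temperature > tz->temperature &&
		    trips[i].temperature < high) {
			high = trips[i].temperature;
			/* True only if this trip also supplied "low". */
			same_trip = low_set;
		}
	}

	/* No need to change trip points */
	if (tz->prev_low_trip == low && tz->prev_high_trip == high)
		return false;

	/* Both thresholds from one trip: keep the old window unless unset. */
	if (same_trip && (tz->prev_low_trip != -INT_MAX ||
			  tz->prev_high_trip != INT_MAX))
		return false;

	tz->prev_low_trip = low;
	tz->prev_high_trip = high;
	return true;
}
```

With trips at 50000 and 80000 (hysteresis 5000 each), stepping the
temperature through 40000, 47000, 55000 and 47000 leaves the window untouched
on both visits to 47000, while a zone that starts out at 47000 with no
previous window still gets low = 45000 and high = 50000.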

---
drivers/thermal/thermal_trip.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)

Index: linux-pm/drivers/thermal/thermal_trip.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_trip.c
+++ linux-pm/drivers/thermal/thermal_trip.c
@@ -55,6 +55,7 @@ void __thermal_zone_set_trips(struct the
 {
 	struct thermal_trip trip;
 	int low = -INT_MAX, high = INT_MAX;
+	bool same_trip = false;
 	int i, ret;

 	lockdep_assert_held(&tz->lock);
@@ -63,6 +64,7 @@ void __thermal_zone_set_trips(struct the
 		return;

 	for (i = 0; i < tz->num_trips; i++) {
+		bool low_set = false;
 		int trip_low;

 		ret = __thermal_zone_get_trip(tz, i , &trip);
@@ -71,18 +73,31 @@ void __thermal_zone_set_trips(struct the

 		trip_low = trip.temperature - trip.hysteresis;

-		if (trip_low < tz->temperature && trip_low > low)
+		if (trip_low < tz->temperature && trip_low > low) {
 			low = trip_low;
+			low_set = true;
+			same_trip = false;
+		}

 		if (trip.temperature > tz->temperature &&
-		    trip.temperature < high)
+		    trip.temperature < high) {
 			high = trip.temperature;
+			same_trip = low_set;
+		}
 	}

 	/* No need to change trip points */
 	if (tz->prev_low_trip == low && tz->prev_high_trip == high)
 		return;

+	/*
+	 * If "high" and "low" are the same, skip the change unless this is the
+	 * first time.
+	 */
+	if (same_trip && (tz->prev_low_trip != -INT_MAX ||
+	    tz->prev_high_trip != INT_MAX))
+		return;
+
 	tz->prev_low_trip = low;
 	tz->prev_high_trip = high;