Re: [PATCH v2] thermal: core: Add trip thresholds for trip crossing detection

From: srinivas pandruvada
Date: Fri Nov 03 2023 - 11:42:28 EST


On Fri, 2023-11-03 at 15:56 +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> The trip crossing detection in handle_thermal_trip() does not work
> correctly in the cases when a trip point is crossed on the way up and
> then the zone temperature stays above its low temperature (that is,
> its temperature decreased by its hysteresis).  The trip temperature
> may be passed by the zone temperature subsequently in that case, even
> multiple times, but that does not count as the trip crossing as long
> as the zone temperature does not fall below the trip's low temperature
> or, in other words, until the trip is crossed on the way down.

In other words, you want to avoid multiple trip UP notifications without
a corresponding DOWN notification.

This will reduce unnecessary noise to user space. Is this the
intention?
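
A minimal userspace sketch of the pre-patch check, as it reads in the
quoted diff, may make the question concrete. The names old_trip_check()
and old_trip_run(), the counters standing in for the notifications, and
the sample values are made up for illustration; TEMP_INVALID only
mirrors THERMAL_TEMP_INVALID and its value is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for THERMAL_TEMP_INVALID; the value is an assumption. */
#define TEMP_INVALID (-274000)

/* Hypothetical counters replacing thermal_notify_tz_trip_up()/_down(). */
struct old_counts {
	int up;
	int down;
};

/* Pre-patch comparison: both checks use the trip temperature itself. */
void old_trip_check(int last, int now, int temp, int hyst,
		    struct old_counts *c)
{
	if (last == TEMP_INVALID)
		return;

	if (last < temp && now >= temp)
		c->up++;

	if (last >= temp && now < temp - hyst)
		c->down++;
}

/* Feed a series of zone temperatures through the pre-patch check. */
struct old_counts old_trip_run(const int *samples, size_t n,
			       int temp, int hyst)
{
	struct old_counts c = { 0, 0 };
	int last = TEMP_INVALID;
	size_t i;

	assert(samples != NULL || n == 0);

	for (i = 0; i < n; i++) {
		old_trip_check(last, samples[i], temp, hyst, &c);
		last = samples[i];
	}
	return c;
}
```

With a trip at 50000, a hysteresis of 5000 and the samples 40000, 51000,
48000, 52000, 47000, 53000, this counts three UP events and no DOWN
event, because the zone never drops below 45000.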

Thanks,
Srinivas

>
> -----------low--------high------------|
>              |<--------->|
>              |    hyst   |
>              |           |
>              |          -|--> crossed on the way up
>              |
>          <---|-- crossed on the way down
>
> However, handle_thermal_trip() will invoke thermal_notify_tz_trip_up()
> every time the trip temperature is passed by the zone temperature on
> the way up regardless of whether or not the trip has been crossed on
> the way down yet.  Moreover, it will not call
> thermal_notify_tz_trip_down() if the last zone temperature was between
> the trip's temperature and its low temperature, so some "trip crossed
> on the way down" events may not be reported.
>
> To address this issue, introduce trip thresholds equal to either the
> temperature of the given trip, or its low temperature, such that if
> the trip's threshold is passed by the zone temperature on the way up,
> its value will be set to the trip's low temperature and
> thermal_notify_tz_trip_up() will be called, and if the trip's
> threshold is passed by the zone temperature on the way down, its value
> will be set to the trip's temperature (high) and
> thermal_notify_tz_trip_down() will be called.  Accordingly, if the
> threshold is passed on the way up, it cannot be passed on the way up
> again until it is passed on the way down and if it is passed on the
> way down, it cannot be passed on the way down again until it is passed
> on the way up, which guarantees correct triggering of trip crossing
> notifications.
>
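
For reference, a minimal userspace sketch of the threshold scheme
described above, assuming it matches the handle_thermal_trip() hunk
further down. struct sketch_trip, sketch_trip_update() and
sketch_trip_run() are hypothetical stand-ins and not kernel API; the
notifications are replaced by counters, and the TEMP_INVALID value is an
assumption mirroring THERMAL_TEMP_INVALID.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for THERMAL_TEMP_INVALID; the value is an assumption. */
#define TEMP_INVALID (-274000)

/* Hypothetical userspace model of a trip with the new threshold field. */
struct sketch_trip {
	int temperature;
	int hysteresis;
	int threshold;
	int up;		/* counts would-be thermal_notify_tz_trip_up() calls */
	int down;	/* counts would-be thermal_notify_tz_trip_down() calls */
};

/* One update step: compare against the threshold, then flip it. */
void sketch_trip_update(struct sketch_trip *t, int last, int now)
{
	assert(t != NULL);

	if (last == TEMP_INVALID) {
		/* First sample: only pick the initial threshold. */
		t->threshold = t->temperature;
		if (now >= t->temperature)
			t->threshold -= t->hysteresis;
	} else if (last < t->threshold && now >= t->threshold) {
		t->up++;
		t->threshold = t->temperature - t->hysteresis;
	} else if (last >= t->threshold && now < t->threshold) {
		t->down++;
		t->threshold = t->temperature;
	}
}

/* Feed a series of zone temperatures through the update step. */
void sketch_trip_run(struct sketch_trip *t, const int *samples, size_t n)
{
	int last = TEMP_INVALID;
	size_t i;

	for (i = 0; i < n; i++) {
		sketch_trip_update(t, last, samples[i]);
		last = samples[i];
	}
}
```

With a trip at 50000, a hysteresis of 5000 and the samples 40000, 51000,
48000, 52000, 47000, 53000, 44000, this counts one UP event (at 51000)
and one DOWN event (at 44000), and the threshold ends up back at 50000.
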
> If the last temperature of the zone is invalid, the trip's threshold
> will be set depending on the zone's current temperature: If that
> temperature is above the trip's temperature, its threshold will be
> set to its low temperature or otherwise its threshold will be set to
> its (high) temperature.  Because the zone temperature is initially
> set to invalid and tz->last_temperature is only updated by
> update_temperature(), this is sufficient to set the correct initial
> threshold values for all trips.
>
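
The initial threshold choice can be sketched in isolation as below.
sketch_initial_threshold() is a hypothetical helper, not something the
patch adds. Note that the hunk further down compares with >=, so under
that reading a zone sitting exactly at the trip temperature also starts
out with the low threshold.

```c
#include <assert.h>

/*
 * Hypothetical helper: pick the starting threshold for a trip when the
 * zone has no valid last temperature yet.
 */
int sketch_initial_threshold(int trip_temp, int hyst, int zone_temp)
{
	assert(hyst >= 0);

	/* At or above the trip: wait for the low temperature to be crossed. */
	if (zone_temp >= trip_temp)
		return trip_temp - hyst;

	/* Below the trip: wait for the trip temperature to be crossed. */
	return trip_temp;
}
```
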
> Link:
> https://lore.kernel.org/all/20220718145038.1114379-4-daniel.lezcano@xxxxxxxxxx
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> ---
>
> v1 (RFC) -> v2: Add missing description of a new struct thermal_trip
> field.
>
> And because no comments have been sent for a week, this is not an RFC
> any more.
>
> ---
>  drivers/thermal/thermal_core.c |   21 ++++++++++++++-------
>  include/linux/thermal.h        |    2 ++
>  2 files changed, 16 insertions(+), 7 deletions(-)
>
> Index: linux-pm/drivers/thermal/thermal_core.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/thermal_core.c
> +++ linux-pm/drivers/thermal/thermal_core.c
> @@ -345,22 +345,29 @@ static void handle_critical_trips(struct
>  }
>  
>  static void handle_thermal_trip(struct thermal_zone_device *tz,
> -                               const struct thermal_trip *trip)
> +                               struct thermal_trip *trip)
>  {
>         if (trip->temperature == THERMAL_TEMP_INVALID)
>                 return;
>  
> -       if (tz->last_temperature != THERMAL_TEMP_INVALID) {
> -               if (tz->last_temperature < trip->temperature &&
> -                   tz->temperature >= trip->temperature)
> +       if (tz->last_temperature == THERMAL_TEMP_INVALID) {
> +               trip->threshold = trip->temperature;
> +               if (tz->temperature >= trip->temperature)
> +                       trip->threshold -= trip->hysteresis;
> +       } else {
> +               if (tz->last_temperature < trip->threshold &&
> +                   tz->temperature >= trip->threshold) {
>                         thermal_notify_tz_trip_up(tz->id,
>                                                   thermal_zone_trip_id(tz, trip),
>                                                   tz->temperature);
> -               if (tz->last_temperature >= trip->temperature &&
> -                   tz->temperature < trip->temperature - trip->hysteresis)
> +                       trip->threshold = trip->temperature - trip->hysteresis;
> +               } else if (tz->last_temperature >= trip->threshold &&
> +                          tz->temperature < trip->threshold) {
>                         thermal_notify_tz_trip_down(tz->id,
>                                                     thermal_zone_trip_id(tz, trip),
>                                                     tz->temperature);
> +                       trip->threshold = trip->temperature;
> +               }
>         }
>  
>         if (trip->type == THERMAL_TRIP_CRITICAL || trip->type == THERMAL_TRIP_HOT)
> @@ -403,7 +410,7 @@ static void thermal_zone_device_init(str
>  void __thermal_zone_device_update(struct thermal_zone_device *tz,
>                                   enum thermal_notify_event event)
>  {
> -       const struct thermal_trip *trip;
> +       struct thermal_trip *trip;
>  
>         if (atomic_read(&in_suspend))
>                 return;
> Index: linux-pm/include/linux/thermal.h
> ===================================================================
> --- linux-pm.orig/include/linux/thermal.h
> +++ linux-pm/include/linux/thermal.h
> @@ -57,12 +57,14 @@ enum thermal_notify_event {
>   * struct thermal_trip - representation of a point in temperature domain
>   * @temperature: temperature value in miliCelsius
>   * @hysteresis: relative hysteresis in miliCelsius
> + * @threshold: trip crossing notification threshold miliCelsius
>   * @type: trip point type
>   * @priv: pointer to driver data associated with this trip
>   */
>  struct thermal_trip {
>         int temperature;
>         int hysteresis;
> +       int threshold;
>         enum thermal_trip_type type;
>         void *priv;
>  };
>
>
>