Re: [PATCH V3] thermal/core/power_allocator: avoid thermal cdev can not be reset

From: Di Shen
Date: Sun Apr 09 2023 - 22:10:09 EST


Hi Lukasz,
Could you please apply this patch if there's no more comment? Thank you.

Best regards,
Di

On Mon, Mar 20, 2023 at 8:29 PM Lukasz Luba <lukasz.luba@xxxxxxx> wrote:
>
>
>
> On 3/20/23 09:56, Di Shen wrote:
> > Commit <0952177f2a1f>(thermal/core/power_allocator: Update once
> > cooling devices when temp is low) adds a update flag to avoid
> > the thermal event is triggered when there is no need, and
> > thermal cdev would be update once when temperature is low.
> >
> > But when the trips are writable, and switch_on_temp is set
> > to be a higher value, the cooling device state may not be
> > reset to 0, because last_temperature is smaller than the
> > switch_on_temp.
> >
> > For example:
> > First:
> > switch_on_temp=70 control_temp=85;
> > Then userspace change the trip_temp:
> > switch_on_temp=45 control_temp=55 cur_temp=54
> >
> > Then userspace reset the trip_temp:
> > switch_on_temp=70 control_temp=85 cur_temp=57 last_temp=54
> >
> > At this time, the cooling device state should be reset to 0.
> > However, because cur_temp(57) < switch_on_temp(70)
> > last_temp(54) < switch_on_temp(70) ----> update = false,
> > update is false, the cooling device state can not be reset.
> >
> > This patch adds a function thermal_cdev_needs_update() to
> > renew the update flag value only when the trips are writable,
> > so that thermal cdev->state can be reset after switch_on_temp
> > changed from low to high.
> >
> > Fixes: <0952177f2a1f> (thermal/core/power_allocator: Update once cooling devices when temp is low)
> > Signed-off-by: Di Shen <di.shen@xxxxxxxxxx>
> >
> > ---
> > V3:
> > - Add fix tag.
> >
> > V2:
> > - Compared to v1, do not revert.
> >
> > - Add a variable(last_switch_on_temp) in power_allocator_params
> > to record the last switch_on_temp value.
> >
> > - Adds a function to renew the update flag and update the
> > last_switch_on_temp when thermal trips are writable.
> >
> > V1:
> > - Revert commit 0952177f2a1f.
> > ---
> > ---
> > drivers/thermal/gov_power_allocator.c | 39 ++++++++++++++++++++++-----
> > 1 file changed, 33 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
> > index 0eaf1527d3e3..c9e1f3b15f15 100644
> > --- a/drivers/thermal/gov_power_allocator.c
> > +++ b/drivers/thermal/gov_power_allocator.c
> > @@ -59,6 +59,8 @@ static inline s64 div_frac(s64 x, s64 y)
> > * governor switches on when this trip point is crossed.
> > * If the thermal zone only has one passive trip point,
> > * @trip_switch_on should be INVALID_TRIP.
> > + * @last_switch_on_temp:Record the last switch_on_temp only when trips
> > + are writable.
> > * @trip_max_desired_temperature: last passive trip point of the thermal
> > * zone. The temperature we are
> > * controlling for.
> > @@ -70,6 +72,9 @@ struct power_allocator_params {
> > s64 err_integral;
> > s32 prev_err;
> > int trip_switch_on;
> > +#ifdef CONFIG_THERMAL_WRITABLE_TRIPS
> > + int last_switch_on_temp;
> > +#endif
> > int trip_max_desired_temperature;
> > u32 sustainable_power;
> > };
> > @@ -554,6 +559,25 @@ static void get_governor_trips(struct thermal_zone_device *tz,
> > }
> > }
> >
> > +#ifdef CONFIG_THERMAL_WRITABLE_TRIPS
> > +static bool thermal_cdev_needs_update(struct thermal_zone_device *tz, int switch_on_temp)
> > +{
> > + bool update;
> > + struct power_allocator_params *params = tz->governor_data;
> > + int last_switch_on_temp = params->last_switch_on_temp;
> > +
> > + update = (tz->last_temperature >= last_switch_on_temp);
> > + params->last_switch_on_temp = switch_on_temp;
> > +
> > + return update;
> > +}
> > +#else
> > +static inline bool thermal_cdev_needs_update(struct thermal_zone_device *tz, int switch_on_temp)
> > +{
> > + return false;
> > +}
> > +#endif
> > +
> > static void reset_pid_controller(struct power_allocator_params *params)
> > {
> > params->err_integral = 0;
> > @@ -709,12 +733,15 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip_id)
> > return 0;
> >
> > ret = __thermal_zone_get_trip(tz, params->trip_switch_on, &trip);
> > - if (!ret && (tz->temperature < trip.temperature)) {
> > - update = (tz->last_temperature >= trip.temperature);
> > - tz->passive = 0;
> > - reset_pid_controller(params);
> > - allow_maximum_power(tz, update);
> > - return 0;
> > + if (!ret) {
> > + update = thermal_cdev_needs_update(tz, trip.temperature);
> > + if (tz->temperature < trip.temperature) {
> > + update |= (tz->last_temperature >= trip.temperature);
> > + tz->passive = 0;
> > + reset_pid_controller(params);
> > + allow_maximum_power(tz, update);
> > + return 0;
> > + }
> > }
> >
> > tz->passive = 1;
>
>
> Thanks for the patch. The code looks good. The initial value of
> 'last_switch_on_temp' would be set to 0. It won't harm us because it
> will get the proper value later.
>
> Reviewed-by: Lukasz Luba <lukasz.luba@xxxxxxx>