Re: [PATCH V4] thermal/core/power_allocator: reset thermal governor when trip point is changed

From: Lukasz Luba
Date: Fri Jun 23 2023 - 13:34:26 EST

On 6/23/23 17:55, Rafael J. Wysocki wrote:
On Fri, Jun 23, 2023 at 9:43 AM Lukasz Luba <lukasz.luba@xxxxxxx> wrote:

[snip]


I agree, the patch header doesn't explain that properly. Here is the
explanation for this Intelligent Power Allocator (IPA):

The IPA controls temperature using a PID mechanism. It's a closed
feedback loop. That algorithm can 'learn' from the reaction 'observed'
in the past to its control decisions, and it accumulates that
information in the part called 'error integral'. Those accumulated
'error' gaps are the differences between the set target value and the
actually achieved value. In our case the target value is the target
temperature, which comes from the trip point. That part is then used
with the 'I' (of PID) component, so we can compensate for those
'learned' mistakes.
Now, when you change the target temperature value, all your previously
learned errors won't help you. That's why the Intelligent Power Allocator
should reset its previously accumulated history.
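
To illustrate why stale history hurts, here is a minimal, self-contained
sketch of a PID step with an error integral. It is not the kernel's IPA
code: struct pid_state, pid_step() and pid_reset() are hypothetical names,
and the real governor has extra details (e.g. integral cutoff) left out.

```c
#include <stdint.h>

/* Hypothetical, simplified controller state; not the kernel's struct. */
struct pid_state {
	int64_t err_integral;	/* accumulated (target - measured) error */
	int32_t prev_err;	/* error from the previous sample (D term) */
};

/* One control step: returns the sum of the P, I and D contributions. */
static int64_t pid_step(struct pid_state *s, int32_t target_temp,
			int32_t cur_temp, int32_t k_p, int32_t k_i,
			int32_t k_d)
{
	int32_t err = target_temp - cur_temp;
	int64_t p = (int64_t)k_p * err;
	int64_t d = (int64_t)k_d * (err - s->prev_err);
	int64_t i;

	/* The 'learned' part: errors measured against the current target. */
	s->err_integral += err;
	i = (int64_t)k_i * s->err_integral;

	s->prev_err = err;
	return p + i + d;
}

/* Drop history that was gathered against the old target temperature. */
static void pid_reset(struct pid_state *s)
{
	s->err_integral = 0;
	s->prev_err = 0;
}
```

If the target moves from 80000 to 90000 while the integral still holds
errors measured against 80000, the 'I' term pulls the output the wrong
way until the old history drains. Calling pid_reset() first avoids that.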

Right.

And every other governor using information from the past for control
will have an analogous problem, won't it?

Not necessarily, but to play safe I would go case-by-case and make
sure other governors are aligned to this new feature.

E.g. the bang-bang governor operates only on the current temperature and
the current trip value + trip hysteresis. The flow graph describes it [1].
The control (state of the fan: ON or OFF) of that governor could be
simply adjusted to the new reality -> new trip point temp. That would
just mean 'toggling' the fan if needed. There are only 2 'target'
states: 0 or 1 for the fan. You can imagine a situation where the
temperature doesn't change, but we manipulate the trip value for that
governor. IMO the governor would always react correctly in such a
situation, w/o the need for a reset.
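
As a rough sketch of that argument (loosely modeled on the flow graph
in [1], not copied from the kernel), the decision can be written as a
function of current inputs only. bang_bang_next() is a hypothetical helper.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: next fan state from the current fan state,
 * current temperature, trip temperature and hysteresis. There is no
 * accumulated history, so a changed trip_temp is picked up on the
 * very next evaluation.
 */
static bool bang_bang_next(bool fan_on, int temp, int trip_temp, int hyst)
{
	if (!fan_on && temp >= trip_temp)
		return true;		/* crossed the trip: turn fan on */
	if (fan_on && temp <= trip_temp - hyst)
		return false;		/* dropped below trip - hyst: turn off */
	return fan_on;			/* inside the hysteresis band: keep */
}
```

With the temperature held at 60000, moving the trip from 70000 down to
55000 just flips the result from OFF to ON at the next evaluation.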


For the 2nd case, IIUC the code, we pass the 'trip.temperature'
and should be ready for what you said (modification of that value).

Generally speaking, it needs to be prepared for a simultaneous change
of multiple trip points (including active), in which case it may not
be useful to invoke the ->reset() callback for each of them
individually.

Although, that looks cleaner IMO. Resetting one by one in
temperature order would be easily maintainable, wouldn't it?

I wouldn't call it maintainable really.

First of all, the trips may not be ordered. There are no guarantees
whatsoever that they will be ordered, so the caller may have to
determine the temperature order in the first place. This would be an
extra requirement that currently is not there.

Apart from this, I don't see any fundamental difference between the
case when trip points are updated via sysfs and when they are updated
by the driver. The governor should reset itself in any of those cases
and even if one trip point changes, the temperature order of all of
them may change, so the governor reset mechanism should be able to
handle the case when multiple trip points are updated at the same
time. While you may argue that this is theoretical, the ACPI spec
clearly states that this is allowed to happen, for example.

If you want a generic reset callback for governors, that's fine, but
make it generic and not specific to a particular use case.

I think we agree here, but probably have slightly different
implementations in mind. Based on your explanation I think you
simply want this API:
void (*reset)(struct thermal_zone_device *tz);

1. no return value
2. no specific trip ID as argument
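
To make that shape concrete, here is a small userspace sketch. Only the
reset() signature comes from the discussion above; the cut-down structs,
struct governor_sketch, ipa_like_reset() and zone_trips_changed() are
hypothetical stand-ins, not the kernel definitions.

```c
#include <stddef.h>

struct thermal_zone_device;	/* forward declaration */

/* Hypothetical, cut-down governor ops; the real struct has more members. */
struct governor_sketch {
	const char *name;
	/* Optional: no return value, no trip ID argument. */
	void (*reset)(struct thermal_zone_device *tz);
};

/* Hypothetical, cut-down zone standing in for the kernel structure. */
struct thermal_zone_device {
	const struct governor_sketch *governor;
	long long err_integral;	/* stand-in for governor private data */
};

/* A governor with history clears all of it, whichever trips changed. */
static void ipa_like_reset(struct thermal_zone_device *tz)
{
	tz->err_integral = 0;
}

/* Called once after any number of trip points have been updated. */
static void zone_trips_changed(struct thermal_zone_device *tz)
{
	if (tz->governor && tz->governor->reset)
		tz->governor->reset(tz);
}
```

A governor without history simply leaves reset as NULL, and one call
covers a simultaneous update of several trip points.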

Do you agree?

Yes, I do.

OK, thanks.

Di, could you implement that 'reset()' API according to this description,
please?


IMO such an implementation and API would also work for this IPA
purpose. Would that work for the ACPI use case as well?

It would AFAICS.

Thanks Rafael for the comments and the progress that we made :)

Regards,
Lukasz

[1] https://elixir.bootlin.com/linux/v6.3/source/drivers/thermal/gov_bang_bang.c#L80