Re: [PATCH 2/5] thermal: devfreq_cooling: get a copy of device status

From: Lukasz Luba
Date: Thu Oct 22 2020 - 07:45:48 EST


Hi Daniel,

On 10/14/20 3:34 PM, Daniel Lezcano wrote:
On 21/09/2020 14:20, Lukasz Luba wrote:
Devfreq cooling needs to know the correct status of the device in order
to operate. Do not rely on Devfreq last_status, which might be stale data,
and get more up-to-date values of the load.

Devfreq framework can change the device status in the background. To
mitigate this situation make a copy of the status structure and use it
for internal calculations.

In addition, this patch adds a normalization function, which also makes
sure that whatever data comes from the device is in a sane range.

Signed-off-by: Lukasz Luba <lukasz.luba@xxxxxxx>
---
drivers/thermal/devfreq_cooling.c | 52 +++++++++++++++++++++++++------
1 file changed, 43 insertions(+), 9 deletions(-)

diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
index 7063ccb7b86d..cf045bd4d16b 100644
--- a/drivers/thermal/devfreq_cooling.c
+++ b/drivers/thermal/devfreq_cooling.c
@@ -227,6 +227,24 @@ static inline unsigned long get_total_power(struct devfreq_cooling_device *dfc,
voltage);
}
+static void _normalize_load(struct devfreq_dev_status *status)
+{
+	/* Make some space if needed */
+	if (status->busy_time > 0xffff) {
+		status->busy_time >>= 10;
+		status->total_time >>= 10;
+	}
+
+	if (status->busy_time > status->total_time)
+		status->busy_time = status->total_time;
+
+	status->busy_time *= 100;
+	status->busy_time /= status->total_time ? : 1;
+
+	/* Avoid division by 0 */
+	status->busy_time = status->busy_time ? : 1;
+	status->total_time = 100;
+}

Not sure that works if the devfreq governor is not on-demand.

Is it possible to use the energy model directly in devfreq to compute
the energy consumption given the state transitions since the last reading ?

This change is actually trying to create a safety net for what we do.

In the original code we take last_status directly:
- struct devfreq_dev_status *status = &df->last_status;

Then we simply multiply by 'busy_time' and divide by 'total_time',
without any checks. The values might be huge, or zero, etc.
_normalize_load() introduces this safety.
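
To illustrate the effect of the normalization, here is a minimal userspace
sketch of the same steps. It is not the kernel code: 'struct dev_status'
is a hypothetical stand-in for the two fields of devfreq_dev_status used
here, and the GNU '?:' shorthand from the patch is spelled out in
standard C. The shift by 10 presumably leaves headroom for the
multiplication by 100.

```c
/* Hypothetical stand-in for the fields of struct devfreq_dev_status. */
struct dev_status {
	unsigned long total_time;
	unsigned long busy_time;
};

/* Userspace mirror of _normalize_load(): scale the load to 1..100 out of 100. */
void normalize_load(struct dev_status *status)
{
	/* Scale both counters down so the multiplication below has headroom */
	if (status->busy_time > 0xffff) {
		status->busy_time >>= 10;
		status->total_time >>= 10;
	}

	/* A device cannot be busy for longer than the measured period */
	if (status->busy_time > status->total_time)
		status->busy_time = status->total_time;

	/* Convert to a percentage, guarding against total_time == 0 */
	status->busy_time *= 100;
	status->busy_time /= status->total_time ? status->total_time : 1;

	/* Never report 0, so later code can safely divide by busy_time */
	status->busy_time = status->busy_time ? status->busy_time : 1;
	status->total_time = 100;
}
```

With this sketch, busy_time = 50 and total_time = 200 become 25 and 100,
an all-zero status becomes 1 and 100, and a busy_time larger than
total_time is clamped to 100 out of 100.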

Apart from that, simply taking a pointer to &df->last_status does
not protect us from:
- working on a struct which might be modified in the background at the
same time, which is not safe
- working on a struct which might not have been updated for a long time,
because devfreq has not checked the device for a while (there are two
polling modes in devfreq)

So we take the mutex, then trigger the device status check and
make a copy of the newest data. That is safer.
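
The lock, refresh, copy pattern can be sketched in userspace roughly as
follows. This only illustrates the idea and is not the patch itself:
'struct fake_devfreq', 'snapshot_status()' and 'example_get_status()'
are hypothetical names, and a pthread mutex stands in for the devfreq
lock.

```c
#include <pthread.h>

/* Hypothetical stand-in for struct devfreq_dev_status. */
struct dev_status {
	unsigned long total_time;
	unsigned long busy_time;
};

/* Hypothetical stand-in for a devfreq device with its lock and callback. */
struct fake_devfreq {
	pthread_mutex_t lock;
	struct dev_status last_status;
	int (*get_dev_status)(struct fake_devfreq *df, struct dev_status *out);
};

/* Example callback: pretends to read fresh counters from the hardware. */
int example_get_status(struct fake_devfreq *df, struct dev_status *out)
{
	(void)df;
	out->total_time = 120;
	out->busy_time = 30;
	return 0;
}

/*
 * Refresh the status under the lock and hand back a private copy, so the
 * caller never works on data that may change in the background.
 */
int snapshot_status(struct fake_devfreq *df, struct dev_status *copy)
{
	int ret;

	pthread_mutex_lock(&df->lock);
	ret = df->get_dev_status(df, &df->last_status);
	if (!ret)
		*copy = df->last_status;
	pthread_mutex_unlock(&df->lock);

	return ret;
}
```

The caller then does all of its calculations on the copy, so a later
update of last_status cannot change the values in the middle of a
computation.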

I think this can be treated as a fix, not a feature.


The power will be read directly from devfreq which will return:

nrj + (current_power * (jiffies - last_update)) / (jiffies - last_update)

The devfreq cooling device driver would result in much simpler code, no?

This is something that I would like to address after the EM changes are
merged. That would be the next step: estimating the power by taking
more information into consideration. This patch series just tries to
make it possible to use EM. The model improvements would come next.

Thank you Daniel for your review.

Regards,
Lukasz