Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

From: Punit Agrawal
Date: Thu Jan 29 2015 - 06:19:11 EST


Hi Eduardo,

Eduardo Valentin <edubezval@xxxxxxxxx> writes:

> Hello Javi,
>
> On Fri, Dec 05, 2014 at 07:04:17PM +0000, Javi Merino wrote:
>> Add a basic power model to the cpu cooling device to implement the
>> power cooling device API. The power model uses the current frequency,
>> current load and OPPs for the power calculations. The cpus must have
>> registered their OPPs using the OPP library.
>>
>> Cc: Zhang Rui <rui.zhang@xxxxxxxxx>
>> Cc: Eduardo Valentin <edubezval@xxxxxxxxx>
>> Signed-off-by: Punit Agrawal <punit.agrawal@xxxxxxx>
>> Signed-off-by: Javi Merino <javi.merino@xxxxxxx>
>
> <big cut>
>
>> +
>> +/**
>> + * get_load() - get load for a cpu since last updated
>> + * @cpufreq_device: &struct cpufreq_cooling_device for this cpu
>> + * @cpu: cpu number
>> + *
>> + * Return: The average load of cpu @cpu in percentage since this
>> + * function was last called.
>> + */
>> +static u32 get_load(struct cpufreq_cooling_device *cpufreq_device, int cpu)
>> +{
>> +	u32 load;
>> +	u64 now, now_idle, delta_time, delta_idle;
>> +
>> +	now_idle = get_cpu_idle_time(cpu, &now, 0);
>> +	delta_idle = now_idle - cpufreq_device->time_in_idle[cpu];
>> +	delta_time = now - cpufreq_device->time_in_idle_timestamp[cpu];
>> +
>> +	if (delta_time <= delta_idle)
>> +		load = 0;
>> +	else
>> +		load = div64_u64(100 * (delta_time - delta_idle), delta_time);
>> +
>> +	cpufreq_device->time_in_idle[cpu] = now_idle;
>> +	cpufreq_device->time_in_idle_timestamp[cpu] = now;
>> +
>> +	return load;
>> +}
>
> <cut>
>
>>
>> +/**
>> + * cpufreq_get_actual_power() - get the current power
>> + * @cdev: &thermal_cooling_device pointer
>> + *
>> + * Return the current power consumption of the cpus in milliwatts.
>> + */
>> +static u32 cpufreq_get_actual_power(struct thermal_cooling_device *cdev)
>> +{
>> +	unsigned long freq;
>> +	int cpu;
>> +	u32 static_power, dynamic_power, total_load = 0;
>> +	struct cpufreq_cooling_device *cpufreq_device = cdev->devdata;
>> +
>> +	freq = cpufreq_quick_get(cpumask_any(&cpufreq_device->allowed_cpus));
>> +
>> +	for_each_cpu(cpu, &cpufreq_device->allowed_cpus) {
>> +		u32 load;
>> +
>> +		if (cpu_online(cpu))
>> +			load = get_load(cpufreq_device, cpu);
>> +		else
>> +			load = 0;
>> +
>> +		total_load += load;
>> +	}
>> +
>> +	cpufreq_device->last_load = total_load;
>> +
>> +	static_power = get_static_power(cpufreq_device, freq);
>> +	dynamic_power = get_dynamic_power(cpufreq_device, freq);
>> +
>> +	return static_power + dynamic_power;
>> +}
>
> With respect to load computation vs. frequency usage vs. power
> estimation, while getting actual power for a given interval T. What if
> in interval T, we have used, say, 3 different cpu frequencies, and the
> load on the first was 50%, on the second 80%, and on the last frequency,
> the load was 60%, what should be the right load value for computing the
> actual power?
>
> I mean, we are using the total idle time for a given interval, but 1 -
> idle does not always seem to reflect the actual load on different OPPs,
> if the OPPs change over time within the interval T.

The value returned by cpufreq_get_actual_power() is used as an estimate
of the power the actor will request over the next window. Even though
the frequency might have changed during the previous period, the current
frequency reflects the latest information about the required
performance. As it is only an estimate, and to avoid making the power
calculations more complicated, we use utilisation (1 - idle time) over
the whole window to calculate the request. The estimate for the T+1
period becomes more accurate as the load stabilises.
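
To make the definition of utilisation concrete, here is a minimal
userspace sketch of the same arithmetic as get_load() in the patch.
The struct and function names (idle_sample, load_between) are
hypothetical and only stand in for the per-cpu bookkeeping; this is an
illustration, not the kernel code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-cpu values the patch stores. */
struct idle_sample {
	uint64_t idle;      /* cumulative idle time, in microseconds */
	uint64_t timestamp; /* time the sample was taken, in microseconds */
};

/* Load in percent between two samples: 100 * (1 - idle / elapsed). */
static uint32_t load_between(const struct idle_sample *prev,
			     const struct idle_sample *cur)
{
	uint64_t delta_time = cur->timestamp - prev->timestamp;
	uint64_t delta_idle = cur->idle - prev->idle;

	/* Also covers delta_time == 0, so the division below is safe. */
	if (delta_time <= delta_idle)
		return 0;

	return (uint32_t)(100 * (delta_time - delta_idle) / delta_time);
}
```

With 40 ms of idle time in a 100 ms window, load_between() returns 60,
regardless of how many frequency changes happened inside the window.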

In our testing on different workloads using 100ms as the polling period
for thermal control, we didn't see any problems arising from the use of
this definition of utilisation.

Having said that, there are a number of ways to improve the accuracy of
the power calculations. As part of investigating how improved model
accuracy affects thermal control and performance, we plan to look at
fine-grained frequency and load tracking once the initial set of patches
is merged.
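
As a rough illustration of the difference such tracking could make,
the sketch below compares the two approaches for a window split across
several frequencies. Everything in it is an assumption made for the
example: the freq_slice struct, both function names, and the premise
that dynamic power scales linearly with load. Static power is ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* One part of the polling window spent at a single frequency. */
struct freq_slice {
	uint32_t duration_ms;   /* time spent at this frequency */
	uint32_t load_pct;      /* load while at this frequency, 0..100 */
	uint32_t full_power_mw; /* hypothetical dynamic power at 100% load */
};

/* Time-weighted average power, accounting for each frequency separately. */
static uint32_t power_per_slice(const struct freq_slice *s, size_t n)
{
	uint64_t weighted = 0; /* sum of mW * pct * ms */
	uint64_t total_ms = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		weighted += (uint64_t)s[i].full_power_mw * s[i].load_pct *
			    s[i].duration_ms;
		total_ms += s[i].duration_ms;
	}

	return total_ms ? (uint32_t)(weighted / (100 * total_ms)) : 0;
}

/* Whole-window load applied to the last frequency only. */
static uint32_t power_last_freq(const struct freq_slice *s, size_t n)
{
	uint64_t busy = 0; /* sum of pct * ms */
	uint64_t total_ms = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		busy += (uint64_t)s[i].load_pct * s[i].duration_ms;
		total_ms += s[i].duration_ms;
	}

	if (!n || !total_ms)
		return 0;

	return (uint32_t)(s[n - 1].full_power_mw * busy / (100 * total_ms));
}
```

Taking the 50%, 80% and 60% loads from the question, with made-up
durations of 40, 30 and 30 ms and made-up full-load powers of 400, 1000
and 600 mW, power_per_slice() gives 428 mW and power_last_freq() gives
372 mW. The numbers are invented; the point is only that the two values
converge when the frequency stays the same across the window.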

Cheers,
Punit

>
> BR,
>
> Eduardo Valentin