Re: [PATCH] Thermal: fix iteration over CPU frequency list

From: Zhang Rui
Date: Fri Feb 01 2013 - 02:59:40 EST


On Thu, 2013-01-24 at 16:24 +0100, Gu1 wrote:
> In several places in the Thermal code, the CPU frequency list is iterated
> in an incorrect way, leading to endless loops when the frequency list contains
> a CPUFREQ_ENTRY_INVALID entry, which is the case by default in the Exynos
> 4x12 cpufreq driver, for example.
>
> The frequency list is iterated with a while loop, and when a
> CPUFREQ_ENTRY_INVALID entry is encountered, the continue; statement is used to
> skip it, but the index is not incremented, causing an endless loop.
>
> A similar bug was fixed by hongbo.zhang in commit:
> Thermal: fix bug of counting cpu frequencies
>
> Signed-off-by: Gu1 <gu1@xxxxxxxxxxxx>
> ---
> drivers/thermal/cpu_cooling.c | 8 +++-----
> drivers/thermal/exynos_thermal.c | 9 +++++----
> 2 files changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index 836828e..51acd26 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -123,7 +123,7 @@ static int is_cpufreq_valid(int cpu)
> */
> static unsigned int get_cpu_frequency(unsigned int cpu, unsigned long level)
> {
> - int ret = 0, i = 0;
> + int ret = 0, i;
> unsigned long level_index;
> bool descend = false;
> struct cpufreq_frequency_table *table =
> @@ -131,7 +131,7 @@ static unsigned int get_cpu_frequency(unsigned int cpu, unsigned long level)
> if (!table)
> return ret;
>
> - while (table[i].frequency != CPUFREQ_TABLE_END) {
> + for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) {
> if (table[i].frequency == CPUFREQ_ENTRY_INVALID)
> continue;
>
> @@ -145,7 +145,6 @@ static unsigned int get_cpu_frequency(unsigned int cpu, unsigned long level)
> /*return if level matched and table in descending order*/
> if (descend && i == level)
> return table[i].frequency;
> - i++;
> }
> i--;
>
> @@ -154,13 +153,12 @@ static unsigned int get_cpu_frequency(unsigned int cpu, unsigned long level)
> level_index = i - level;
>
> /*Scan the table in reverse order and match the level*/
> - while (i >= 0) {
> + for (; i >= 0; i--) {
> if (table[i].frequency == CPUFREQ_ENTRY_INVALID)
> continue;
> /*return if level matched*/
> if (i == level_index)
> return table[i].frequency;
> - i--;
> }
> return ret;
> }

So the "level" parameter is an index into the frequency table, right?

> diff --git a/drivers/thermal/exynos_thermal.c b/drivers/thermal/exynos_thermal.c
> index 224751e..fa9e1d7 100644
> --- a/drivers/thermal/exynos_thermal.c
> +++ b/drivers/thermal/exynos_thermal.c
> @@ -233,7 +233,8 @@ static int exynos_get_crit_temp(struct thermal_zone_device *thermal,
>
> static int exynos_get_frequency_level(unsigned int cpu, unsigned int freq)
> {
> - int i = 0, ret = -EINVAL;
> + int i, ret = -EINVAL;
> + unsigned int count = 0;
> struct cpufreq_frequency_table *table = NULL;
> #ifdef CONFIG_CPU_FREQ
> table = cpufreq_frequency_get_table(cpu);
> @@ -241,12 +242,12 @@ static int exynos_get_frequency_level(unsigned int cpu, unsigned int freq)
> if (!table)
> return ret;
>
> - while (table[i].frequency != CPUFREQ_TABLE_END) {
> + for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) {
> if (table[i].frequency == CPUFREQ_ENTRY_INVALID)
> continue;
> if (table[i].frequency == freq)
> - return i;
> - i++;
> + return count;
> + count++;
> }
> return ret;
> }

But here the invalid entries are not counted, so the level returned is no
longer an index into the table.

Take the following cpufreq table as an example, with your patch applied:
	entry	frequency
	0	2.4G
	1	invalid
	2	2G
	3	invalid
	4	1.6G
	5	end

In exynos_get_frequency_level(), freq 1.6G is translated to level 2,
because count is incremented only twice, for entry 0 and entry 2, right?

But then, in get_cpu_frequency(), level 2 is translated to 2 GHz, which
I do not think is what we want.
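
To make the mismatch concrete, here is a minimal user-space sketch of the
two translations. It is a simplified model, not the driver code: the table
is a plain array of frequencies in kHz, ENTRY_INVALID and TABLE_END are
local stand-ins for the kernel constants, level_to_freq_by_index() models
only the "level is a raw table index" comparison, and
level_to_freq_by_count() is a hypothetical helper showing what a consistent
inverse could look like.

```c
/* Local stand-ins for CPUFREQ_ENTRY_INVALID and CPUFREQ_TABLE_END. */
#define ENTRY_INVALID (~0u)
#define TABLE_END     (~1u)

/* The example table from this mail, frequencies in kHz. */
static const unsigned int table[] = {
	2400000, ENTRY_INVALID, 2000000, ENTRY_INVALID, 1600000, TABLE_END
};

/* freq -> level, counting valid entries only (as the patched
 * exynos_get_frequency_level() does). */
static int freq_to_level(unsigned int freq)
{
	int i;
	unsigned int count = 0;

	for (i = 0; table[i] != TABLE_END; i++) {
		if (table[i] == ENTRY_INVALID)
			continue;
		if (table[i] == freq)
			return count;
		count++;
	}
	return -1;
}

/* level -> freq, comparing level against the raw table index
 * (simplified model of the lookup in get_cpu_frequency()). */
static unsigned int level_to_freq_by_index(unsigned long level)
{
	int i;

	for (i = 0; table[i] != TABLE_END; i++) {
		if (table[i] == ENTRY_INVALID)
			continue;
		if ((unsigned long)i == level)
			return table[i];
	}
	return 0;
}

/* Hypothetical consistent inverse: level counts valid entries only. */
static unsigned int level_to_freq_by_count(unsigned long level)
{
	int i;
	unsigned long count = 0;

	for (i = 0; table[i] != TABLE_END; i++) {
		if (table[i] == ENTRY_INVALID)
			continue;
		if (count == level)
			return table[i];
		count++;
	}
	return 0;
}
```

With this table, freq_to_level(1600000) gives level 2, but
level_to_freq_by_index(2) gives 2000000, while the count-based lookup maps
level 2 back to 1600000. Whichever convention is chosen, both directions
need to use the same one.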

I think we are doing something wrong here. Here is a cleanup patch I made
to fix this issue; please review.