Re: [PATCH v4 2/4] PM / devfreq: Fix kernel warning with cpufreq passive register fail

From: Christian Marangi
Date: Sun Jun 19 2022 - 18:19:23 EST


On Sat, Jun 18, 2022 at 04:08:35AM +0900, Chanwoo Choi wrote:
> On 22. 6. 15. 18:20, Ansuel Smith wrote:
> > On Wed, Jun 15, 2022 at 04:11:13PM +0900, Chanwoo Choi wrote:
> >> On 22. 6. 15. 08:09, Christian 'Ansuel' Marangi wrote:
> >>> When the cpufreq passive register path from the passive governor fails,
> >>> the cpufreq_passive_unregister is called and a kernel WARNING is always
> >>> reported.
>
> >>> This is caused by the fact that the devfreq driver already call the
> >>> governor unregister with the GOV_STOP, for this reason the second
> >>> cpufreq_passive_unregister always return error and a WARN is printed
> >>> from the WARN_ON function.
>
> >>> Remove the unregister call from the error handling of the cpufreq register
> >>> notifier as it's fundamentally wrong and already handled by the devfreq
> >>> core code.
>
> If possible, could you make the patch description more simply?
>
> >>>
> >>> Fixes: a03dacb0316f ("PM / devfreq: Add cpu based scaling support to passive governor")
> >>> Signed-off-by: Christian 'Ansuel' Marangi <ansuelsmth@xxxxxxxxx>
> >>> ---
> >>> drivers/devfreq/governor_passive.c | 1 -
> >>> 1 file changed, 1 deletion(-)
> >>>
> >>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> >>> index 95de336f20d5..dcc9dd518197 100644
> >>> --- a/drivers/devfreq/governor_passive.c
> >>> +++ b/drivers/devfreq/governor_passive.c
> >>> @@ -331,7 +331,6 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
> >>> err_put_policy:
> >>> cpufreq_cpu_put(policy);
> >>> err:
> >>> - WARN_ON(cpufreq_passive_unregister_notifier(devfreq));
> >>>
> >>> return ret;
> >>> }
> >>
> >> I think that it is necessary to free the resource when error happen.
> >
> > Thing is that it should not be done in the register. Following the flow
> > of the devfreq core code, if a gov fails to START, the gov STOP is
> > called and we correctly free our resources. In the current
> > implementation we call the free 2 times and the second time will always
> > print error as the notifier is already unregistered.
> >
> >> Also, after merging the your patch1, I think that cpufreq_passive_unregister_notifier(devfreq)
> >> will not return error. Instead, just 0 for success.
> >
> > With path1 we removed the error with the parent_cpu_data deletion but
> > the unregister error is still there.
> >
> >>
> >> Instead, 'err_free_cpu_data' and 'err_put_policy' goto statement are wrong exception
> >> handling. If fix the exception handling code in cpufreq_passive_register_notifier
> >> as following and with your patch1, I'll handle the resource for free/un-registration
> >> when error happen during cpufreq_passive_register_notifier.
> >>
> >
> > Don't know the main problem here is calling unregister 2 times.
>
> Ah. I understood. To fix the error path handling with unregister function
> is called twice, I think that need to to fix it as following:
>
>
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index a35b39ac656c..8f38a63beefc 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -289,22 +289,25 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
> parent_cpu_data = kzalloc(sizeof(*parent_cpu_data),
> GFP_KERNEL);
> if (!parent_cpu_data) {
> + cpufreq_cpu_put(policy);
> ret = -ENOMEM;
> - goto err_put_policy;
> + goto err;
> }
>
> cpu_dev = get_cpu_device(cpu);
> if (!cpu_dev) {
> dev_err(dev, "failed to get cpu device\n");
> + cpufreq_cpu_put(policy);
> ret = -ENODEV;
> - goto err_free_cpu_data;
> + goto err;
> }
>
> opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> if (IS_ERR(opp_table)) {
> dev_err(dev, "failed to get opp_table of cpu%d\n", cpu);
> + cpufreq_cpu_put(policy);
> ret = PTR_ERR(opp_table);
> - goto err_free_cpu_data;
> + goto err;
> }
>
> parent_cpu_data->dev = cpu_dev;
> @@ -324,15 +327,7 @@ static int cpufreq_passive_register_notifier(struct devfreq *devfreq)
> if (ret)
> dev_err(dev, "failed to update the frequency\n");
>
> - return ret;
> -
> -err_free_cpu_data:
> - kfree(parent_cpu_data);

Wait! This would leak the just allocated parent_cpu_data in case of
error since we return before it's added to the cpu_data_list and it
won't be freed on the unregister function.

I'm sending this patch (since the other got merged) with the description
reworked. Think there isn't another way to have the code linear and
remove the ""double return"".

> -err_put_policy:
> - cpufreq_cpu_put(policy);
> err:
> - WARN_ON(cpufreq_passive_unregister_notifier(devfreq));
> -
> return ret;
> }
>
>
> --
> Best Regards,
> Samsung Electronics
> Chanwoo Choi

--
Ansuel