Re: [PATCH] opp: Reinitialize the list_kref before adding the static OPPs again

From: Viresh Kumar
Date: Sun Oct 20 2019 - 22:25:22 EST


On 18-10-19, 14:12, Stephen Boyd wrote:
> Quoting Viresh Kumar (2019-10-18 02:28:41)
> > The list_kref reaches a count of 0 when all the static OPPs are removed,
> > for example when dev_pm_opp_of_cpumask_remove_table() is called, though
> > the actual OPP table may not get freed as it may still be referenced by
> > other parts of the kernel, like from a call to
> > dev_pm_opp_set_supported_hw(). And if we call
> > dev_pm_opp_of_cpumask_add_table() again at this point, we must
> > reinitialize the list_kref otherwise the kernel will hit a WARN() in
> > kref infrastructure for incrementing a kref with value 0.
> >
> > Fixes: 11e1a1648298 ("opp: Don't decrement uninitialized list_kref")
> > Reported-by: Dmitry Osipenko <digetx@xxxxxxxxx>
> > Signed-off-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
> > ---
> > drivers/opp/of.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> > index 6dc41faf74b5..1cbb58240b80 100644
> > --- a/drivers/opp/of.c
> > +++ b/drivers/opp/of.c
> > @@ -663,6 +663,13 @@ static int _of_add_opp_table_v2(struct device *dev, struct opp_table *opp_table)
> >  		return 0;
> >  	}
> >
> > +	/*
> > +	 * Re-initialize list_kref every time we add static OPPs to the OPP
> > +	 * table as the reference count may be 0 after the last tie static OPPs
>
> s/tie/time/
>
> > +	 * were removed.
> > +	 */
> > +	kref_init(&opp_table->list_kref);
>
> It seems racy.

I am not sure I see a race here, but maybe I am missing something.
Care to explain?

> Why are we doing this vs. making an entirely new and
> different OPP structure? Or why is the count reaching 0 when something
> is obviously still referencing it?

The kref that controls the lifetime of the OPP table is
opp_table->kref; the one here is different. This is list_kref, which
is used for freeing the OPPs added statically from DT. The static OPPs
get added to the OPP table when one calls
dev_pm_opp_of_cpumask_add_table() and must be removed on a call to
dev_pm_opp_of_cpumask_remove_table(). The OPP table structure may not
get freed at that point though, as it is still referenced by the
caller of dev_pm_opp_set_supported_hw().

And now when we try to add the static OPPs again (re-insertion of the
cpufreq module), we need to reinitialize the list_kref, as its count
reached 0 earlier and the resources (static OPPs) were freed.
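
To illustrate the sequence, here is a rough userspace sketch (not kernel
code). All toy_* names are made up for this mail and only mimic the kref
behavior described above: a get on a count of 0 is reported as a failure,
where the kernel would hit the WARN(). The "reinit" flag is a
hypothetical knob that stands in for the kref_init() call added by the
patch.

```c
/* Userspace model only; toy_* types are assumptions, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct toy_kref {
	unsigned int refcount;
};

struct toy_opp_table {
	struct toy_kref kref;      /* lifetime of the table itself */
	struct toy_kref list_kref; /* lifetime of the static OPPs only */
	int nr_static_opps;
};

static void toy_kref_init(struct toy_kref *k)
{
	k->refcount = 1;
}

/* Returns false on an increment from 0 (the kernel would WARN here). */
static bool toy_kref_get(struct toy_kref *k)
{
	if (k->refcount == 0) {
		fprintf(stderr, "WARN: incrementing a kref with value 0\n");
		return false;
	}
	k->refcount++;
	return true;
}

/* Drops one reference; runs release() when the count reaches 0. */
static bool toy_kref_put(struct toy_kref *k,
			 void (*release)(struct toy_kref *))
{
	assert(k->refcount > 0);
	if (--k->refcount == 0) {
		release(k);
		return true;
	}
	return false;
}

/* Release callback for list_kref: frees the static OPPs, not the table. */
static void toy_free_static_opps(struct toy_kref *k)
{
	struct toy_opp_table *t = (struct toy_opp_table *)
		((char *)k - offsetof(struct toy_opp_table, list_kref));

	t->nr_static_opps = 0;
}

/* Models adding static OPPs; reinit=true mirrors the proposed fix. */
bool toy_add_static_opps(struct toy_opp_table *t, int n, bool reinit)
{
	if (reinit)
		toy_kref_init(&t->list_kref);
	else if (!toy_kref_get(&t->list_kref))
		return false;

	t->nr_static_opps = n;
	return true;
}

/* Models removing the static OPPs; the table kref is left untouched. */
void toy_remove_static_opps(struct toy_opp_table *t)
{
	toy_kref_put(&t->list_kref, toy_free_static_opps);
}
```

With the table kref held by one other user (think of the caller of
dev_pm_opp_set_supported_hw()), an add followed by a remove leaves
list_kref at 0 while the table itself stays alive. A second add with
reinit=false then fails in toy_kref_get(), and with reinit=true it
succeeds with list_kref back at 1.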

--
viresh