Re: [PATCH] opp: Reinitialize the list_kref before adding the static OPPs again

From: Stephen Boyd
Date: Mon Oct 28 2019 - 08:01:36 EST


Quoting Viresh Kumar (2019-10-20 19:25:16)
> On 18-10-19, 14:12, Stephen Boyd wrote:
> > Quoting Viresh Kumar (2019-10-18 02:28:41)
> > > The list_kref reaches a count of 0 when all the static OPPs are removed,
> > > for example when dev_pm_opp_of_cpumask_remove_table() is called, though
> > > the actual OPP table may not get freed as it may still be referenced by
> > > other parts of the kernel, like from a call to
> > > dev_pm_opp_set_supported_hw(). And if we call
> > > dev_pm_opp_of_cpumask_add_table() again at this point, we must
> > > reinitialize the list_kref otherwise the kernel will hit a WARN() in
> > > kref infrastructure for incrementing a kref with value 0.
> > >
> > > Fixes: 11e1a1648298 ("opp: Don't decrement uninitialized list_kref")
> > > Reported-by: Dmitry Osipenko <digetx@xxxxxxxxx>
> > > Signed-off-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
> > > ---
> > > drivers/opp/of.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> > > index 6dc41faf74b5..1cbb58240b80 100644
> > > --- a/drivers/opp/of.c
> > > +++ b/drivers/opp/of.c
> > > @@ -663,6 +663,13 @@ static int _of_add_opp_table_v2(struct device *dev, struct opp_table *opp_table)
> > >  		return 0;
> > >  	}
> > >
> > > +	/*
> > > +	 * Re-initialize list_kref every time we add static OPPs to the OPP
> > > +	 * table as the reference count may be 0 after the last tie static OPPs
> >
> > s/tie/time/
> >
> > > +	 * were removed.
> > > +	 */
> > > +	kref_init(&opp_table->list_kref);
> >
> > It seems racy.
>
> I am not sure if I see a race here, but maybe I am missing something.
> Care to explain ?

What happens if some static OPP is removed at the same time that this
function is called?

>
> > Why are we doing this vs. making an entirely new and
> > different OPP structure? Or why is the count reaching 0 when something
> > is obviously still referencing it?
>
> The kref for the opp table is opp_table->kref and the one here is
> different. This is list_kref which is used for freeing OPPs added
> statically from the DT. The static OPPs get added to the OPP table
> when one calls dev_pm_opp_of_cpumask_add_table() and must be removed
> on a call to dev_pm_opp_of_cpumask_remove_table(). The opp table
> structure may not get freed at this moment though as it is still
> referenced by the caller of dev_pm_opp_set_supported_hw().
>
> And now when we try to add the static OPPs again (re-insertion of
> cpufreq module), we need to reinitialize the list_kref again as its
> count reached 0 earlier and the resources (static OPPs) were freed.
>

Right. I don't understand why the count reaches 0 if we can still get a
pointer to something. I guess we've got this kref thing that has a
lifetime beyond the life of what it's tracking, which is weird. Usually
the kref is embedded inside the pointer that is returned by the "get"
call, but here it's outside it and used to track when we should free
static OPPs. Why are we removing static OPPs? Shouldn't they just stick
around forever until the device is deleted vs. populated over and over
again?
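
To make the lifetime question concrete, here is a rough userspace sketch
of a side refcount that outlives the entries it tracks. It is only a toy
model, not the OPP core: every toy_* name, the struct layout, and the
'reinit' flag are made up for illustration, and toy_kref_get() simply
refuses a get on 0 where the kernel would WARN().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for struct kref (hypothetical, userspace only). */
struct toy_kref {
	unsigned int count;
};

static void toy_kref_init(struct toy_kref *k)
{
	k->count = 1;
}

/* Refuse to resurrect a dead count; the kernel WARN()s in this situation. */
static bool toy_kref_get(struct toy_kref *k)
{
	if (k->count == 0) {
		fprintf(stderr, "toy WARN: get on a refcount of 0\n");
		return false;
	}
	k->count++;
	return true;
}

/* Drop one reference; run release() when the count reaches 0. */
static void toy_kref_put(struct toy_kref *k, void (*release)(struct toy_kref *))
{
	assert(k->count > 0);
	if (--k->count == 0)
		release(k);
}

/* Hypothetical table: the side count tracks only the static entries. */
struct toy_table {
	struct toy_kref list_ref;
	int nr_static;
};

static void toy_release_static(struct toy_kref *k)
{
	struct toy_table *t = (struct toy_table *)
		((char *)k - offsetof(struct toy_table, list_ref));

	t->nr_static = 0;	/* "free" the static entries, keep the table */
}

/* Add static entries; 'reinit' selects the re-initializing behaviour. */
static bool toy_add_static(struct toy_table *t, int n, bool reinit)
{
	if (t->nr_static > 0)
		return toy_kref_get(&t->list_ref);	/* already populated */

	if (reinit)
		toy_kref_init(&t->list_ref);
	else if (!toy_kref_get(&t->list_ref))
		return false;	/* count is 0: nothing to take a reference on */

	t->nr_static = n;
	return true;
}

static void toy_remove_static(struct toy_table *t)
{
	toy_kref_put(&t->list_ref, toy_release_static);
}
```

In this model the table stays reachable after toy_remove_static() drops
the count to 0, so a later toy_add_static() without 'reinit' fails on
the get, while the 'reinit' path starts the count over at 1. That is the
part that looks odd to me: the count is not the thing keeping the
pointer alive.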