Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq

From: Rafael J. Wysocki
Date: Thu Oct 17 2019 - 12:34:44 EST


On Thu, Oct 17, 2019 at 12:00 PM Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
>
> On Thu, Oct 17, 2019 at 03:27:25PM +0530, Viresh Kumar wrote:
> > On 16-10-19, 15:23, Sudeep Holla wrote:
> > > Thanks for spinning these patches so quickly.
> > >
> > > I did give it a spin, but unfortunately it doesn't fix the bug I reported.
> > > So I looked at my bug report in detail and looks like the cpufreq_driver
> > > variable is set to NULL at that point and it fails to dereference it
> > > while trying to execute:
> > > ret = cpufreq_driver->verify(new_policy);
> > > (Hint verify is at offset 0x1c/28)
> > >
> > > So I suspect some race as this platform with bL switcher tries to
> > > unregister and re-register the cpufreq driver during the boot.
> > >
> > > I need to spend more time on this as reverting the initial PM QoS patch
> > > to cpufreq.c makes the issue disappear.

I guess you mean commit 67d874c3b2c6 ("cpufreq: Register notifiers
with the PM QoS framework")?

That would make sense, because it added cpufreq_notifier_min() and
cpufreq_notifier_max(), which trigger handle_update() via
schedule_work().

[BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
that the new min does not exceed the new max, because the QoS code
doesn't enforce that.]
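
A minimal user-space sketch of the kind of check meant above, not the
actual cpufreq_set_policy() code. The struct freq_limits type and the
sanitize_limits() helper are hypothetical names, and letting the max
limit win when the two conflict is an assumption made for illustration:

```c
/* Hypothetical stand-in for a policy's min/max frequency pair (kHz). */
struct freq_limits {
	unsigned int min;
	unsigned int max;
};

/*
 * Make sure min <= max after the two values have been aggregated
 * independently of each other.
 *
 * Assumption: on conflict the max limit wins and min is lowered to it.
 * Returns 1 if the limits had to be adjusted, 0 otherwise.
 */
static int sanitize_limits(struct freq_limits *l)
{
	if (l->min <= l->max)
		return 0;

	l->min = l->max;
	return 1;
}
```

With min = 1200000 and max = 800000 this would leave both at 800000;
limits that are already consistent are not touched.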

> > Is this easily reproducible? cpufreq_driver == NULL shouldn't be the case, it
> > gets updated only once while registering/unregistering cpufreq drivers. That is
> > the last thing which can go wrong from my point of view :)
> >
>
> Yes, if I boot my TC2 with bL switcher enabled, it always crashes on boot.

It does look like handle_update() races with
cpufreq_unregister_driver(), and cpufreq_remove_dev() (called from
there indirectly) looks racy as well.
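
To illustrate the ordering I suspect, here is a single-threaded
user-space model, not kernel code. Every name in it (current_driver,
work_pending, run_pending_work() and so on) is hypothetical: the
pointer stands in for cpufreq_driver and the flag for an update work
item that has been queued but has not run yet. The cancel_first option
is just one possible way to close such a window, shown as an
assumption and not as the fix:

```c
#include <stddef.h>

struct driver {
	int (*verify)(int freq);
};

static int verify_ok(int freq)
{
	return freq > 0 ? 0 : -1;
}

static struct driver demo_driver = { .verify = verify_ok };
static struct driver *current_driver;	/* models cpufreq_driver */
static int work_pending;		/* models a queued update work item */

static void register_driver(void)
{
	current_driver = &demo_driver;
}

/* Models a notifier queueing the update work. */
static void schedule_update(void)
{
	work_pending = 1;
}

/*
 * Models driver unregistration. With cancel_first set, the pending
 * work is dropped before the driver pointer is cleared.
 */
static void unregister_driver(int cancel_first)
{
	if (cancel_first)
		work_pending = 0;

	current_driver = NULL;
}

/*
 * Runs the pending work, if any. Returns the verify() result, 0 if
 * nothing was pending, or -1 where unguarded code would have
 * dereferenced a NULL driver pointer.
 */
static int run_pending_work(void)
{
	if (!work_pending)
		return 0;

	work_pending = 0;

	if (!current_driver)
		return -1;	/* NULL dereference in the unguarded case */

	return current_driver->verify(1000);
}
```

Calling register_driver(), schedule_update() and unregister_driver(0)
in that order makes run_pending_work() hit the NULL case, while the
same sequence with unregister_driver(1) leaves nothing to run.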