Re: [PATCH v7 3/5] clk: Supply the critical clock {init, enable, disable} framework

From: Lee Jones
Date: Thu Jul 30 2015 - 05:50:32 EST


On Wed, 29 Jul 2015, Michael Turquette wrote:
> Quoting Lee Jones (2015-07-28 06:00:55)
> > On Tue, 28 Jul 2015, Maxime Ripard wrote:
> >
> > > On Mon, Jul 27, 2015 at 09:53:38AM +0100, Lee Jones wrote:
> > > > On Mon, 27 Jul 2015, Maxime Ripard wrote:
> > > >
> > > > > On Wed, Jul 22, 2015 at 02:04:13PM +0100, Lee Jones wrote:
> > > > > > > These new API calls will firstly provide a mechanism to tag a clock as
> > > > > > critical and secondly allow any knowledgeable driver to (un)gate clocks,
> > > > > > even if they are marked as critical.
> > > > > >
> > > > > > Suggested-by: Maxime Ripard <maxime.ripard@xxxxxxxxxxxxxxxxxx>
> > > > > > Signed-off-by: Lee Jones <lee.jones@xxxxxxxxxx>
> > > > > > ---
> > > > > > drivers/clk/clk.c | 45 ++++++++++++++++++++++++++++++++++++++++++++
> > > > > > include/linux/clk-provider.h | 2 ++
> > > > > > include/linux/clk.h | 30 +++++++++++++++++++++++++++++
> > > > > > 3 files changed, 77 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > > > > > index 61c3fc5..486b1da 100644
> > > > > > --- a/drivers/clk/clk.c
> > > > > > +++ b/drivers/clk/clk.c
> > > > > > @@ -46,6 +46,21 @@ static struct clk_core *clk_core_lookup(const char *name);
> > > > > >
> > > > > > /*** private data structures ***/
> > > > > >
> > > > > > +/**
> > > > > > + * struct critical - Provides 'play' over critical clocks. A clock can be
> > > > > > + * marked as critical, meaning that it should not be
> > > > > > + * disabled. However, if a driver which is aware of the
> > > > > > + * critical behaviour wants to control it, it can do so
> > > > > > + * using clk_enable_critical() and clk_disable_critical().
> > > > > > + *
> > > > > > + * @enabled Is clock critical? Once set, doesn't change
> > > > > > + * @leave_on Self explanatory. Can be disabled by knowledgeable drivers
> > > > > > + */
> > > > > > +struct critical {
> > > > > > + bool enabled;
> > > > > > + bool leave_on;
> > > > > > +};
> > > > > > +
> > > > > > struct clk_core {
> > > > > > const char *name;
> > > > > > const struct clk_ops *ops;
> > > > > > @@ -75,6 +90,7 @@ struct clk_core {
> > > > > > struct dentry *dentry;
> > > > > > #endif
> > > > > > struct kref ref;
> > > > > > + struct critical critical;
> > > > > > };
> > > > > >
> > > > > > struct clk {
> > > > > > @@ -995,6 +1011,10 @@ static void clk_core_disable(struct clk_core *clk)
> > > > > > if (WARN_ON(clk->enable_count == 0))
> > > > > > return;
> > > > > >
> > > > > > + /* Refuse to turn off a critical clock */
> > > > > > + if (clk->enable_count == 1 && clk->critical.leave_on)
> > > > > > + return;
> > > > > > +
> > > > >
> > > > > I think it should be handled by a separate counting. Otherwise, if you
> > > > > have two users that marked the clock as critical, and then one of them
> > > > > disable it...
> > > > >
> > > > > > if (--clk->enable_count > 0)
> > > > > > return;
> > > > > >
> > > > > > @@ -1037,6 +1057,13 @@ void clk_disable(struct clk *clk)
> > > > > > }
> > > > > > EXPORT_SYMBOL_GPL(clk_disable);
> > > > > >
> > > > > > +void clk_disable_critical(struct clk *clk)
> > > > > > +{
> > > > > > + clk->core->critical.leave_on = false;
> > > > >
> > > > > .. you just lost the fact that it was critical in the first place.
> > > >
> > > > I thought about both of these points, which is why I came up with this
> > > > strategy.
> > > >
> > > > Any device which uses the *_critical() API should a) have knowledge of
> > > > what happens when a particular critical clock is gated and b) have
> > > > thought about the consequences.
> > >
> > > Indeed.
> > >
> > > > I don't think we can use reference counting, because we'd need as
> > > > many critical clock owners as there are critical clocks.
> > >
> > > Which we can have if we replace the call to clk_prepare_enable you add
> > > in your fourth patch in __set_critical_clocks.
> >
> > What should it be replaced with?
> >
> > > > Cast your mind back to the reasons for this critical clock API. One
> > > > of the most important intentions of this API is the requirement
> > > > mitigation for each of the critical clocks to have an owner
> > > > (driver).
> > > >
> > > > With regards to your second point, that's what 'critical.enabled'
> > > > is for. Take a look at clk_enable_critical().
> > >
> > > I don't think this addresses the issue, if you just throw more
> > > customers at it, the issue remains with your implementation.
> > >
> > > If you have three customers that used the critical API, and if one of
> > > these calls clk_disable_critical, you're losing leave_on.
> >
> > That's the idea. See my point above, the one you replied "Indeed"
> > to. So when a driver uses clk_disable_critical() it's saying, "I know
> > why this clock is a critical clock, and I know that nothing terrible
> > will happen if I disable it, as I have that covered". So then if it's
> > not the last user to call clk_disable(), the last one out the door
> > will be allowed to finally gate the clock, regardless whether it's
> > critical aware or not.
> >
> > Then, when we come to enable the clock again, the critical aware user
> > then re-marks the clock as leave_on, so no critical-unaware user can
> > take the final reference and disable the clock.
> >
> > > Which means that if there's one of the two users left that calls
> > > clk_disable on it, the clock will actually be disabled, which is
> > > clearly not what we want to do, as we have still a user that want the
> > > clock to be enabled.
> >
> > That's not what happens (at least it shouldn't if I've coded it up
> > right). The API _still_ requires all of the users to give-up their
> > reference.
> >
> > > It would be much more robust to have another count for the critical
> > > stuff, initialised to one by the __set_critical_clocks function.
> >
> > If I understand you correctly, we already have a count. We use the
> > original reference count. No need for one of our own.
> >
> > Using your RAM Clock (Clock 4) as an example
> > --------------------------------------------
> >
> > Early start-up:
> > Clock 4 is marked as critical and a reference is taken (ref == 1)
> >
> > Driver probe:
> > SPI enables Clock 4 (ref == 2)
> > I2C enables Clock 4 (ref == 3)
> >
> > Suspend (without RAM driver's permission):
> > SPI disables Clock 4 (ref == 2)
> > I2C disables Clock 4 (ref == 1)
> > /*
> > * Clock won't be gated because:
> > * .leave_on is True - can't dec final reference
>
> I am clearly missing the point. The clock won't be gated because the
> enable_count is still 1! What does .leave_on do here?

The point of _this_ (the extended) part of the API is so that the
clock _can_ be turned off. Without the possibility to disable
.leave_on and the logic which accompanies it (i.e.
clk_disable_critical()) the clock will _never_ be gated.

> > */
> >
> > Suspend (with RAM driver's permission):
> > /* Order is unimportant */
> > SPI disables Clock 4 (ref == 2)
> > RAM disables Clock 4 (ref == 1) /* Won't turn off here (ref > 0) */
> > I2C disables Clock 4 (ref == 0) /* (.leave_on == False) last ref can be taken */
> > /*
> > * Clock will be gated because:
> > * .leave_on is False, so (ref == 0)
>
> Again, .leave_on does nothing new here. We gate the clock because the
> reference count is 0.

It's the fact that .leave_on has been disabled in
clk_disable_critical() that allows the final reference to be taken.

> > */
> >
> > Resume:
> > /* Order is unimportant */
> > SPI enables Clock 4 (ref == 1)
> > RAM enables Clock 4 and re-enables .leave_on (ref == 2)
> > I2C enables Clock 4 (ref == 3)
>
> Same again. As soon as RAM calls clk_enable_critical the ref count goes
> up. .leave_on does nothing as far as I can tell. This all works because
> of the reference counting, which already exists before this patch
> series.

So fundamentally you're right in what you say. All you really need to
disable a critical clock is write a knowledgeable driver, which is
intentionally unbalanced i.e. just calls clk_disable(). All this
extended API really does is make the process more official and
ensures that an unintentionally unbalanced driver doesn't bugger up
the running platform. We could also add a new WARN() to say that said
driver is unbalanced, as it just tried to turn off a critical clock.
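
To make the distinction concrete, here is a minimal user-space sketch of
the behaviour described above. It is a model only, not the kernel code:
struct model_clk and every model_*() helper are hypothetical names, and
the body of model_clk_enable_critical() is an assumption based on the
description in this thread (take a reference, then re-mark .leave_on),
since that function is not quoted here. The disable path mirrors the
check added to clk_core_disable() in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; NOT the kernel's struct clk_core. */
struct model_clk {
	unsigned int enable_count;
	bool critical_enabled;	/* models critical.enabled */
	bool leave_on;		/* models critical.leave_on */
	bool gated;		/* true once the clock is actually off */
};

static void model_clk_enable(struct model_clk *clk)
{
	if (clk->enable_count++ == 0)
		clk->gated = false;
}

static void model_clk_disable(struct model_clk *clk)
{
	if (clk->enable_count == 0)
		return;		/* the kernel would WARN_ON() here */

	/* Refuse to drop the final reference of a critical clock */
	if (clk->enable_count == 1 && clk->leave_on)
		return;

	if (--clk->enable_count > 0)
		return;

	clk->gated = true;
}

/* Early start-up: mark as critical and take the first reference. */
static void model_clk_init_critical(struct model_clk *clk)
{
	clk->critical_enabled = true;
	clk->leave_on = true;
	model_clk_enable(clk);
}

/* Knowledgeable driver: allow the final reference to be dropped. */
static void model_clk_disable_critical(struct model_clk *clk)
{
	clk->leave_on = false;
	model_clk_disable(clk);
}

/* Assumed behaviour: take a reference and re-mark the clock. */
static void model_clk_enable_critical(struct model_clk *clk)
{
	model_clk_enable(clk);
	if (clk->critical_enabled)
		clk->leave_on = true;
}

/* Walks through the Clock 4 scenarios from earlier in the thread. */
static void model_demo(void)
{
	struct model_clk clk = { .gated = true };

	model_clk_init_critical(&clk);		/* ref == 1 */
	model_clk_enable(&clk);			/* SPI, ref == 2 */
	model_clk_enable(&clk);			/* I2C, ref == 3 */

	/* Suspend without the RAM driver's permission */
	model_clk_disable(&clk);		/* SPI, ref == 2 */
	model_clk_disable(&clk);		/* I2C, ref == 1 */
	model_clk_disable(&clk);		/* stray unbalanced call: refused */
	assert(clk.enable_count == 1 && !clk.gated);

	model_clk_enable(&clk);			/* SPI, ref == 2 */
	model_clk_enable(&clk);			/* I2C, ref == 3 */

	/* Suspend with the RAM driver's permission */
	model_clk_disable(&clk);		/* SPI, ref == 2 */
	model_clk_disable_critical(&clk);	/* RAM, ref == 1 */
	model_clk_disable(&clk);		/* I2C, ref == 0, gated */
	assert(clk.enable_count == 0 && clk.gated);

	/* Resume */
	model_clk_enable(&clk);			/* SPI, ref == 1 */
	model_clk_enable_critical(&clk);	/* RAM, ref == 2 */
	model_clk_enable(&clk);			/* I2C, ref == 3 */
	assert(clk.enable_count == 3 && clk.leave_on && !clk.gated);
}
```

In this model the only place .leave_on changes the outcome is the stray
third model_clk_disable() call in the first suspend. With balanced
callers the existing reference count already gives the same result.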

What do you think is best?

--
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org | Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog