Re: [PATCH v2] serial: 8250_dw: Fix common clocks usage race condition

From: Andy Shevchenko
Date: Mon Mar 23 2020 - 07:52:47 EST


On Mon, Mar 23, 2020 at 02:11:49PM +0300, Sergey Semin wrote:
> On Mon, Mar 23, 2020 at 11:20:51AM +0200, Andy Shevchenko wrote:
> > On Mon, Mar 23, 2020 at 05:46:09AM +0300, Sergey.Semin@xxxxxxxxxxxxxxxxxxxx wrote:
> > > From: Serge Semin <Sergey.Semin@xxxxxxxxxxxxxxxxxxxx>
> >
> > The question to CLK framework maintainers, is it correct approach in general
> > for this case?
>
> You should have been more specific then, if you wanted to see someone
> special.

I didn't get your comment here. Since you put the question under a pile of
words (and actually in the changelog, not even in the commit message), I
repeated it clearly so that the clock maintainers can see it.

> > On Wed, Mar 18, 2020 at 05:19:53PM +0200, Andy Shevchenko wrote:
> >> Also it would be nice to see some clock framework guys' opinions...
>
> Who can give better comments regarding the clk API if not the
> subsystem maintainers?

You already got one from Maxime.

...

> > > +	/*
> > > +	 * Some platforms may provide a reference clock shared between several
> > > +	 * devices. In this case before using the serial port first we have to
> > > +	 * make sure nothing will change the rate behind our back and second
> > > +	 * the tty/serial subsystem knows the actual reference clock rate of
> > > +	 * the port.
> > > +	 */
> >
> > > +	if (clk_rate_exclusive_get(d->clk)) {
> > > +		dev_warn(p->dev, "Couldn't lock the clock rate\n");
> >
> > So, if this fails, in ->shutdown you will unbalance the reference count, or
> > did I miss something?
> >
>
> Hm, you are right. I didn't fully think this through. The thing is
> that according to the clk_rate_exclusive_get() function code currently
> it never fails. Though this is no excuse for introducing code that is
> prone to future bugs.
>
> Anyway, if by design a function may return an error, we must take that
> into account in the code using it. Given this obligation, and seeing that we
> can't easily detect in the shutdown method whether clk_rate_exclusive_get()
> has failed, the best approach would be to just return an error from the
> startup method if acquiring the clock rate exclusivity fails. If you are ok
> with this, I'll have it fixed in the v3 patchset.

It needs to be carefully tested on other platforms than yours.
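
To make the imbalance concrete, here is a minimal userspace sketch, not the
driver code. The mock_clk structure and all mock_*/startup_*/port_shutdown
helpers are hypothetical stand-ins invented for illustration. The assumptions
are that the exclusive get may fail one day and that ->shutdown does the put
unconditionally, as my comment above implies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct clk: it only tracks the exclusive count. */
struct mock_clk {
	int exclusive_count;	/* outstanding exclusive gets */
	int fail_get;		/* force mock_rate_exclusive_get() to fail */
};

/* Mock of the exclusive get: a NULL clock is treated as a no-op success. */
int mock_rate_exclusive_get(struct mock_clk *clk)
{
	if (!clk)
		return 0;
	if (clk->fail_get)
		return -EBUSY;
	clk->exclusive_count++;
	return 0;
}

/* Mock of the exclusive put: drops one reference, no questions asked. */
void mock_rate_exclusive_put(struct mock_clk *clk)
{
	if (clk)
		clk->exclusive_count--;
}

/* v2-like behaviour: the failure is only warned about, startup succeeds. */
int startup_warn_only(struct mock_clk *clk)
{
	(void)mock_rate_exclusive_get(clk);	/* error ignored */
	return 0;
}

/* Proposed v3-like behaviour: the error is propagated to the caller. */
int startup_fail_fast(struct mock_clk *clk)
{
	return mock_rate_exclusive_get(clk);
}

/* Shutdown puts unconditionally; it cannot tell whether the get succeeded. */
void port_shutdown(struct mock_clk *clk)
{
	mock_rate_exclusive_put(clk);
}
```

With fail_get set, startup_warn_only() followed by port_shutdown() leaves the
count at -1, while startup_fail_fast() returns the error, so the port is never
opened and no unpaired put happens.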

> > > +	} else if (d->clk) {
> >
> > > +		p->uartclk = clk_get_rate(d->clk);
> > > +		if (!p->uartclk) {
> > > +			clk_rate_exclusive_put(d->clk);
> > > +			dev_err(p->dev, "Clock rate not defined\n");
> > > +			return -EINVAL;
> > > +		}
> >
> > I didn't get these operations. If we have d->clk and suddenly get 0 as a
> > rate (and note, that we still update uartclk member!), we try to put (why?)
> > the exclusiveness of rate.
> >
>
> Here is what I had in my mind while implementing this code. If d->clk
> isn't NULL, then there is a "baudclk" clock handler and we can use it to
> alter/retrieve the baud clock rate. But the same clock could be used by
> some other driver and that driver could have changed the rate while we
> didn't have this tty port started up (opened). In this case that driver
> could also have the clock exclusively acquired. So instead of trying to
> set the current p->uartclk rate to the clock, check the return value,
> if it's an error, try to get the current clock rate, check the return
> value, and so on, I just get the current baud clock rate and make sure
> the value is not zero

> (clk_get_rate() returns a zero rate in case of
> internal errors).

Have you considered !CLK case?
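
A rough userspace sketch of what I am asking about, under stated assumptions.
The mock_* names and both structures are hypothetical. The mocks follow my
understanding that a NULL clock makes the exclusive get a successful no-op and
makes the rate query return 0. The d->clk check is then the only thing
that keeps an already configured uartclk from being overwritten with zero.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct clk and struct uart_port. */
struct mock_clk {
	unsigned long rate;
	int exclusive_count;
};

struct mock_port {
	unsigned int uartclk;
};

/* Assumed behaviour: NULL clock means nothing to lock, report success. */
int mock_rate_exclusive_get(struct mock_clk *clk)
{
	if (clk)
		clk->exclusive_count++;
	return 0;
}

void mock_rate_exclusive_put(struct mock_clk *clk)
{
	if (clk)
		clk->exclusive_count--;
}

/* Assumed behaviour: NULL clock (or a broken one) reports a zero rate. */
unsigned long mock_get_rate(struct mock_clk *clk)
{
	return clk ? clk->rate : 0;
}

/* Startup logic shaped like the hunk under review, with error propagation. */
int mock_startup(struct mock_port *p, struct mock_clk *clk)
{
	int ret = mock_rate_exclusive_get(clk);

	if (ret)
		return ret;

	if (clk) {
		/* Only trust the clock when we really have one. */
		p->uartclk = (unsigned int)mock_get_rate(clk);
		if (!p->uartclk) {
			/* Drop the exclusivity taken just above. */
			mock_rate_exclusive_put(clk);
			return -EINVAL;
		}
	}

	return 0;
}
```

Calling mock_startup() with a NULL clock leaves a preset uartclk untouched,
while a clock that reports a zero rate ends with -EINVAL and a balanced count.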

> At the same time dw8250_set_termios() will try to update
> the baud clock rate anyway (also by the serial core at the point of the port
> startup), so we don't need such complication in the DW 8250 port startup
> code.
>
> > (and note, that we still update uartclk member!),
>
> Yes, if we can't determine the current baud clock rate, then there is
> a problem with the clock device, so we don't know at what rate it's
> currently working. Zero is the most appropriate value to be set in this case.
>
> > we try to put (why?) the exclusiveness of rate.
>
> Yes, we put the exclusivity and return an error, because this if-branch has
> been taken only if the exclusivity has been successfully acquired.

So, this means that the above code requires elaboration in the comments to
explain how it is supposed to work.

--
With Best Regards,
Andy Shevchenko