Re: [PATCH v11 06/12] pwm: imx27: Use 64-bit division macro and function

From: Guru Das Srinagesh
Date: Thu Apr 02 2020 - 16:55:25 EST


On Thu, Apr 02, 2020 at 01:16:54PM -0700, Guru Das Srinagesh wrote:
> On Tue, Mar 31, 2020 at 10:49:29PM +0200, Thierry Reding wrote:
> > On Tue, Mar 31, 2020 at 01:20:58PM -0700, Guru Das Srinagesh wrote:
> > > On Tue, Mar 31, 2020 at 05:24:52PM +0200, Arnd Bergmann wrote:
> > > > On Mon, Mar 30, 2020 at 10:44 PM Guru Das Srinagesh
> > > > <gurus@xxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Fri, Mar 20, 2020 at 06:09:39PM +0100, Arnd Bergmann wrote:
> > > > > > On Fri, Mar 20, 2020 at 2:42 AM Guru Das Srinagesh <gurus@xxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > > @@ -240,8 +240,7 @@ static int pwm_imx27_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> > > > > > >
> > > > > > > period_cycles /= prescale;
> > > > > > > c = (unsigned long long)period_cycles * state->duty_cycle;
> > > > > > > - do_div(c, state->period);
> > > > > > > - duty_cycles = c;
> > > > > > > + duty_cycles = div64_u64(c, state->period);
> > > > > > >
> > > > > >
> > > > > > This change looks fine, but I wonder if the code directly above it
> > > > > >
> > > > > > c = clk_get_rate(imx->clk_per);
> > > > > > c *= state->period;
> > > > > > do_div(c, 1000000000);
> > > > > > period_cycles = c;
> > > > > >
> > > > > > might run into an overflow when both the clock rate and the period
> > > > > > are large numbers.
> > > > >
> > > > > Hmm. Seems to me like addressing this would be outside the scope of this
> > > > > patch series.
> > > >
> > > > I think it should be part of the same series, addressing bugs that
> > > > were introduced by the change to 64-bit period. If it's not getting
> > > > fixed along with the other regressions, I fear nobody is going to go
> > > > back and fix it later.
> > >
> > > Makes sense, I agree. Would this be an acceptable fix?
> > >
> > > Instead of multiplying c and state->period first and then dividing by
> > > 10^9, first divide state->period by 10^9 and then multiply the quotient
> > > of that division with c and assign it to period_cycles. Like so:
> > >
> > > c = clk_get_rate(imx->clk_per);
> > > c *= div_u64(state->period, 1000000000);
> > > period_cycles = c;
> > >
> > > This should take care of overflow not happening because state->period is
> > > converted from nanoseconds to seconds early on and so becomes a small
> > > number.
> >
> > Doesn't that mean that anything below a 1 second period will be clamped
> > to just 0?
>
> True. How about this then?
>
> int pwm_imx27_calc_period_cycles(const struct pwm_state *state,
> 				 unsigned long clk_rate,
> 				 unsigned long *period_cycles)
> {
> 	u64 c, c1, c2;
>
> 	/* c1 holds the larger operand, c2 the smaller one */
> 	c1 = clk_rate;
> 	c2 = state->period;
> 	if (c2 > c1) {
> 		c2 = c1;
> 		c1 = state->period;
> 	}
>
> 	if (!c1 || !c2) {
> 		pr_err("clk rate and period should be nonzero\n");
> 		return -EINVAL;
> 	}
>
> 	/*
> 	 * Pre-divide c1 by just enough of the 10^9 factor that c1 * c2
> 	 * fits in 64 bits, then divide the product by the remainder.
> 	 */
> 	if (c2 <= div64_u64(U64_MAX, c1)) {
> 		c = c1 * c2;
> 		do_div(c, 1000000000);
> 	} else if (c2 <= div64_u64(U64_MAX, div64_u64(c1, 1000))) {
> 		do_div(c1, 1000);
> 		c = c1 * c2;
> 		do_div(c, 1000000);
> 	} else if (c2 <= div64_u64(U64_MAX, div64_u64(c1, 1000000))) {
> 		do_div(c1, 1000000);
> 		c = c1 * c2;
> 		do_div(c, 1000);
> 	} else if (c2 <= div64_u64(U64_MAX, div64_u64(c1, 1000000000))) {
> 		do_div(c1, 1000000000);
> 		c = c1 * c2;
> 	} else {
> 		/* the product overflows even after dividing c1 by 10^9 */
> 		return -EINVAL;
> 	}
>
> 	*period_cycles = c;
>
> 	return 0;
> }
>
> ...
>
> ret = pwm_imx27_calc_period_cycles(state, clk_get_rate(imx->clk_per),
> &period_cycles);
> if (ret)
> return ret;
>
> I unit tested this logic out by calculating period_cycles using both the
> existing logic and the proposed one, and the results are as below.
>
> --------------------------------------------------------------------------------
>  clk_rate     period                  existing      proposed
> --------------------------------------------------------------------------------
>  1000000000   18446744073709551615    18446744072   18446744073000000000
>               (U64_MAX)
> --------------------------------------------------------------------------------
>  1000000000   4294967291              4294967291    4294967291
> --------------------------------------------------------------------------------
>
> Overflow occurs in the first case with the existing logic, whereas the
> proposed logic handles it correctly.

Well, not "correctly" exactly: it is a best-effort attempt to handle the
overflow, with some loss of precision.
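
To make the precision loss concrete, here is a minimal userspace sketch
(not kernel code) that computes period_cycles three ways: with the
existing arithmetic, with the pre-division scheme quoted above, and
exactly via a 128-bit intermediate. The function names are hypothetical,
the error return for the "still overflows" case is my own assumption,
and unsigned __int128 is a GCC/Clang extension used here only as a
reference value.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Existing arithmetic: the 64-bit product may wrap before the division. */
static uint64_t period_cycles_existing(uint64_t clk_rate, uint64_t period_ns)
{
	return (clk_rate * period_ns) / NSEC_PER_SEC;
}

/*
 * Pre-division scheme: divide the larger operand by 1, 10^3, 10^6 or 10^9
 * until the product fits in 64 bits, then divide by what is left of 10^9.
 * Returns 0 on success, -1 if an operand is zero or the product still
 * overflows (assumed error handling, for illustration only).
 */
static int period_cycles_proposed(uint64_t clk_rate, uint64_t period_ns,
				  uint64_t *out)
{
	static const uint64_t pre[] = { 1, 1000, 1000000, 1000000000 };
	uint64_t big = clk_rate > period_ns ? clk_rate : period_ns;
	uint64_t small = clk_rate > period_ns ? period_ns : clk_rate;
	unsigned int i;

	if (!small)
		return -1;

	for (i = 0; i < sizeof(pre) / sizeof(pre[0]); i++) {
		/*
		 * scaled is nonzero: i > 0 is only reached when big * small
		 * overflows 64 bits, which implies big > 2^32 > 10^9.
		 */
		uint64_t scaled = big / pre[i];

		if (small <= UINT64_MAX / scaled) {
			*out = scaled * small / (NSEC_PER_SEC / pre[i]);
			return 0;
		}
	}

	return -1;
}

/* Reference value using a 128-bit intermediate product. */
static unsigned __int128 period_cycles_exact(uint64_t clk_rate,
					     uint64_t period_ns)
{
	return (unsigned __int128)clk_rate * period_ns / NSEC_PER_SEC;
}
```

For clk_rate = 1000000000 and period = U64_MAX, the exact result is
U64_MAX itself, so the pre-division result of 18446744073000000000 is
short by 709551615 cycles, while the existing arithmetic wraps and
returns 18446744072.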

Thank you.

Guru Das.