Re: [PATCH] tty: serial: qcom-geni-serial: minor fixes to get_clk_div_rate()

From: Vijaya Krishna Nivarthi
Date: Thu Jun 09 2022 - 14:08:38 EST


Hi,

Re-sending as my earlier message bounced.


On 6/9/2022 4:07 AM, Doug Anderson wrote:
Hi,

On Wed, Jun 8, 2022 at 11:34 AM Vijaya Krishna Nivarthi
<quic_vnivarth@xxxxxxxxxxx> wrote:
Hi,


On 6/8/2022 12:55 AM, Doug Anderson wrote:
Hi,

On Tue, Jun 7, 2022 at 10:40 AM Vijaya Krishna Nivarthi
<quic_vnivarth@xxxxxxxxxxx> wrote:
Hi,

On 6/7/2022 1:29 AM, Doug Anderson wrote:

My only concern continues to be...

Given ser_clk is the final frequency that this function is going to
return and best_div is going to be the clk_divider, is it ok if the
divider cant divide the frequency exactly?

In other words, Can this function output combinations like (402,4)
(501,5) ?

If ok, then we can go ahead with this patch or even previous perhaps.
I don't see why not. You're basically just getting a resulting clock
that's not an integral "Hz", right?

So if "baud" is 9600 and sampling_rate is 16 then desired_clk is (9600
* 16) = 153600

Let's imagine that we do all the math and we finally decide that our
best bet is with the rate 922000 and a divider of 6. That means that
the actual clock we'll make is 153666.67 when we _wanted_ 153600.
There's no reason it needs to be integral, though, and 153666.67 would
still be better than making 160000.

Thank you for clarification.
power?)
Actually power saving was the anticipation behind returning first
frequency in original patch, when we cant find exact frequency.
Right, except that if you just pick the first clock you find it would
be _wildly_ off. I guess if you really want to do this the right way,
you need to set a maximum tolerance and pick the first rate you find
that meets that tolerance. Random web search for "uart baud rate
tolerance" makes me believe that +/- 5% deviation is OK, but to be
safe you probably want something lower. Maybe 2%? So if the desired
clock is within 2% of a clock you can make, can you just pick that
one?
Ok, 2% seems good.
Please note that we go past cases when we have an divider that can
exactly divide the frequency(105/1, 204/2, 303/3) and end up with one
that doesn't.
Ah, good point. Luckily that's a 1-line fix, right?
Apologies, I could not figure out how.
Ah, sorry. Not quite 1 line, but this (untested)


freq = clk_round_rate(clk, mult);

if (freq % desired_clk == 0) {
ser_clk = freq;
best_div = freq / desired_clk;
break;
}

candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk));
candidate_freq = freq / candidate_div;
diff = abs((long)desired_clk - candidate_freq);
if (diff < best_diff) {
best_diff = diff;
ser_clk = freq;
best_div = candidate_div;
}
But then once again, we would likely need 2 loops because while we are
ok with giving up on search for best_div on finding something within 2%
tolerance, we may not want to give up on exact match (freq % desired_clk
== 0 )
Ah, it took me a while to understand why two loops. It's because in
one case you're trying multiplies and in the other you're bumping up
to the next closest clock rate. I don't think you really need to do
that. Just test the (rate - 2%) and the rate. How about this (only
lightly tested):

ser_clk = 0;
maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT;
div = 1;
while (div < maxdiv) {
div <= maxdiv ?
mult = (unsigned long long)div * desired_clk;
if (mult != (unsigned long)mult)
break;

two_percent = mult / 50;

/*
* Loop requesting (freq - 2%) and possibly (freq).
*
* We'll keep track of the lowest freq inexact match we found
* but always try to find a perfect match. NOTE: this algorithm
* could miss a slightly better freq if there's more than one
* freq between (freq - 2%) and (freq) but (freq) can't be made
* exactly, but that's OK.
*
* This absolutely relies on the fact that the Qualcomm clock
* driver always rounds up.
*/
test_freq = mult - two_percent;
while (test_freq <= mult) {
freq = clk_round_rate(clk, test_freq);

/*
* A dead-on freq is an insta-win. This implicitly
* handles when "freq == mult"
*/
if (!(freq % desired_clk)) {
*clk_div = freq / desired_clk;
return freq;
}

/*
* Only time clock framework doesn't round up is if
* we're past the max clock rate. We're done searching
* if that's the case.
*/
if (freq < test_freq)
return ser_clk;

/* Save the first (lowest freq) within 2% */
if (!ser_clk && freq <= mult + two_percent) {
ser_clk = freq;
*clk_div = div;
}
My last concern is with search happening only within 2% tolerance.

Do we fail otherwise?

This real case has best tolerance of 1.9%.

[   17.963672] 20220530 desired_clk-51200000
[   21.193550] 20220530 returning ser_clk-52174000, div-1, diff-974000

Seems close.

Thank you.

/*
* If we already rounded up past mult then this will
* cause the loop to exit. If not then this will run
* the loop a second time with exactly mult.
*/
test_freq = max(freq + 1, mult);
}

/*
* test_freq will always be bigger than mult by at least 1.
* That means we can get the next divider with a DIV_ROUND_UP.
* This has the advantage of skipping by a whole bunch of divs
* If the clock framework already bypassed them.
*/
div = DIV_ROUND_UP(test_freq, desired_clk);
}

return ser_clk;