Re: [PATCH 0/2] Add test to validate udelay

From: Doug Anderson
Date: Wed May 07 2014 - 14:32:51 EST


John,

On Wed, May 7, 2014 at 11:10 AM, John Stultz <john.stultz@xxxxxxxxxx> wrote:
> On 05/06/2014 09:19 PM, Doug Anderson wrote:
>> John,
>>
>> On Tue, May 6, 2014 at 5:25 PM, John Stultz <john.stultz@xxxxxxxxxx> wrote:
>>> Really, I'm curious about the backstory that made you generate the test?
>>> I assume something bit you where udelay was way off? Or were you using
>>> udelay for some sort of accuracy sensitive use?
>> Several times we've seen cases where udelay() was pretty broken with
>> cpufreq if you were actually implementing udelay() with
>> loops_per_jiffy. I believe it may also be broken upstream on
>> multicore systems, though now that ARM arch timers are there maybe we
>> don't care as much?
>>
>> Specifically, there is a lot of confusion between the global loops per
>> jiffy and the per CPU one. On ARM I think we always use the global
>> one and we attempt to scale it as cpufreq changes. ...but...
>>
>> * cores tend scale together and there's a single global. That means
>> you might have started the delay loop at one freq and ended it at
>> another (if another CPU changes the freq).
>
> Good point. The loops based delay would clearly be broken w/ ASMP unless
> we use per-cpu values that are scaled(and as you point out, we don't
> scale the value mid-delay). Time based counters for udelay() - like the
> arch timer you mention - are a much better way to work around this.

Locally we have this:
* https://chromium-review.googlesource.com/189885
ARM: Don't ever downscale loops_per_jiffy in SMP systems

I didn't think upstream would really want this given the move to arch
timers, but I'm happy to post it.


>> * I believe there's some strange issues in terms of how the loops per
>> jiffy variable is initialized and how the "original CPU freq" is. I
>> know we ran into issues on big.LITTLE where the LITTLE cores came up
>> and clobbered the loops_per_jiffy variable but it was still doing math
>> based on the big cores.
>
> Hrm. I don't have a theory on this right now, but clearly there are
> issues to be resolved, so having your tests included would be a good
> thing to help find these issues.

Locally we added:
* https://chromium-review.googlesource.com/189725
init: Don't decrease loops_per_jiffy when a CPU comes up

...and that "fixed" things for us. Specifically what happened was:

- A15s boot up at 1.8GHz (though they can actually go up to 1.9/2.0).

- A7s boot up at ~600MHz and clobbered loops_per_jiffy with something tiny.

- In the first "cpufreq_callback" in "arch/arm/kernel/smp.c", we
stored the current _A15_ frequency upon the first cpufreq transition
and the current _A7_ loops per jiffy (since the A7s clobbered it).


It seemed like our situation was so messed up that upstream probably
wouldn't want the fix (using loops_per_jiffy in an HMP system is
pretty insane), but again I'm happy to send it up.


We also took out some IKS (big.LITTLE) attempt to fix up
loops_per_jiffy that was massively flawed in a number of ways
<https://chromium-review.googlesource.com/189728> but that's not
upstream.


-Doug
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/