Re: [PATCH 4/4] ARM: multi_v7_defconfig: Switch AXP20x driver from module to built-in

From: Maxime Ripard
Date: Fri Jul 21 2017 - 10:17:27 EST


Hi,

On Wed, Jun 14, 2017 at 09:07:12AM +0200, Maxime Ripard wrote:
> On Tue, Jun 13, 2017 at 09:14:00AM -0700, Kevin Hilman wrote:
> > Maxime Ripard <maxime.ripard@xxxxxxxxxxxxxxxxxx> writes:
> >
> > > On Tue, Jun 06, 2017 at 12:45:17PM -0700, Kevin Hilman wrote:
> > >> On Mon, May 22, 2017 at 12:44 AM, Maxime Ripard
> > >> <maxime.ripard@xxxxxxxxxxxxxxxxxx> wrote:
> > >> > Hi Kevin,
> > >> >
> > >> > On Thu, May 18, 2017 at 11:59:50AM -0700, Kevin Hilman wrote:
> > >> >> On Fri, Mar 17, 2017 at 10:39 AM, Kevin Hilman <khilman@xxxxxxxxxxxx> wrote:
> > >> >> > On Fri, Feb 10, 2017 at 12:42 AM, Maxime Ripard
> > >> >> > <maxime.ripard@xxxxxxxxxxxxxxxxxx> wrote:
> > >> >> >> On Wed, Feb 08, 2017 at 11:09:31PM +0100, Rask Ingemann Lambertsen wrote:
> > >> >> >>> The AXP20X regulator support is currently built as a module, which means
> > >> >> >>> it's not available until the root fs has been mounted, but the boot loader
> > >> >> >>> might not have enabled the required regulators, so build their drivers
> > >> >> >>> into the kernel.
> > >> >> >>>
> > >> >> >>> Signed-off-by: Rask Ingemann Lambertsen <rask@xxxxxxxxxxxx>
> > >> >> >>
> > >> >> >> Queued for 4.12.
> > >> >> >
> > >> >> > Hello, kernelci.org is reporting boot failures on sun5i-r8-chip in
> > >> >> > linux-next[1] for a few days and with a variety of defconfigs. I
> > >> >> > bisected it[2] down to this patch.
> > >> >> >
> > >> >> > I verified that reverting this patch on top of next-20170310 makes my
> > >> >> > chip board boot again.
> > >> >>
> > >> >> FYI... this board is still broken in linux-next (and now in mainline),
> > >> >> and reverting $SUBJECT patch still makes it work.
> > >> >>
> > >> >> Is nobody else using mainline on this board?
> > >> >
> > >> > I thought about that during the weekend, and it might just be a
> > >> > symptom.
> > >> >
> > >> > The CHIP has brown out issues, especially when you enable the WiFi
> > >> > chip, which should happen around the time of the failure when the PMIC
> > >> > regulator support is compiled as a module.
> > >> >
> > >> > We mitigate that in upstream U-Boot by enabling the two regulators
> > >> > for the WiFi chip in U-Boot, which spreads the current draw a bit
> > >> > more evenly over the boot.
> > >> >
> > >> > You have a few ways to prevent that from happening. Having a better
> > >> > power supply / cable will help, I'm not sure how reasonable that is.
> > >> >
> > >> > Another thing that can work is, if your USB plugs can take it, to
> > >> > increase the overcurrent trigger in the PMIC, ideally in U-Boot.
> > >> >
> > >> > The last, and probably cleanest one, would be to just power it through
> > >> > the 5V input on its header, and not the USB. There's no current
> > >> > limit there, so it shouldn't cause any problems anymore.
> > >>
> > >> I'm now powering the board via the header (5V to the CHG-IN pin) and
> > >> it doesn't change anything. Still fails in the same way, and
> > >> reverting $SUBJECT defconfig patch makes it work again.
> > >
> > > I tried it today with sunxi_defconfig that has AXP20X_REGULATOR
> > > built-in as well. It can boot fine on my CHIP here.
> >
> > What about multi_v7_defconfig?
>
> It seems to work in our farm.
>
> It's lagging behind at the moment, so it hasn't been published yet,
> but here is the last multi_v7 boot.
> http://code.bulix.org/a43kkf-147625?raw

A bit of an update on that. It turned out that our farm also had this
issue. We tried to power the board through the 5V plug, and it didn't
change anything.

After wasting way too much time on this, we started digging into it
today with Chen-Yu.

After enabling DEBUG_DRIVER, we found that the last line printed was
always a cpufreq rate change. Removing the handle on the CPU regulator
didn't change anything, which left us with the other option: the clocks.

It turns out that in 4.12 we also switched to a new clock framework
for the sun5i family. A few printks later, we found that the clock
rate calculations were not propagated to the PLL, resulting in a CPU
crash.
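
As a rough illustration of that failure mode, here is a toy Python model
of rate propagation in a clock tree. It is not the sunxi-ng code: the
Clock class, the build_cpu_clock helper, the clock names and the rates
are all made up for the example, and the set_rate_parent attribute only
mimics the idea behind the common clock framework's CLK_SET_RATE_PARENT
flag.

```python
class Clock:
    """Toy clock node (hypothetical, for illustration; not kernel code)."""

    def __init__(self, name, rate=None, parent=None, set_rate_parent=False):
        self.name = name
        self._rate = rate
        self.parent = parent
        # Mimics the idea behind CLK_SET_RATE_PARENT: forward rate
        # requests to the parent instead of handling them locally.
        self.set_rate_parent = set_rate_parent

    @property
    def rate(self):
        # A pass-through clock (e.g. a mux) outputs its parent's rate.
        return self.parent.rate if self.parent else self._rate

    def set_rate(self, rate):
        if self.parent is None:
            # Root clock (the PLL in this model): actually change the rate.
            self._rate = rate
        elif self.set_rate_parent:
            # Forward the request up the tree.
            self.parent.set_rate(rate)
        # Otherwise the request stops here and nothing is reprogrammed.
        return self.rate


def build_cpu_clock(propagate):
    """Build a two-level tree: PLL -> CPU clock (names and rates assumed)."""
    pll = Clock("pll-core", rate=1_008_000_000)
    cpu = Clock("cpu", parent=pll, set_rate_parent=propagate)
    return pll, cpu


if __name__ == "__main__":
    for propagate in (False, True):
        pll, cpu = build_cpu_clock(propagate)
        actual = cpu.set_rate(624_000_000)
        print(f"propagate={propagate}: requested 624000000, running at {actual}")
```

With propagate=False the request stops at the CPU clock and the PLL
keeps its old rate, so the requested rate and the rate actually running
diverge. With propagate=True the request reaches the PLL. The model only
shows that mismatch, not the crash itself.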

Now, you might ask why it was crashing in multi_v7, and not in
sunxi_defconfig. The default governor in multi_v7 is ondemand, while
the one in sunxi_defconfig is performance, which never changes the CPU
clock rate.
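
To check which case a given board falls into, something like the
following Python sketch can read the active governor from the standard
cpufreq sysfs directory. The helper names and the grouping of governors
are assumptions made for the example; only the sysfs path and the
governor names come from the kernel.

```python
from pathlib import Path

# Standard cpufreq sysfs directory for CPU 0 on Linux. It is a parameter
# below so the helper can also be pointed at a fake tree.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")

# Governors that rescale the frequency on their own as load changes
# (grouping chosen for this example).
LOAD_DRIVEN_GOVERNORS = {"ondemand", "conservative", "schedutil"}


def read_governor(cpufreq_dir=CPUFREQ_DIR):
    """Return the name of the governor currently in use."""
    return (Path(cpufreq_dir) / "scaling_governor").read_text().strip()


def rescales_with_load(governor):
    """True if the governor changes the CPU rate by itself at runtime."""
    return governor in LOAD_DRIVEN_GOVERNORS


if __name__ == "__main__":
    gov = read_governor()
    print(f"governor: {gov}, rescales with load: {rescales_with_load(gov)}")
```

A board running ondemand would report True here and so would exercise
the rate-change path, while one running performance would report False.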

And I guess reverting the regulator option patch just prevented
cpufreq-dt from probing, since the CPU regulator described in the DT
was then missing.

I'll send a patch addressing this, cc'd to stable.

Maxime

--
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
