Re: watchdog: pcf2127: systemd fails on 5.11

From: Bruno Thomsen
Date: Wed Feb 24 2021 - 10:31:49 EST


On Mon, Feb 22, 2021 at 23:43, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> On Thu, Feb 18, 2021 at 01:35:36PM +0100, Bruno Thomsen wrote:
> > Hi,
> >
> > After updating the kernel from 5.8.17 to 5.11 systemd (246.6) is
> > unable to init watchdog in pcf2127 during boot. Kernel option
> > CONFIG_WATCHDOG_OPEN_TIMEOUT=300 is working as expected.
> > It's possible to get watchdog from userspace working in
> > the following 2 ways.
> > 1) Disable watchdog in systemd and use busybox watchdog.
> > 2) Restart systemd after boot with "kill 1".
> >
> > During boot setting the system clock from RTC is working.
> > RTC read/write from userland with hwclock is also working.
> >
> > DTS: imx7d-flex-concentrator-mfg.dts
> > SOC: NXP i.MX7D
> > Drivers: rtc-pcf2127, spi-imx
> > Communication: SPI
> >
> > There are no patches applied to the kernel.
> >
> > When systemd changes watchdog timeout it receives an
> > error that to our best knowledge comes from spi-imx[1].
> >
> > We suspect it's a race condition between drivers or
> > incompatible error handling.
> >
> > Any help in investigating the issue is appreciated.
> >
> Difficult to say without access to hardware. The code does have a
> potential problem, though: It calls pcf2127_wdt_ping not only from
> watchdog code but also from various rtc related functions, but there
> is no access protection. This is even more concerning because the ping
> function is called from an interrupt handler. At the same time, the
> watchdog initialization sets min_hw_heartbeat_ms to 500, which suggests
> that there may be a minimum time between heartbeats (which is clearly
> violated by the current code).

Hi Guenter

Thanks for the input.

You could be right about that. I don't think the watchdog feature should
be available when the alarm feature is enabled, because of how the CTRL2
register behaves.

The hardware I am testing on is a custom board, but it's actually
possible to get a Raspberry Pi module called RasClock that has
the chip.

I will test some locking around WD_VAL register access, as that register
is written in the pcf2127_wdt_ping function.

My initial test shows that spin_lock_irqsave around the regmap calls is
not a good idea, as it results in:
BUG: scheduling while atomic: watchdog/70/0x00000002
BUG: scheduling while atomic: systemd/1/0x00000002
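
regmap over SPI can sleep, so it cannot be called with a spinlock held.
One possible direction is a sleeping lock instead. Below is a minimal
userspace sketch of that idea only, not driver code: a pthread mutex
stands in for a kernel mutex, a plain variable stands in for WD_VAL, and
all names (fake_wdt, fake_wdt_ping, MIN_HEARTBEAT_MS) are hypothetical.
It also skips a write that arrives sooner than 500 ms after the previous
one, which is my own simplification of the min_hw_heartbeat_ms concern
and not necessarily how the watchdog core handles it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors the min_hw_heartbeat_ms value mentioned in the thread. */
#define MIN_HEARTBEAT_MS 500

/* Hypothetical stand-in for the driver state; not taken from rtc-pcf2127.c. */
struct fake_wdt {
	pthread_mutex_t lock;   /* sleeping lock: the holder is allowed to block */
	uint8_t wd_val;         /* stand-in for the WD_VAL register */
	int64_t last_ping_ms;   /* time of the last accepted ping, -1 if none */
	unsigned int writes;    /* number of simulated register writes */
};

void fake_wdt_init(struct fake_wdt *w)
{
	assert(w != NULL);
	pthread_mutex_init(&w->lock, NULL);
	w->wd_val = 0;
	w->last_ping_ms = -1;
	w->writes = 0;
}

/*
 * Serialize every WD_VAL update behind one mutex, regardless of which
 * path (watchdog or rtc) requested it. Returns 1 if the register was
 * written, 0 if the ping was skipped because it came too early.
 */
int fake_wdt_ping(struct fake_wdt *w, uint8_t timeout, int64_t now_ms)
{
	int written = 0;

	pthread_mutex_lock(&w->lock);
	if (w->last_ping_ms < 0 ||
	    now_ms - w->last_ping_ms >= MIN_HEARTBEAT_MS) {
		w->wd_val = timeout;    /* the simulated register write */
		w->last_ping_ms = now_ms;
		w->writes++;
		written = 1;
	}
	pthread_mutex_unlock(&w->lock);

	return written;
}
```

A mutex is only usable if every caller runs in a context that may sleep,
so whether this maps onto the driver depends on how the interrupt
handler is registered. I have not verified that here.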

/Bruno