Re: iTCO_wdt watchdog on Asus P10S-WS motherboard FREEZES MOTHERBOARD COMPLETELY

From: Mika Westerberg
Date: Tue Sep 20 2016 - 08:50:21 EST


On Thu, Sep 08, 2016 at 07:01:09PM +0200, David Madore wrote:
> TL;DR: the iTCO_wdt watchdog on the Asus P10S-WS motherboard, instead
> of rebooting the machine, places the motherboard in a completely
> nonfunctional state, from which it can be revived only by a hard power
> cycle. I suspect this is a BIOS bug: seeking advice on how/where to
> report this, and what to do generally. Maybe Linux can work around?
>
>
> Dear list,
>
> I have an Asus P10S-WS motherboard (Intel C236 chipset). I have been
> trying to get the iTCO_wdt hardware watchdog to work (I have been
> successfully using this driver with similar Intel chipset based Asus
> motherboards before, and I know it to work reliably). I am using
> Linux 4.7.3.
>
> I trigger a reboot by killing (with kill -9) the wd_keepalive daemon
> once it has opened the watchdog device.
>
> Sadly, it appears that on this motherboard, the watchdog does not
> reboot the machine (or at least, does not successfully reboot it).
> Instead, the machine enters a "frozen" state (fans spinning, screen
> black, all peripherals unresponsive) from which it cannot be woken up
> by pressing the reset button, or even the power button twice (the
> first press does turn the machine off, but it returns to the same
> nonfunctional state after power on). Instead, power has to be cut
> completely, at the power supply level.
>
> In this nonfunctional state, the Asus POST status display shows the
> number "62", which according to the motherboard manual is the code for
> "installation of the PCH runtime services" (I have no idea of what
> that means).
>
> I suspect that this is a BIOS ^W UEFI bug and in no way Linux's fault.
> It could also be a hardware problem, a chipset bug, or something else.
> And even if it is a firmware bug, it is conceivable that there is a
> way to work around the problem from Linux. So I ask for guidance from
> the wisdom of this list:
>
> * Is there something Linux can do about the problem?
>
> * Is there a chance some kernel developer knows someone at Asus and
> can bring this problem to their attention?
>
> * Can someone report success using the iTCO_wdt watchdog with other
> motherboards having the same Intel C236 chipset? (Note: for it to
> work, the i2c_smbus module needs to be loaded: it took me a long
> time to figure out.)
>
> * Is all hope lost for my motherboard? (I badly need a hardware
> watchdog: if there is no way to get it to work on this motherboard,
> I will need to buy a new one.)
>
> Any suggestions are welcome (or even words of comfort :-).

Does the machine have a WDAT ACPI table (see /sys/firmware/acpi/tables/*)?
If it does, you can try the new WDAT watchdog driver instead [1]. It
still uses the same hardware, but via a set of instructions provided by
the BIOS, which should work (given that the vendor has tested it on
Windows).

[1] http://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg1230607.html
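
A quick way to do the WDAT check is sketched below. This is only an
illustration, not part of the driver or of any existing tool: the helper
name find_wdat_tables is made up here, and it assumes the table shows up
in sysfs under a name beginning with its "WDAT" signature.

```python
from pathlib import Path

# Directory where the kernel exposes the raw ACPI tables.
ACPI_TABLES_DIR = Path("/sys/firmware/acpi/tables")


def find_wdat_tables(tables_dir=ACPI_TABLES_DIR):
    """Return the sorted names of files in tables_dir that start with 'WDAT'.

    Hypothetical helper: it assumes the sysfs entry is named after the
    table signature. An empty list means no such table was found, or
    that the directory does not exist.
    """
    tables_dir = Path(tables_dir)
    if not tables_dir.is_dir():
        return []
    return sorted(
        entry.name
        for entry in tables_dir.iterdir()
        if entry.is_file() and entry.name.startswith("WDAT")
    )


if __name__ == "__main__":
    found = find_wdat_tables()
    if found:
        print("WDAT table present:", ", ".join(found))
    else:
        print("No WDAT table found")
```

Listing the directory is enough for this check; reading the table
contents usually requires root.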