RE: HW power fault defect cause system hang on kernel 5.4.y

From: Bao, Joseph
Date: Wed Nov 10 2021 - 21:17:12 EST


Hi Bjorn,

Thanks for the encouragement! Stuart has already helped patch the hang issue. Should I still go and open a report at https://bugzilla.kernel.org?

Regards
Joseph

-----Original Message-----
From: Bjorn Helgaas <helgaas@xxxxxxxxxx>
Sent: Tuesday, November 9, 2021 11:30 PM
To: Bao, Joseph <joseph.bao@xxxxxxxxx>
Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>; linux-pci@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Stuart Hayes <stuart.w.hayes@xxxxxxxxx>; Lukas Wunner <lukas@xxxxxxxxx>
Subject: Re: HW power fault defect cause system hang on kernel 5.4.y

On Tue, Nov 09, 2021 at 07:59:59AM +0000, Bao, Joseph wrote:
> Hi Lukas/Stuart,
> I want to follow up with you on whether the system hang is expected when
> the HW has a defect that keeps PCI_EXP_SLTSTA_PFD always HIGH.

A system hang in response to a hardware defect like this is never the expected situation. Worst case, we should be able to work around it with a quirk. Far better would be a generic fix that could recognize and deal with the situation even without a quirk.

But I don't know the fix yet. I'm just responding to encourage you to keep pestering us and not give up :) In the meantime, it might be worth opening a report at https://bugzilla.kernel.org with a description of how you trigger the problem, and attaching the complete dmesg log and "sudo lspci -vv" output.

Bjorn
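
For readers following the thread, here is a minimal sketch of how software might recognize the condition discussed above: a Power Fault Detected bit that stays set even after it is written to clear. It is only an illustration and is not taken from any patch in this thread. The bit masks match the names in the Linux header include/uapi/linux/pci_regs.h; the FakeSlot class and the pfd_is_stuck() helper are hypothetical names that model the write-1-to-clear register in plain Python rather than touching real hardware.

```python
# Bit masks for the PCIe Slot Status register, using the names from
# include/uapi/linux/pci_regs.h.
PCI_EXP_SLTSTA_ABP = 0x0001  # Attention Button Pressed
PCI_EXP_SLTSTA_PFD = 0x0002  # Power Fault Detected
PCI_EXP_SLTSTA_PDC = 0x0008  # Presence Detect Changed


class FakeSlot:
    """Hypothetical model of a write-1-to-clear (RW1C) Slot Status register.

    Bits listed in stuck_bits model a hardware defect: they read back
    as 1 no matter how often software tries to clear them.
    """

    def __init__(self, status, stuck_bits=0):
        self.stuck_bits = stuck_bits
        self.status = status | stuck_bits

    def read_status(self):
        return self.status

    def write_status(self, value):
        # RW1C: writing 1 clears a bit, unless the hardware holds it high.
        self.status = (self.status & ~value) | self.stuck_bits


def pfd_is_stuck(slot):
    """Return True if PFD is still set right after an attempt to clear it."""
    if not slot.read_status() & PCI_EXP_SLTSTA_PFD:
        return False  # no power fault reported at all
    slot.write_status(PCI_EXP_SLTSTA_PFD)  # attempt to clear the bit
    return bool(slot.read_status() & PCI_EXP_SLTSTA_PFD)
```

A handler that detects this could stop treating the bit as a new event instead of servicing it repeatedly. Whether that is the right policy for a real driver is a separate question that this sketch does not answer.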