Re: Regression from dcadfd7f7c74ef9ee415e072a19bdf6c085159eb

From: Takashi Sakamoto
Date: Sun Dec 10 2023 - 03:05:05 EST


Hi Mario,

On Tue, Nov 28, 2023 at 12:09:41AM -0600, Mario Limonciello wrote:
> Can you check FCH::PM::S5_RESET_STATUS on next boot after failure has
> occurred? It is available at MMIO FED80300 or through indirect IO access at
> 0xC0.
>
> If MMIO doesn't work, double check FCH::PM_ISACONTROL bit 1 (described on
> page 296) to confirm if your system enables it.
>
> The meanings of the different bits can be found in a recent PPR:
> https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/55901_B1_pub_053.zip
>
> Indirect IO is described on PDF page 294.
>
> This will at least give us a hint what's going on in this case.

I attempt to check the above. As a result, the value of offset
::S5_RESET_STATUS is 0x00080800 every time after the unexpected reboot.
(both cases of lspci and 1394 OHCI driver). The 'mmioen' bit of
::ISACONTROL offset is always enabled.

According to the document, the bit 20 of ::S5_RESET_STATUS expresses
'do_k8_full_reset' to mean that the system reboot was caused by
'CF9 = 0x0E'. But I have no idea about it...

For the attempt, I found a patch to i2c-piix4[1] and applied it with
slight fix, like:

```
diff --git a/drivers/i2c/busses/i2c-piix4.c b/drivers/i2c/busses/i2c-piix4.c
index 809fbd014cd6..11c1ba3afa76 100644
--- a/drivers/i2c/busses/i2c-piix4.c
+++ b/drivers/i2c/busses/i2c-piix4.c
@@ -99,7 +99,9 @@
#define SB800_PIIX4_PORT_IDX_SHIFT_KERNCZ 3

#define SB800_PIIX4_FCH_PM_ADDR 0xFED80300
-#define SB800_PIIX4_FCH_PM_SIZE 8
+#define SB800_PIIX4_FCH_PM_SIZE 0x100
+#define SB800_PIIX4_FCH_PM_OFFSET_ISACONTROL 0x04
+#define SB800_PIIX4_FCH_PM_OFFSET_S5_RESET_STATUS 0xc0

/* insmod parameters */

@@ -200,6 +202,9 @@ static int piix4_sb800_region_request(struct device *dev,

mmio_cfg->addr = addr;

+ dev_info(dev, "ISACONTROL = 0x%08x", ioread32(addr + SB800_PIIX4_FCH_PM_OFFSET_ISACONTROL));
+ dev_info(dev, "S5_RESET_STATUS = 0x%08x", ioread32(addr + SB800_PIIX4_FCH_PM_OFFSET_S5_RESET_STATUS));
+
return 0;
}
```

> On 12/3/2023 06:29, Takashi Sakamoto wrote:
> > In lspci case, I can work with debugger and figure out that 'pread(2)' to
> > file descriptor for 'config' node in sysfs causes the unexpected system
> > reboot. Additionally I can regenerate it by hexdump(1) to the node:
>
> OK - is this by chance related to access to PCI extended config space
> failing for this device then? If you read just the first 256 bytes it's ok,
> but beyond that it fails?

I can regenerate unexpected system reboot even if readling from the node
enough shorter than 256 bytes. This time I use dd(1) for the purpose since
hexdump uses stream I/O API (reads 4096 bytes once).

> If so, can you please try to reproduce using this series from Bjorn applied:
> https://lore.kernel.org/r/20231121183643.249006-1-helgaas@xxxxxxxxxx
>
> And then add this to kernel command line:
> efi=debug "dyndbg=file arch/x86/pci/* +p"
>
> Capture the dmesg and share it.

I try the series by backport way, but it takes a bit time.


[1] https://lore.kernel.org/lkml/20230407203720.18184-1-yazen.ghannam@xxxxxxx/T/

Thanks a lot

Takashi Sakamoto