Re: [PATCH V1] mmc: sdhci-pci-gli: GL975[05]: Mask the replay timer timeout of AER

From: Victor Shih
Date: Fri Oct 06 2023 - 06:30:32 EST


On Mon, Oct 2, 2023 at 10:18 AM Kai-Heng Feng
<kai.heng.feng@xxxxxxxxxxxxx> wrote:
>
> Hi Victor,
>
> On Tue, Sep 26, 2023 at 4:21 PM Victor Shih <victorshihgli@xxxxxxxxx> wrote:
> >
> > On Fri, Sep 22, 2023 at 3:11 PM Kai-Heng Feng
> > <kai.heng.feng@xxxxxxxxxxxxx> wrote:
> > >
> > > Hi Victor,
> > >
> > > On Wed, Sep 20, 2023 at 4:54 PM Victor Shih <victorshihgli@xxxxxxxxx> wrote:
> > > >
> > > > On Tue, Sep 19, 2023 at 3:31 PM Kai-Heng Feng
> > > > <kai.heng.feng@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > Hi Victor,
> > > > >
> > > > > On Tue, Sep 19, 2023 at 3:10 PM Victor Shih <victorshihgli@xxxxxxxxx> wrote:
> > > > > >
> > > > > > On Tue, Sep 19, 2023 at 12:24 PM Kai-Heng Feng
> > > > > > <kai.heng.feng@xxxxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Hi Victor,
> > > > > > >
> > > > > > > On Mon, Sep 18, 2023 at 6:31 PM Victor Shih <victorshihgli@xxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > From: Victor Shih <victor.shih@xxxxxxxxxxxxxxxxxxx>
> > > > > > > >
> > > > > > > > Due to a flaw in the hardware design, the GL975x replay timer frequently
> > > > > > > > times out when ASPM is enabled. As a result, the system will resume
> > > > > > > > immediately when it enters suspend. Therefore, the replay timer
> > > > > > > > timeout must be masked.
> > > > > > >
> > > > > > > This patch solves AER error when its PCI config gets accessed, but the
> > > > > > > AER still happens at system suspend:
> > > > > > >
> > > > > > > [ 1100.103603] ACPI: EC: interrupt blocked
> > > > > > > [ 1100.268244] ACPI: EC: interrupt unblocked
> > > > > > > [ 1100.326960] pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
> > > > > > > [ 1100.326991] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
> > > > > > > [ 1100.326993] pcieport 0000:00:1c.0: device [8086:7ab9] error status/mask=00001000/00002000
> > > > > > > [ 1100.326996] pcieport 0000:00:1c.0: [12] Timeout
> > > > > > >
> > > > > > > Kai-Heng
> > > > > > >
> > > > > >
> > > > > > Hi, Kai-Heng
> > > > > >
> > > > > > Could you try applying the patch and re-testing again after restarting
> > > > > > the system?
> > > > >
> > > > > Same issue happens after coldboot.
> > > > >
> > > > > > Because I applied the patch and restarted the system and it didn't happen.
> > > > > > The system can enter suspend normally.
> > > > > >
> > > > > > If you still have the issue after following the above instructions,
> > > > > > please provide me with your environment and I will verify it again.
> > > > >
> > > > > The patch gets applied on top of next-20230918. Please let me know
> > > > > what else you want to know.
> > > > >
> > > > > Kai-Heng
> > > > >
> > > >
> > > > Hi, Kai-Heng
> > > >
> > > > If I want to mask the replay timer timeout AER of the upper layer root port,
> > > > could you give me some suggestions?
> > > > Or could you provide sample code for my reference?
> > >
> > > I am not aware of any way to mask "replay timer timeout" from the root port.
> > > I wonder if the device supports D3hot? Or should it stay at D0 when
> > > ASPM L1.2 is enabled?
> > >
> > > Kai-Heng
> > >
> >
> > Hi, Kai-Heng
> >
> > Do you know any way to mask the replay timer timeout AER of the
> > upstream port from the device?
>
> Per PCIe Spec, I don't think it's possible to only mask 'replay timer timeout'.
>
> > The device supports D3hot.
>
> Do you think such an error plays any crucial role? Otherwise, disabling
> 'correctable' errors may be plausible.
>
> Kai-Heng
>

Hi, Kai-Heng

Due to a flaw in the hardware design, the GL975x replay timer frequently
times out when ASPM is enabled.
This patch solves the AER error of the replay timer timeout for GL975x.
We have not encountered any other errors so far.
By 'correctable' errors, do you mean the AER error of the replay timer timeout?
May I ask if you have any other comments on this patch?
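
For reference, here is a minimal user-space sketch (plain C, not kernel
code) of how the "status/mask=00001000/00002000" pair in the root port log
quoted above decodes. The bit positions follow the AER Correctable Error
Status/Mask register layout; the helper names are hypothetical and exist
only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions shared by the AER Correctable Error Status and Mask registers. */
#define AER_COR_REPLAY_TIMER_TIMEOUT (UINT32_C(1) << 12)
#define AER_COR_ADVISORY_NONFATAL    (UINT32_C(1) << 13)

/* Sanity check: bit 12 matches the "[12] Timeout" line in the log. */
static_assert(AER_COR_REPLAY_TIMER_TIMEOUT == 0x1000, "bit 12 is 0x1000");

/* Hypothetical helper: true if the given error bit is set in the status value. */
static inline bool aer_cor_is_reported(uint32_t status, uint32_t bit)
{
	return (status & bit) != 0;
}

/* Hypothetical helper: true if the given error bit is set in the mask value. */
static inline bool aer_cor_is_masked(uint32_t mask, uint32_t bit)
{
	return (mask & bit) != 0;
}

/*
 * Hypothetical helper: the read-modify-write step used in the patch,
 * applied to a plain value instead of a config-space register.
 */
static inline uint32_t aer_cor_mask_replay_timer(uint32_t mask)
{
	return mask | AER_COR_REPLAY_TIMER_TIMEOUT;
}
```

With the values from the log, bit 12 (Replay Timer Timeout) is set in the
status and is not masked on the root port, while bit 13 (Advisory Non-Fatal)
is the only masked bit. Setting bit 12 in that mask value would give
0x00003000. Note that the patch sets this bit in the GL975x device's own
register, whereas the log above comes from the root port at 0000:00:1c.0.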

Thanks, Victor Shih

> >
> > Thanks, Victor Shih
> >
> > > >
> > > > Thanks, Victor Shih
> > > >
> > > > > >
> > > > > > Thanks, Victor Shih
> > > > > >
> > > > > > > >
> > > > > > > > Signed-off-by: Victor Shih <victor.shih@xxxxxxxxxxxxxxxxxxx>
> > > > > > > > ---
> > > > > > > > drivers/mmc/host/sdhci-pci-gli.c | 16 ++++++++++++++++
> > > > > > > > 1 file changed, 16 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/drivers/mmc/host/sdhci-pci-gli.c b/drivers/mmc/host/sdhci-pci-gli.c
> > > > > > > > index d83261e857a5..d8a991b349a8 100644
> > > > > > > > --- a/drivers/mmc/host/sdhci-pci-gli.c
> > > > > > > > +++ b/drivers/mmc/host/sdhci-pci-gli.c
> > > > > > > > @@ -28,6 +28,9 @@
> > > > > > > > #define PCI_GLI_9750_PM_CTRL 0xFC
> > > > > > > > #define PCI_GLI_9750_PM_STATE GENMASK(1, 0)
> > > > > > > >
> > > > > > > > +#define PCI_GLI_9750_CORRERR_MASK 0x214
> > > > > > > > +#define PCI_GLI_9750_CORRERR_MASK_REPLAY_TIMER_TIMEOUT BIT(12)
> > > > > > > > +
> > > > > > > > #define SDHCI_GLI_9750_CFG2 0x848
> > > > > > > > #define SDHCI_GLI_9750_CFG2_L1DLY GENMASK(28, 24)
> > > > > > > > #define GLI_9750_CFG2_L1DLY_VALUE 0x1F
> > > > > > > > @@ -152,6 +155,9 @@
> > > > > > > > #define PCI_GLI_9755_PM_CTRL 0xFC
> > > > > > > > #define PCI_GLI_9755_PM_STATE GENMASK(1, 0)
> > > > > > > >
> > > > > > > > +#define PCI_GLI_9755_CORRERR_MASK 0x214
> > > > > > > > +#define PCI_GLI_9755_CORRERR_MASK_REPLAY_TIMER_TIMEOUT BIT(12)
> > > > > > > > +
> > > > > > > > #define SDHCI_GLI_9767_GM_BURST_SIZE 0x510
> > > > > > > > #define SDHCI_GLI_9767_GM_BURST_SIZE_AXI_ALWAYS_SET BIT(8)
> > > > > > > >
> > > > > > > > @@ -561,6 +567,11 @@ static void gl9750_hw_setting(struct sdhci_host *host)
> > > > > > > > value &= ~PCI_GLI_9750_PM_STATE;
> > > > > > > > pci_write_config_dword(pdev, PCI_GLI_9750_PM_CTRL, value);
> > > > > > > >
> > > > > > > > + /* mask the replay timer timeout of AER */
> > > > > > > > + pci_read_config_dword(pdev, PCI_GLI_9750_CORRERR_MASK, &value);
> > > > > > > > + value |= PCI_GLI_9750_CORRERR_MASK_REPLAY_TIMER_TIMEOUT;
> > > > > > > > + pci_write_config_dword(pdev, PCI_GLI_9750_CORRERR_MASK, value);
> > > > > > > > +
> > > > > > > > gl9750_wt_off(host);
> > > > > > > > }
> > > > > > > >
> > > > > > > > @@ -770,6 +781,11 @@ static void gl9755_hw_setting(struct sdhci_pci_slot *slot)
> > > > > > > > value &= ~PCI_GLI_9755_PM_STATE;
> > > > > > > > pci_write_config_dword(pdev, PCI_GLI_9755_PM_CTRL, value);
> > > > > > > >
> > > > > > > > + /* mask the replay timer timeout of AER */
> > > > > > > > + pci_read_config_dword(pdev, PCI_GLI_9755_CORRERR_MASK, &value);
> > > > > > > > + value |= PCI_GLI_9755_CORRERR_MASK_REPLAY_TIMER_TIMEOUT;
> > > > > > > > + pci_write_config_dword(pdev, PCI_GLI_9755_CORRERR_MASK, value);
> > > > > > > > +
> > > > > > > > gl9755_wt_off(pdev);
> > > > > > > > }
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.25.1
> > > > > > > >