Re: [PATCH v3 1/2] PCI: dwc: Implement general suspend/resume functionality for L2/L3 transitions

From: Frank Li
Date: Fri Jul 21 2023 - 10:10:51 EST


On Fri, Jul 21, 2023 at 10:09:18AM +0800, Shawn Lin wrote:
>
> On 2023/7/21 0:07, Manivannan Sadhasivam wrote:
> > On Thu, Jul 20, 2023 at 10:37:36AM -0400, Frank Li wrote:
> > > On Thu, Jul 20, 2023 at 07:55:09PM +0530, Manivannan Sadhasivam wrote:
> > > > On Tue, Jul 18, 2023 at 03:34:26PM +0530, Manivannan Sadhasivam wrote:
> > > > > On Mon, Jul 17, 2023 at 02:36:19PM -0400, Frank Li wrote:
> > > > > > On Mon, Jul 17, 2023 at 10:15:26PM +0530, Manivannan Sadhasivam wrote:
> > > > > > > On Wed, Apr 19, 2023 at 12:41:17PM -0400, Frank Li wrote:
> > > > > > > > Introduced helper function dw_pcie_get_ltssm to retrieve SMLH_LTSS_STATE.
> > > > > > > > Added API pme_turn_off and exit_from_l2 for managing L2/L3 state transitions.
> > > > > > > >
> > > > > > > > Typical L2 entry workflow:
> > > > > > > >
> > > > > > > > 1. Transmit PME turn off signal to PCI devices.
> > > > > > > > 2. Await link entering L2_IDLE state.
> > > > > > >
> > > > > > > AFAIK, typical workflow is to wait for PME_To_Ack.
> > > > > >
> > > > > > Step 1 already waits for PME_To_Ack; step 2 just waits for the link to
> > > > > > actually enter L2. I think the PCI RC needs some time to move the link
> > > > > > into L2 after it gets the PME_To_Ack.
> > > > > >
> > > >
> > > > One more comment. If you transition the device to L2/L3, then it can lose
> > > > power if Vaux is not provided. In that case, can all the devices work after
> > > > resume? Most notably NVMe?
> > >
> > > I do not have the hardware to run such a test. The NVMe driver will reinit
> > > everything after resume if there is no L1.1/L1.2 support. If L1.1/L1.2 is
> > > supported, NVMe expects the link to stay in L1.2 during suspend to get better
> > > resume latency.
> > >
> >
> > To be precise, NVMe driver will shutdown the device if there is no ASPM support
> > and keep it in low power mode otherwise (there are other cases as well but we do
> > not need to worry).
> >
> > But here you are not checking for ASPM state in the suspend path, and just
> > forcing the link to be in L2/L3 (thereby D3Cold) even though NVMe driver may
> > expect it to be in low power state like ASPM/APST.
> >
> > So you should only put the link into L2/L3 if there is no ASPM support.
> > Otherwise, you'll end up with bug reports when users connect NVMe to it.
> >
>
>
> At this topic, it's very interesting to look at
>
> drivers/pci/controller/dwc/pcie-tegra194.c
>
>
> static int tegra_pcie_dw_suspend_noirq(struct device *dev)
> {
> 	struct tegra_pcie_dw *pcie = dev_get_drvdata(dev);
>
> 	if (!pcie->link_state)
> 		return 0;
>
> 	tegra_pcie_downstream_dev_to_D0(pcie);
> 	tegra_pcie_dw_pme_turnoff(pcie);
> 	tegra_pcie_unconfig_controller(pcie);
>
> 	return 0;
> }
>
> It brings all the downstream components back to D0 (and, as I assume, the
> link back to L0) before sending PME_Turn_Off to enter L2.

If the current state is L1.1 or L1.2, the hardware can return to D0/L0
automatically when there is any PCI bus activity, including PME. I suppose
tegra_pcie_downstream_dev_to_D0() just makes sure the devices come back from
L2/L3, which they may have entered earlier through runtime PM or for some
other reason.

The NVMe ASPM problem is (at least when I debugged it on another platform
about a year ago):

1. NVMe does not release its MSI interrupt during suspend.
2. The PCI controller enters L2 in suspend_noirq().
3. CPU hotplug tries to take down the secondary cores (CORE1, CORE2, ...).
4. The GIC tries to disable the MSI irq by writing to config space.
5. Panic here, because config space can't be accessed in L2.

I suppose tegra would have a problem when ASPM is enabled with NVMe devices.
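
For illustration only, the check Mani suggests (skip L2 when ASPM L1 is
enabled) could look roughly like the userspace model below. This is a sketch,
not code from the patch: struct model_pcie, model_suspend_noirq() and
model_cfg_write() are hypothetical names, and a real driver would read Link
Control through DBI and send PME_Turn_Off instead of setting a flag. Only
PCI_EXP_LNKCTL_ASPM_L1 is a real definition, copied from pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Real bit from include/uapi/linux/pci_regs.h: ASPM L1 entry enabled. */
#define PCI_EXP_LNKCTL_ASPM_L1	0x0002

/* Hypothetical stand-in for controller state; NOT the real struct dw_pcie. */
struct model_pcie {
	uint16_t lnkctl;	/* modeled Link Control register value */
	bool link_in_l2;	/* true once the link has been put into L2 */
};

/*
 * Hypothetical suspend hook: leave the link alone when ASPM L1 is enabled,
 * so an NVMe device can stay in its own low power state; otherwise go to L2.
 */
static int model_suspend_noirq(struct model_pcie *pci)
{
	if (pci->lnkctl & PCI_EXP_LNKCTL_ASPM_L1)
		return 0;

	/* A real driver would send PME_Turn_Off and wait for L2 idle here. */
	pci->link_in_l2 = true;
	return 0;
}

/*
 * Hypothetical config space write, e.g. masking an MSI during CPU hotplug.
 * Returns -1 when the link is in L2 and config space is unreachable.
 */
static int model_cfg_write(const struct model_pcie *pci)
{
	return pci->link_in_l2 ? -1 : 0;
}
```

With lnkctl set to PCI_EXP_LNKCTL_ASPM_L1 the model keeps the link up and
model_cfg_write() succeeds after suspend. With lnkctl at 0 the link goes to
L2 and the later config write fails, which corresponds to step 5 above.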

Frank
>
> > - Mani
> >
> > > This API helps remove duplicate code and it can be improved gradually.
> > >
> > >
> > > >
> > > > - Mani
> > > >
> > > >
> > > > --
> > > > மணிவண்ணன் சதாசிவம்
> >