Re: [PATCH] Revert "PCI: dwc: Wait for link up only if link is started"

From: Johan Hovold
Date: Wed Jul 26 2023 - 04:40:52 EST


On Tue, Jul 25, 2023 at 03:05:15PM -0500, Bjorn Helgaas wrote:
> [+cc Fabio, Xiaolei, Jon]
>
> On Thu, Jul 06, 2023 at 10:26:10AM +0200, Johan Hovold wrote:
> > This reverts commit da56a1bfbab55189595e588f1d984bdfb5cf5924.
> >
> > A recent commit broke controller probe by returning an error in case the
> > link does not come up during host initialisation.
> >
> > As explained in commit 886a9c134755 ("PCI: dwc: Move link handling into
> > common code") and as indicated by the comment "Ignore errors, the link
> > may come up later" in the code, waiting for link up and ignoring errors
> > is the intended behaviour:
> >
> > Let's standardize this to succeed as there are usecases where
> > devices (and the link) appear later even without hotplug. For
> > example, a reconfigured FPGA device.
> >
> > Reverting the offending commit specifically fixes a regression on
> > Qualcomm platforms like the Lenovo ThinkPad X13s which no longer reach
> > the interconnect sync state if a slot does not have a device populated
> > (e.g. an optional modem).
> >
> > Note that enabling asynchronous probing by default as was done for
> > Qualcomm platforms by commit c0e1eb441b1d ("PCI: qcom: Enable async
> > probe by default"), should take care of any related boot time concerns.
> >
> > Finally, note that the intel-gw driver is the only driver currently not
> > providing a start_link callback and instead starts the link in its
> > host_init callback, and which may avoid an additional one-second timeout
> > during probe by making the link-up wait conditional. If anyone cares,
> > that can be done in a follow-up patch with a proper motivation.
> >
> > Fixes: da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started")
> > Reported-by: Bjorn Andersson <quic_bjorande@xxxxxxxxxxx>
> > Cc: Sajid Dalvi <sdalvi@xxxxxxxxxx>
> > Cc: Ajay Agarwal <ajayagarwal@xxxxxxxxxx>
> > Signed-off-by: Johan Hovold <johan+linaro@xxxxxxxxxx>
>
> da56a1bfbab5 appeared in v6.5-rc1, so we should definitely fix this
> before v6.5.
>
> Based on the conversation here, I applied this to for-linus for v6.5.
>
> I looked for Bjorn A's report but couldn't find it; I'd like to
> include the URL if there is one. I did add the reports from Fabio
> Estevam, Xiaolei Wang, and Jon Hunter (Fabio and Xiaolei even included
> patches).

Bjorn only mentioned to me off-list that that something was wrong with
PCI in linux-next and that it prevented us from reaching the sync state.
So there is no URL to a public report to link to.

> Current commit log, corrections/additions welcome:
>
> This reverts commit da56a1bfbab55189595e588f1d984bdfb5cf5924.
>
> Bjorn Andersson, Fabio Estevam, Xiaolei Wang, and Jon Hunter reported that
> da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started") broke
> controller probing by returning an error in case the link does not come up
> during host initialisation, e.g., when the slot is empty.

I think that adding the corresponding Reported-by tags and Links after
my SoB below should be enough to credit reports that I was not aware of
when investigating this.

But if you decide to rewrite this paragraph, then please spell out "for
example" as I would not use "e.g." outside of parentheses.

> As explained in commit 886a9c134755 ("PCI: dwc: Move link handling into
> common code") and as indicated by the comment "Ignore errors, the link may
> come up later" in the code, waiting for link up and ignoring errors is the
> intended behaviour:
>
> Let's standardize this to succeed as there are usecases where devices
> (and the link) appear later even without hotplug. For example, a
> reconfigured FPGA device.
>
> Reverting the offending commit specifically fixes a regression on Qualcomm
> platforms like the Lenovo ThinkPad X13s which no longer reach the
> interconnect sync state if a slot does not have a device populated (e.g. an
> optional modem).
>
> Note that enabling asynchronous probing by default as was done for Qualcomm
> platforms by commit c0e1eb441b1d ("PCI: qcom: Enable async probe by
> default"), should take care of any related boot time concerns.
>
> Finally, note that the intel-gw driver is the only driver currently not
> providing a .start_link() callback and instead starts the link in its
> .host_init() callback, which may avoid an additional one-second timeout
> during probe by making the link-up wait conditional. If anyone cares, that
> can be done in a follow-up patch with a proper motivation.
>
> [bhelgaas: add Fabio Estevam, Xiaolei Wang, Jon Hunter reports]
> Fixes: da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started")
> Link: https://lore.kernel.org/r/20230704122635.1362156-1-festevam@xxxxxxxxx/
> Link: https://lore.kernel.org/r/20230705010624.3912934-1-xiaolei.wang@xxxxxxxxxxxxx/
> Link: https://lore.kernel.org/r/6ca287a1-6c7c-7b90-9022-9e73fb82b564@xxxxxxxxxx
> Link: https://lore.kernel.org/r/20230706082610.26584-1-johan+linaro@xxxxxxxxxx
> Reported-by: Bjorn Andersson <quic_bjorande@xxxxxxxxxxx>
> Reported-by: Fabio Estevam <festevam@xxxxxxxxx>
> Reported-by: Xiaolei Wang <xiaolei.wang@xxxxxxxxxxxxx>
> Reported-by: Jon Hunter <jonathanh@xxxxxxxxxx>
> Signed-off-by: Johan Hovold <johan+linaro@xxxxxxxxxx>
> Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> Cc: Sajid Dalvi <sdalvi@xxxxxxxxxx>
> Cc: Ajay Agarwal <ajayagarwal@xxxxxxxxxx>

Looks like you've "sorted" the trailers here instead of keeping the
temporal order (which would make it more clear what you added after I
posted the patch) and adding each Link after its corresponding
Reported-by tag (e.g. as suggested by checkpatch these days).

Johan