Re: [PATCH v2 2/2] PCI: Add missing link delays required by the PCIe spec

From: Bjorn Helgaas
Date: Tue Oct 29 2019 - 16:55:01 EST


On Fri, Oct 04, 2019 at 03:39:47PM +0300, Mika Westerberg wrote:
> Currently Linux does not follow PCIe spec regarding the required delays
> after reset. A concrete example is a Thunderbolt add-in-card that
> consists of a PCIe switch and two PCIe endpoints:
> ...

> @@ -1025,15 +1025,11 @@ static void __pci_start_power_transition(struct pci_dev *dev, pci_power_t state)
> if (state == PCI_D0) {
> pci_platform_power_transition(dev, PCI_D0);
> /*
> - * Mandatory power management transition delays, see
> - * PCI Express Base Specification Revision 2.0 Section
> - * 6.6.1: Conventional Reset. Do not delay for
> - * devices powered on/off by corresponding bridge,
> - * because have already delayed for the bridge.
> + * Mandatory power management transition delays are handled
> + * in pci_pm_runtime_resume() of the corresponding
> + * downstream/root port.
> */
> if (dev->runtime_d3cold) {
> - if (dev->d3cold_delay && !dev->imm_ready)
> - msleep(dev->d3cold_delay);

This removes the only use of d3cold_delay. I assume that's
intentional? If we no longer need it, we might as well remove it from
the pci_dev and remove the places that set it. It'd be nice if that
could be a separate patch, even if we waited a little longer than
necessary at that one bisection point.

It also removes one of the three uses of imm_ready, leaving only the
two in FLR. I suspect there are other places we should use imm_ready,
e.g., transitions to/from D1 and D2, but that would be beyond the
scope of this patch.

> + /*
> + * For PCIe downstream and root ports that do not support speeds
> + * greater than 5 GT/s need to wait minimum 100 ms. For higher
> + * speeds (gen3) we need to wait first for the data link layer to
> + * become active.
> + *
> + * However, 100 ms is the minimum and the PCIe spec says the
> + * software must allow at least 1s before it can determine that the
> + * device that did not respond is a broken device. There is
> + * evidence that 100 ms is not always enough, for example certain
> + * Titan Ridge xHCI controller does not always respond to
> + * configuration requests if we only wait for 100 ms (see
> + * https://bugzilla.kernel.org/show_bug.cgi?id=203885).
> + *
> + * Therefore we wait for 100 ms and check for the device presence.
> + * If it is still not present give it an additional 100 ms.
> + */
> + if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT &&
> + pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM)
> + return;

Shouldn't this be:

if (!pcie_downstream_port(dev))
return

so we include PCI/PCI-X to PCIe bridges?