Re: [PATCH v9 00/14] pci: Work around ASMedia ASM2824 PCIe link training failures

From: Maciej W. Rozycki
Date: Wed Jun 14 2023 - 20:41:17 EST


On Wed, 14 Jun 2023, Bjorn Helgaas wrote:

> > This is v9 of the change to work around a PCIe link training phenomenon
> > where a pair of devices both capable of operating at a link speed above
> > 2.5GT/s seems unable to negotiate the link speed and continues training
> > indefinitely with the Link Training bit switching on and off repeatedly
> > and the data link layer never reaching the active state.
> >
> > With several requests addressed and a few extra issues spotted this
> > version has now grown to 14 patches. It has been verified for device
> > enumeration with and without PCI_QUIRKS enabled, using the same piece of
> > RISC-V hardware as previously. Hot plug or reset events have not been
> > verified, as this is difficult if at all feasible with hardware in
> > question.
> >
> > Last iteration:
> > <https://lore.kernel.org/r/alpine.DEB.2.21.2304060100160.13659@xxxxxxxxxxxxxxxxx/>,
> > and my input to it:
> > <https://lore.kernel.org/r/alpine.DEB.2.21.2306080224280.36323@xxxxxxxxxxxxxxxxx/>.
>
> Thanks, I applied these to pci/enumeration for v6.5.

Great, thanks!

> I tweaked a few things, so double-check to be sure I didn't break
> something:
>
> - Moved dev->link_active_reporting init to set_pcie_port_type()
> because it does other PCIe-related stuff.
>
> - Reordered to keep all the link_active_reporting things together.
>
> - Reordered to clean up & factor pcie_retrain_link() before exposing
> it to the rest of the PCI core.
>
> - Moved pcie_retrain_link() a little earlier to keep it next to
> pcie_wait_for_link_status().
>
> - Squashed the stubs into the actual quirk so we don't have the
> intermediate state where we call the stubs but they never do
> anything (let me know if there's a reason we need your order).
>
> - Inline pcie_parent_link_retrain(), which seemed like it didn't add
> enough to be worthwhile.

Ack, I'll double-check and report back. A minor nit I've spotted below:

> static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
> {
> - bool retrain = true;
> int delay = 1;
> + bool retrain = false;
> + struct pci_dev *bridge;
> +
> + if (pci_is_pcie(dev)) {
> + retrain = true;
> + bridge = pci_upstream_bridge(dev);
> + }

If doing it this way, which I actually like, I think it would be a little
bit better performance- and style-wise if this was written as:

if (pci_is_pcie(dev)) {
bridge = pci_upstream_bridge(dev);
retrain = !!bridge;
}

(or "retrain = bridge != NULL" if you prefer this style), and then we
don't have to repeatedly check two variables iff (pcie && !bridge) in the
loop below:

> @@ -1201,9 +1190,9 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
> }
>
> if (delay > PCI_RESET_WAIT) {
> - if (retrain) {
> + if (retrain && bridge) {

-- i.e. code can stay then as:

if (retrain) {

here. I hope you find this observation rather obvious, so will you amend
your tree, or shall I send an incremental update?

Otherwise I don't find anything suspicious with the interdiff itself
(thanks for posting it, that's really useful indeed!), but as I say I'll
yet double-check how things look and work with your tree. Hopefully
tomorrow (Thu), as I have other stuff yet to complete tonight.

Maciej