Re: [PATCH v5 5/5] PCI: Work around PCIe link training failures

From: Maciej W. Rozycki
Date: Tue Nov 29 2022 - 04:58:07 EST


On Wed, 9 Nov 2022, Alex Williamson wrote:

> > 05:00.0 supports the "bus" method, i.e., pci_reset_bus_function(),
> > which tries pci_dev_reset_slot_function() followed by
> > pci_parent_bus_reset(). Both of them return -ENOTTY if the device
> > (05:00.0) has a secondary bus ("dev->subordinate"), so I think nothing
> > happens here.
>
> Right, the pci-sysfs reset attribute is only meant for a reset scope
> limited to the device, we'd need something to call pci_reset_bus() to
> commit to the whole hierarchy, which is not something we typically do.
> vfio-pci will only bind to endpoint devices, so it shouldn't provide an
> interface to inject a bus reset here either.
>
> Based on the fact that there's a pericom switch in play here, I'll just
> note that I think this is the same device with other link speed issues
> as well:
>
> https://lore.kernel.org/all/20161026180140.23495.27388.stgit@xxxxxxxxxx/

Thanks for the pointer. This has been superseded by commit acd61ffb2f16
("PCI: Add ACS quirk for Pericom PI7C9X2G switches"), right? In which
case it is a match ([12d8:2304]), though the quirk does not trigger here,
i.e. no message is printed about store-forward mode activation:

pcieport 0000:05:00.0: calling pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:05:00.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 0 usecs
[...]
pci 0000:05:00.0: calling pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pci 0000:05:00.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 0 usecs
[...]
pcieport 0000:06:01.0: calling pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:06:01.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 3 usecs
[...]
pcieport 0000:06:02.0: calling pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:06:02.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 2 usecs

NB I don't know why the quirk for the upstream port (05:00.0) is called
twice, both via pcieport and via pci.
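
For reference, a minimal sketch of tallying such log lines per device, which makes the double invocation for 05:00.0 easy to spot. The function name fixup_callers and the regular expression are illustrative assumptions; they are not part of any kernel tooling, and the pattern is only fitted to the lines quoted above.

```python
import re
from collections import defaultdict

# Matches "calling" lines such as:
#   pcieport 0000:05:00.0: calling pci_fixup_...+0x0/0xba @ 1
# The "took N usecs" lines and "[...]" elisions are deliberately ignored.
CALL_RE = re.compile(
    r"^(?P<prefix>\S+) "
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"calling (?P<func>\w+)\+"
)


def fixup_callers(log_text):
    """Map each PCI device to the log prefixes under which a fixup was called."""
    callers = defaultdict(list)
    for line in log_text.splitlines():
        match = CALL_RE.match(line.strip())
        if match:
            callers[match.group("dev")].append(match.group("prefix"))
    return dict(callers)


# Log excerpt copied from the message above.
LOG = """\
pcieport 0000:05:00.0: calling pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:05:00.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 0 usecs
[...]
pci 0000:05:00.0: calling pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pci 0000:05:00.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 0 usecs
[...]
pcieport 0000:06:01.0: calling pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:06:01.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 3 usecs
[...]
pcieport 0000:06:02.0: calling pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:06:02.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 2 usecs
"""

if __name__ == "__main__":
    for dev, prefixes in fixup_callers(LOG).items():
        print(dev, prefixes)
```

Any device listed with more than one prefix had the fixup called more than once, here under both the pcieport and the pci prefix.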

> This fell off my plate some time ago, but as noted there, enabling ACS
> when the upstream and downstream ports run at different link rates
> exposes errata where packets are queued and not delivered within the
> switch.
>
> Could enabling ACS on this device be contributing to the issue here,
> for example triggering the Asmedia downstream port to get into this
> link resetting issue? A test with
> pci=disable_acs_redir=0000:06:01.0;0000:06:02.0 could be interesting
> assuming this occurs on a platform that has an IOMMU, i.e. calls
> pci_request_acs(). Thanks,

We have no IOMMU support for any RISC-V machine at the moment:

config ARCH_RV64I
	[...]
	select SWIOTLB if MMU

and:

software IO TLB: area num 4.
software IO TLB: mapped [mem 0x00000000fb732000-0x00000000ff732000] (64MB)

so IIUC this issue does not apply. Thank you for your input.
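
One way to cross-check the absence of an IOMMU from userspace is to look at the IOMMU groups exposed through sysfs. The following is a rough sketch only: the helper name iommu_group_count is hypothetical, it assumes sysfs is mounted at /sys, and a missing or empty /sys/kernel/iommu_groups directory is a heuristic hint rather than proof that no IOMMU is in use.

```python
import os


def iommu_group_count(sysfs_root="/sys"):
    """Return the number of IOMMU groups under sysfs, or None if the
    iommu_groups directory does not exist at all."""
    path = os.path.join(sysfs_root, "kernel", "iommu_groups")
    if not os.path.isdir(path):
        return None
    return len(os.listdir(path))


if __name__ == "__main__":
    count = iommu_group_count()
    if count is None:
        print("no iommu_groups directory found")
    else:
        print(f"{count} IOMMU group(s) present")
```

The sysfs_root parameter exists only so that the helper can be pointed at a test directory instead of the live system.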

Maciej