Re: [PATCH v5] PCI: Check for PCIe downtraining conditions

From: Alex G.
Date: Mon Jul 23 2018 - 19:59:30 EST




On 07/23/2018 05:14 PM, Jakub Kicinski wrote:
On Tue, 24 Jul 2018 00:52:22 +0300, Tal Gilboa wrote:
On 7/24/2018 12:01 AM, Jakub Kicinski wrote:
On Mon, 23 Jul 2018 15:03:38 -0500, Alexandru Gagniuc wrote:
PCIe downtraining happens when both the device and PCIe port are
capable of a larger bus width or higher speed than negotiated.
Downtraining might be indicative of other problems in the system, and
identifying this from userspace is neither intuitive, nor
straightforward.

The easiest way to detect this is with pcie_print_link_status(),
since the bottleneck is usually the link that is downtrained. It's not
a perfect solution, but it works extremely well in most cases.

Signed-off-by: Alexandru Gagniuc <mr.nuke.me@xxxxxxxxx>
---

For the sake of review, I've created a __pcie_print_link_status() which
takes a 'verbose' argument. If we agree want to go this route, and update
the users of pcie_print_link_status(), I can split this up in two patches.
I prefer just printing this information in the core functions, and letting
drivers not have to worry about this. Though there seems to be strong for
not going that route, so here it goes:

FWIW the networking drivers print PCIe BW because sometimes the network
bandwidth is simply over-provisioned on multi port cards, e.g. 80Gbps
card on a x8 link.

Sorry to bike shed, but currently the networking cards print the info
during probe. Would it make sense to move your message closer to probe
time? Rather than when device is added. If driver structure is
available, we could also consider adding a boolean to struct pci_driver
to indicate if driver wants the verbose message? This way we avoid
duplicated prints.

I have no objection to current patch, it LGTM. Just a thought.

I don't see the reason for having two functions. What's the problem with
adding the verbose argument to the original function?

IMHO it's reasonable to keep the default parameter to what 90% of users
want by a means on a wrapper. The non-verbose output is provided by
the core already for all devices.

What do you think of my proposal above Tal? That would make the extra
wrapper unnecessary since the verbose parameter would be part of the
driver structure, and it would avoid the duplicated output.

I see how it might make sense to add another member to the driver struct, but is it worth the extra learning curve? It seems to be something with the potential to confuse new driver developers, and having a very marginal benefit.
Although, if that's what people want...

Alex