Re: Issues with "PCI/LINK: Report degraded links via link bandwidth notification"

From: Alex G.
Date: Tue Feb 02 2021 - 14:55:14 EST


On 1/29/21 3:56 PM, Bjorn Helgaas wrote:
On Thu, Jan 28, 2021 at 06:07:36PM -0600, Alex G. wrote:
On 1/28/21 5:51 PM, Sinan Kaya wrote:
On 1/28/2021 6:39 PM, Bjorn Helgaas wrote:
AFAICT, this thread petered out with no resolution.

If the bandwidth change notifications are important to somebody,
please speak up, preferably with a patch that makes the notifications
disabled by default and adds a parameter to enable them (or some other
strategy that makes sense).

I think these are potentially useful, so I don't really want to just
revert them, but if nobody thinks these are important enough to fix,
that's a possibility.

Hide behind a debug or expert option by default? Or even mark it as BROKEN
until someone fixes it?

Instead of making it a config option, wouldn't it be better as a kernel
parameter? People encountering this seem quite competent in passing kernel
arguments, so having a "pcie_bw_notification=off" would solve their
problems.

I don't want people to have to discover a parameter to solve issues.
If there's a parameter, notification should default to off, and people
who want notification should supply a parameter to enable it. Same
thing for the sysfs idea.

I can imagine cases where a per-port flag would be useful. For example,
take a machine with a NIC and a couple of PCIe storage drives. The PCIe
drives downtrain willy-nilly, so it's useful to turn off their
notifications, but the NIC absolutely must not downtrain. It's debatable
whether the flag should be default on or default off.
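
To make the per-port idea concrete, here is a rough userspace sketch of the policy I have in mind: a global default plus an optional override per port. Everything in it (the enum, the struct, bw_notif_enabled(), the BDF strings) is made up for illustration; nothing like this exists in the kernel today.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-port override; INHERIT means "use the global default". */
enum bw_notif_override {
	BW_NOTIF_INHERIT,
	BW_NOTIF_FORCE_ON,
	BW_NOTIF_FORCE_OFF,
};

/* Hypothetical table entry: one port, identified by its BDF string. */
struct port_policy {
	const char *bdf;
	enum bw_notif_override override;
};

/*
 * Decide whether notifications are enabled for the port named by bdf.
 * A matching FORCE_ON/FORCE_OFF entry wins; otherwise fall back to the
 * global default (which could itself be default-off).
 */
bool bw_notif_enabled(bool global_default, const struct port_policy *tbl,
		      size_t n, const char *bdf)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(tbl[i].bdf, bdf) != 0)
			continue;
		if (tbl[i].override == BW_NOTIF_FORCE_ON)
			return true;
		if (tbl[i].override == BW_NOTIF_FORCE_OFF)
			return false;
		break;
	}
	return global_default;
}
```

With a default-off global setting, the NIC's port would carry FORCE_ON and the storage ports would simply inherit the default.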

I think we really just need to figure out what's going on. Then it
should be clearer how to handle it. I'm not really in a position to
debug the root cause since I don't have the hardware or the time.

I wonder:
(a) if some PCIe devices are downtraining willy-nilly to save power
(b) if this willy-nilly downtraining somehow violates the PCIe spec
(c) what the official behavior is when downtraining is intentional

My theory is: YES, YES, ASPM. But I don't know how to figure this out
without having the problem hardware in hand.
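
For anyone who does have the hardware, a first data point is whether the link is running below what the port advertises. A minimal sketch of that comparison follows, assuming you already have the raw Link Capabilities and Link Status register values (for example from a config space dump). The mask values follow the PCIe register layout and mirror PCI_EXP_LNKCAP_SLS/MLW and PCI_EXP_LNKSTA_CLS/NLW in pci_regs.h; the struct and function names are mine. Note that it only says the link is below this port's own maximum, not why, and not whether the downtrain was intentional.

```c
#include <stdbool.h>
#include <stdint.h>

/* Link Capabilities: max speed in bits 3:0, max width in bits 9:4. */
#define LNKCAP_SLS	0x0000000fu
#define LNKCAP_MLW	0x000003f0u
/* Link Status: current speed in bits 3:0, negotiated width in bits 9:4. */
#define LNKSTA_CLS	0x000fu
#define LNKSTA_NLW	0x03f0u

/* Speed is the raw encoding (1 = 2.5 GT/s, 2 = 5 GT/s, ...), width in lanes. */
struct link_state {
	unsigned int speed;
	unsigned int width;
};

struct link_state lnkcap_decode(uint32_t lnkcap)
{
	struct link_state s = {
		.speed = lnkcap & LNKCAP_SLS,
		.width = (lnkcap & LNKCAP_MLW) >> 4,
	};
	return s;
}

struct link_state lnksta_decode(uint16_t lnksta)
{
	struct link_state s = {
		.speed = lnksta & LNKSTA_CLS,
		.width = (lnksta & LNKSTA_NLW) >> 4,
	};
	return s;
}

/* True if the link runs slower or narrower than the port's own maximum. */
bool link_is_degraded(uint32_t lnkcap, uint16_t lnksta)
{
	struct link_state cap = lnkcap_decode(lnkcap);
	struct link_state cur = lnksta_decode(lnksta);

	return cur.speed < cap.speed || cur.width < cap.width;
}
```

Sampling this with ASPM enabled and then disabled on the same link would at least show whether the two are correlated.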


If nobody can figure out what's going on, I think we'll have to make it
disabled by default.

I think most distros do "CONFIG_PCIE_BW is not set". Is that not true?

Alex