Re: [PATCH] PCI/LINK: Account for BW notification in vector calculation

From: Alex G
Date: Tue Apr 23 2019 - 12:03:10 EST




On 4/23/19 10:34 AM, Alex Williamson wrote:
> On Tue, 23 Apr 2019 09:33:53 -0500
> Alex G <mr.nuke.me@xxxxxxxxx> wrote:
>
>> On 4/22/19 7:33 PM, Alex Williamson wrote:
>>> On Mon, 22 Apr 2019 19:05:57 -0500
>>> Alex G <mr.nuke.me@xxxxxxxxx> wrote:
>>>> echo 0000:07:00.0:pcie010 |
>>>> sudo tee /sys/bus/pci_express/drivers/pcie_bw_notification/unbind
>>>
>>> That's a bad solution for users, this is meaningless tracking of a
>>> device whose driver is actively managing the link bandwidth for power
>>> purposes.
>>
>> 0.5W savings on a 100+W GPU? I agree it's meaningless.
>
> Evidence? Regardless, I don't have control of the driver that's making
> these changes, but the claim seems unfounded and irrelevant.

The figure of 5mW/Gb/lane doesn't ring a bell? [1] [2] Your GPU supports 5Gb/s, so it is likely built on an older, more power-hungry process, but I suspect the savings are still within the same order of magnitude.
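
A back-of-the-envelope version of that estimate, as a sketch only. The helper name link_power_mw and the x16 link width are assumptions made for illustration; the 5 mW/Gb/lane coefficient is the figure quoted above, not a measurement of any particular device.

```shell
#!/bin/sh
# Rough link power estimate: lanes * per-lane rate (Gb/s) * mW per Gb/s per lane.
# The default coefficient of 5 is the figure quoted in this thread; treat it as
# an order-of-magnitude assumption.
link_power_mw() {
    lanes=$1
    rate=$2
    coeff=${3:-5}
    awk -v l="$lanes" -v r="$rate" -v c="$coeff" \
        'BEGIN { printf "%.0f\n", l * r * c }'
}

# Hypothetical x16 link dropping from 5 Gb/s to 2.5 Gb/s per lane.
full=$(link_power_mw 16 5)
low=$(link_power_mw 16 2.5)
echo "estimated saving: $((full - low)) mW"
```

Under these assumptions the saving comes out at a few hundred milliwatts, which is the same order of magnitude as the 0.5W mentioned earlier.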


> I'm assigning a device to a VM [snip]
> I can see why we might want to be notified of degraded links due to signal issues,
> but what I'm reporting is that there are also entirely normal reasons
> [snip] we can't seem to tell the difference

Unfortunately, there is no way in PCI-Express to distinguish between an expected link bandwidth change and one due to error.

If you're using virt-manager to configure the VM, then virt-manager could have a checkbox to disable link bandwidth management messages. I'd rather we avoid kernel-side heuristics (as Lukas suggested). If you're confident that your link will operate as intended, and don't want messages about it, that's your call as a user -- we shouldn't decide this in the kernel.
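
A userspace opt-out of that kind could be as small as wrapping the unbind command from earlier in the thread. This is a minimal sketch: the function name unbind_bw_notification and the SYSFS_ROOT override are hypothetical, added only so the helper can be dry-run against a scratch directory, and whether virt-manager would do it this way is an assumption.

```shell
#!/bin/sh
# Unbind one port service from the pcie_bw_notification driver.
# SYSFS_ROOT is a hypothetical override for dry runs; on a real system it
# defaults to /sys and the write requires root.
unbind_bw_notification() {
    service=$1
    root=${SYSFS_ROOT:-/sys}
    unbind="$root/bus/pci_express/drivers/pcie_bw_notification/unbind"
    if [ ! -e "$unbind" ]; then
        echo "no pcie_bw_notification driver at $unbind" >&2
        return 1
    fi
    # Writing the service name to the unbind file detaches that one service.
    printf '%s\n' "$service" > "$unbind"
}
```

Run as root, `unbind_bw_notification 0000:07:00.0:pcie010` would silence notifications for that one port only, leaving the decision per device and in the user's hands.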

Alex

[1] https://www.synopsys.com/designware-ip/technical-bulletin/reduce-power-consumption.html