Re: [PATCH] PCI/LINK: Account for BW notification in vector calculation

From: Alex G
Date: Tue Apr 23 2019 - 13:53:14 EST


On 4/23/19 12:10 PM, Bjorn Helgaas wrote:
On Tue, Apr 23, 2019 at 09:33:53AM -0500, Alex G wrote:
On 4/22/19 7:33 PM, Alex Williamson wrote:
There is nothing wrong happening here that needs to fill logs. I
thought maybe if I enabled notification of autonomous bandwidth
changes that it might categorize these as something we could
ignore, but it doesn't. How can we identify only cases where this
is an erroneous/noteworthy situation? Thanks,

You don't. Ethernet doesn't. USB doesn't. This logging behavior is
consistent with every other subsystem that deals with multi-speed links.

Can you point me to the logging in these other subsystems so I can
learn more about how they deal with this?

I don't have any in-depth articles about the logging in these systems, but I can extract some logs from my machines.

Ethernet:

[Sun Apr 21 11:14:06 2019] e1000e: eno1 NIC Link is Down
[Sun Apr 21 11:14:17 2019] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[Sun Apr 21 11:14:23 2019] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[Sun Apr 21 23:33:31 2019] e1000e: eno1 NIC Link is Down
[Sun Apr 21 23:33:43 2019] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[Sun Apr 21 23:33:48 2019] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

I used to have one of these "green" ethernet switches that went down to 100mbps automatically. You can imagine how "clogged" the logs were with link up messages. Thank goodness that switch was killed in a thunderstorm.

USB will log every device insertion and removal, very verbosely (see appendix A).


I agree that emitting log messages for normal and expected events will
lead to user confusion and we need to do something.

e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth
notification") was merged in v5.1-rc1, so we still have (a little)
time to figure this out before v5.1.

I always viewed the system log as a system log, instead of a database of system errors. I may have extremist views, but going back to Alex's example, I prefer to see that the power saving mechanism is doing something to save power on my laptop (I'll just ignore it on a desktop).

If you think increasing code complexity because people don't want things logged into the system log, then I'm certain we can work out some sane solution. It's the same problem we see with GCC, where people want warning messages here, but don't want the same messages there.

Alex


P.S. The pedantic in me points out that one of the examples I gave is a terrible example. ASPM "allows hardware-autonomous, dynamic Link power reduction beyond what is achievable by software-only control" [1].

[1] PCI-Express 3.0 -- 5.4.1. Active State Power Management (ASPM)


Appendix A:

[1618067.987084] usb 1-3.5: new high-speed USB device number 79 using xhci_hcd
[1618068.179914] usb 1-3.5: New USB device found, idVendor=0bda, idProduct=4014, bcdDevice= 0.05
[1618068.179924] usb 1-3.5: New USB device strings: Mfr=3, Product=1, SerialNumber=2
[1618068.179930] usb 1-3.5: Product: USB Audio
[1618068.179936] usb 1-3.5: Manufacturer: Generic
[1618068.179941] usb 1-3.5: SerialNumber: 200901010001
[1618068.280100] usb 1-3.6: new low-speed USB device number 80 using xhci_hcd
[1618068.342541] Bluetooth: hci0: Waiting for firmware download to complete
[1618068.342795] Bluetooth: hci0: Firmware loaded in 1509081 usecs
[1618068.342887] Bluetooth: hci0: Waiting for device to boot
[1618068.354919] Bluetooth: hci0: Device booted in 11797 usecs
[1618068.356006] Bluetooth: hci0: Found Intel DDC parameters: intel/ibt-12-16.ddc
[1618068.358958] Bluetooth: hci0: Applying Intel DDC parameters completed
[1618068.378624] usb 1-3.6: New USB device found, idVendor=04d9, idProduct=1400, bcdDevice= 1.43
[1618068.378626] usb 1-3.6: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[1618068.390686] input: HID 04d9:1400 as /devices/pci0000:00/0000:00:14.0/usb1/1-3/1-3.6/1-3.6:1.0/0003:04D9:1400.0139/input/input921
[1618068.444282] hid-generic 0003:04D9:1400.0139: input,hidraw1: USB HID v1.10 Keyboard [HID 04d9:1400] on usb-0000:00:14.0-3.6/input0
[1618068.456373] input: HID 04d9:1400 Mouse as /devices/pci0000:00/0000:00:14.0/usb1/1-3/1-3.6/1-3.6:1.1/0003:04D9:1400.013A/input/input922
[1618068.457929] input: HID 04d9:1400 Consumer Control as /devices/pci0000:00/0000:00:14.0/usb1/1-3/1-3.6/1-3.6:1.1/0003:04D9:1400.013A/input/input923
[1618068.509294] input: HID 04d9:1400 System Control as /devices/pci0000:00/0000:00:14.0/usb1/1-3/1-3.6/1-3.6:1.1/0003:04D9:1400.013A/input/input924
[1618068.509518] hid-generic 0003:04D9:1400.013A: input,hidraw2: USB HID v1.10 Mouse [HID 04d9:1400] on usb-0000:00:14.0-3.6/input1
[1618068.588078] usb 1-3.7: new full-speed USB device number 81 using xhci_hcd
[1618068.679132] usb 1-3.7: New USB device found, idVendor=046d, idProduct=c52b, bcdDevice=12.03
[1618068.679137] usb 1-3.7: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[1618068.679139] usb 1-3.7: Product: USB Receiver
[1618068.679142] usb 1-3.7: Manufacturer: Logitech
[1618068.692430] logitech-djreceiver 0003:046D:C52B.013D: hiddev96,hidraw3: USB HID v1.11 Device [Logitech USB Receiver] on usb-0000:00:14.0-3.7/input2
[1618068.817334] input: Logitech Performance MX as /devices/pci0000:00/0000:00:14.0/usb1/1-3/1-3.7/1-3.7:1.2/0003:046D:C52B.013D/0003:046D:101A.013E/input/input925
[1618068.820357] logitech-hidpp-device 0003:046D:101A.013E: input,hidraw4: USB HID v1.11 Mouse [Logitech Performance MX] on usb-0000:00:14.0-3.7:1