Re: [PATCH v5 2/2] PCI: Don't assume root ports are power manageable

From: Limonciello, Mario
Date: Mon Jun 05 2023 - 13:36:27 EST


On 6/4/2023 6:40 AM, Rafael J. Wysocki wrote:
On Sat, Jun 3, 2023 at 12:38 AM Limonciello, Mario
<mario.limonciello@xxxxxxx> wrote:

On 6/2/2023 5:20 PM, Bjorn Helgaas wrote:
Hi Mario,

The patch itself looks fine, but since I don't have all the power
management details in my head, it would help me a lot to make the
description more concrete.
OK, please let me know if after reviewing my responses you
would prefer me to take an attempt at rewriting the commit
message or if you can handle changing it.
On Tue, May 30, 2023 at 11:39:47AM -0500, Mario Limonciello wrote:
Using a USB keyboard or mouse to wake up the system from s2idle fails when
that xHCI device is connected to a USB-C port for an AMD USB4 router.
It sounds like the real issue is that "Root Ports in D3hot/D3cold may
not support wakeup", and the USB, xHCI, USB-C, AMD USB4 router bits
are probably not really relevant. And hopefully even the "AMD
platforms" mentioned below is not relevant.
Yeah. It comes down to how much you want in the commit
about how we got to this conclusion versus a generic
fix. I generally like to be verbose about a specific case
something fixes so that when distros decide what to pull
in to their older maintenance kernels they can understand
what's important.
Due to commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
all PCIe ports go into D3 during s2idle.

When specific root ports are put into D3 over s2idle on some AMD platforms
it is not possible for the platform to properly identify wakeup sources.
This happens whether the root port goes into D3hot or D3cold.
Can we connect this to a spec so it's not just the empirical "some AMD
platforms work like X" observation?

"s2idle" is meaningful on the power management side of the house, but
it doesn't appear in PCI or ACPI specs, so I don't know what it means
here. I assume the D3hot/D3cold state of the Root Port is the
critical factor, regardless of how it got there.
Unfortunately (?) for this particular issue it's only a
critical factor when the system is in s2idle.

PME works fine to wake up the device if the root port is
in either D3hot or D3cold when the system isn't in s2idle.
Why doesn't it work fine when the system is in s2idle then?

Getting to the root of this would be really helpful here IMO.
The process of the hardware going into s2idle has a certain
sequence of events by the platform.

This sequence is what causes the PME to not be able to work
during resume.  This issue has been root caused and is
understood by AMD platform designers.

It's why the AML doesn't provide any of those ACPI power
management routines outlined in the ACPI spec.

If the AML is patched to advertise these routines the exact
same issue is reproduced under Windows 11.

Comparing registers between Linux and Windows 11 this behavior to put
these specific root ports into D3 at suspend is unique to Linux. On an
affected system Windows does not put those specific root ports into D3
over Modern Standby.

Windows avoids putting Root Ports that are not power manageable (e.g., do
not have platform firmware support) into low power states.
The Windows behavior was probably useful to you in debugging, but I
don't really care about these Windows details because I don't think
they help us maintain this in the future.
OK.
Linux shouldn't assume root ports support D3 just because they're on a
machine newer than 2015, the ports should also be deemed power manageable.
Add an extra check explicitly for root ports to ensure D3 isn't selected
for them if they are not power-manageable through platform firmware.
But I *would* like to know specifically what "power manageable" means
here. I might naively assume that a device with the PCI Power
Management Capability is "power manageable", and that if PME_Support
includes D3hot and D3cold, we're good to go. But obviously it's more
complicated than that, and I'd like to cite the spec that mentions the
actual things we need here.
Power manageable through platform firmware means the device
has ACPI methods like _PR0, _PS0.
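
For illustration only, here is a minimal user-space sketch of that
definition. The struct and helper names are hypothetical and this is not
kernel code. It assumes that the presence of either _PR0 or _PS0 is enough
to treat a device as power manageable, which is the kind of test that
acpi_bus_get_power_flags() is understood to apply and which could change.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the ACPI objects firmware exposes for a device. */
struct fw_methods {
	bool has_pr0;	/* _PR0: power resources the device needs in D0 */
	bool has_ps0;	/* _PS0: control method that puts the device into D0 */
};

/*
 * Assumption: a device counts as "power manageable through platform
 * firmware" if firmware provides at least one of _PR0 or _PS0.
 */
static bool fw_power_manageable(const struct fw_methods *fw)
{
	return fw != NULL && (fw->has_pr0 || fw->has_ps0);
}
```

Under this model, a root port whose AML provides neither object is not
power manageable, no matter what its PCI Power Management Capability
advertises.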
Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
Reported-by: Iain Lane <iain@xxxxxxxxxxxxxxxxxxx>
Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121
Acked-by: Rafael J. Wysocki <rafael@xxxxxxxxxx>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
Reviewed-by: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx>
---
v4->v5:
* Add tags
* Fix title
* Adjust commit message
v3->v4:
* Move after refactor
---
drivers/pci/pci.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d1fa040bcea7..d293db963327 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3015,6 +3015,14 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
if (dmi_check_system(bridge_d3_blacklist))
return false;

+ /*
+ * It's not safe to put root ports that don't support power
+ * management into D3.
I assume "it's not safe" really means "Root Ports in D3hot/D3cold may
not be able to signal PME interrupts unless ... <mumble> platform
firmware <mumble> e.g., ACPI method <mumble> ..."

Can we include some of those hints here?
I'm cautious about hardcoding logic used by
acpi_bus_get_power_flags() in a comment in case it changes.

How about:

"Root ports in D3 may not be able to reliably signal wakeup
events unless platform firmware signals power management
capabilities".
I would rather write "unless they can be power-managed with the help
of the platform firmware".

The meaning of "signaling" is unclear in this context and even if it
was clear, the platform firmware support actually needs to be used
here, its mere existence is not sufficient AFAICS.
OK thanks!
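
To make the outcome of this sub-thread concrete, here is a hedged,
self-contained model of the check under discussion, carrying the comment
wording suggested above. It is a user-space sketch, not the hunk from the
patch. The enum, struct and function names are hypothetical, and it models
only the new root port condition, not the other checks in
pci_bridge_d3_possible().

```c
#include <stdbool.h>

/* Hypothetical, simplified port types; not the kernel's definitions. */
enum port_type { PORT_ROOT, PORT_UPSTREAM, PORT_DOWNSTREAM };

/* Hypothetical model of the two facts the new check depends on. */
struct bridge_model {
	enum port_type type;
	bool fw_power_manageable;	/* e.g. firmware provides _PR0/_PS0 */
};

static bool root_port_d3_allowed(const struct bridge_model *b)
{
	/*
	 * Root ports in D3 may not be able to reliably signal wakeup
	 * events unless they can be power-managed with the help of the
	 * platform firmware.
	 */
	if (b->type == PORT_ROOT && !b->fw_power_manageable)
		return false;

	/* Other port types are left to whatever checks already exist. */
	return true;
}
```

In this model a root port without firmware power management is kept out of
D3, while upstream and downstream ports are unaffected by the new
condition.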