Re: [PATCH v14.a 1/1] PCI: Only put Intel PCIe ports >= 2015 into D3

From: Bjorn Helgaas
Date: Wed Aug 23 2023 - 07:46:53 EST


On Wed, Aug 23, 2023 at 07:04:53AM +0200, Lukas Wunner wrote:
> On Tue, Aug 22, 2023 at 07:02:43PM -0500, Bjorn Helgaas wrote:
> > On Tue, Aug 22, 2023 at 12:11:10PM +0200, Rafael J. Wysocki wrote:
> > > What we need to deal with here is basically non-compliant systems and
> > > so we have to catch the various forms of non-compliance.
> >
> > Thanks for this, that helps. If pci_bridge_d3_possible() is a list of
> > quirks for systems that are known to be broken (or at least not known
> > to work correctly and avoiding D3 is acceptable), then we should
> > document and use it that way.
> >
> > The current documentation ("checks if it is possible to move to D3")
> > frames it as "does the bridge have the required features?" instead of
> > "do we know about something broken in this bridge or this platform?"
> >
> > If something is broken, I would expect tests based on the device or
> > a DMI check. But several are not obvious defects. E.g.,
> > "bridge->is_hotplug_bridge && !pciehp_is_native(bridge)" -- what
> > defect are we finding there? What does the spec require that isn't
> > happening?
>
> This particular check doesn't pertain to a defect, but indeed
> follows from the spec:
>
> If hotplug control wasn't granted to the OS, the OS shall not put
> the hotplug port in D3 behind firmware's back because the power state
> affects accessibility of devices downstream of the hotplug port.
>
> Put another way, the firmware expects to have control of hotplug
> and hotplug may break if the OS fiddles with the power state of the
> hotplug port.
>
> Here's a bugzilla where this caused issues:
> https://bugzilla.kernel.org/show_bug.cgi?id=53811
>
> On the other hand Thunderbolt hotplug ports are required to runtime
> suspend to D3 in order to save power.

Sounds like there may be a requirement in a Thunderbolt spec about
this, so maybe we could add that citation? I guess this goes with the
"bridge->is_thunderbolt" check?

> On Macs they're always handled
> natively by the OS. Hence the code comment.

And I guess this goes with the "System Management Mode" and
"Thunderbolt on non-Macs" comments? A citation to the source behind
"OS shall not put the hotplug port in D3 behind firmware's back" would
be super helpful here.

> A somewhat longer explanation I gave in 2016:
> https://lore.kernel.org/all/20160617213209.GA1927@xxxxxxxxx/
>
> Perhaps the code comment preceding that check can be rephrased to
> convey its meaning more clearly...

Thanks! I think it would be worth trying to separate out the "normal"
things that correspond to the spec from the "quirk" things that work
around defects. That's not material for *this* patch, though.
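
To make that split concrete, here is a rough userspace sketch, not kernel
code. struct bridge_model, the known_broken flag, and all function names
are made up for illustration. hotplug_native stands in for what
pciehp_is_native(bridge) reports, and known_broken stands in for whatever
device or DMI test a real quirk would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a bridge; NOT the kernel's struct pci_dev. */
struct bridge_model {
	bool is_hotplug_bridge;
	bool hotplug_native;	/* stand-in for pciehp_is_native(bridge) */
	bool known_broken;	/* stand-in for a device or DMI quirk match */
};

/* Why D3 was refused, so callers can tell a "normal" rule from a quirk. */
enum d3_verdict {
	D3_OK,
	D3_NO_FIRMWARE_HOTPLUG,	/* "normal": firmware owns hotplug */
	D3_NO_QUIRK,		/* "quirk": known defect */
};

/* Normal rule: hotplug control was not granted to the OS. */
static bool firmware_owns_hotplug(const struct bridge_model *b)
{
	return b->is_hotplug_bridge && !b->hotplug_native;
}

/* Quirk rule: something is known to be broken on this device/platform. */
static bool quirk_blocks_d3(const struct bridge_model *b)
{
	return b->known_broken;
}

enum d3_verdict bridge_d3_verdict(const struct bridge_model *b)
{
	assert(b != NULL);

	if (firmware_owns_hotplug(b))
		return D3_NO_FIRMWARE_HOTPLUG;
	if (quirk_blocks_d3(b))
		return D3_NO_QUIRK;
	return D3_OK;
}
```

Returning a reason instead of a bare bool is only one way to keep the two
kinds of checks apart; separate helper functions alone might be enough.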

It's also a little weird that pci_bridge_d3_possible() itself looks
like it's invariant for the life of the system, but we call it several
times (pci_pm_init(), pci_bridge_d3_update(), pcie_portdrv_probe(),
etc.). I guess this is because we save the result in dev->bridge_d3,
but then pci_bridge_d3_update() updates dev->bridge_d3 based on other
things, so the original value is lost. Maybe another bit or two could
avoid those extra calls.
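
Something along these lines, again as a userspace toy and not a proposal
for the actual struct pci_dev layout. The d3_possible and
d3_possible_valid bits, the platform_allows input, and every function
name here are hypothetical. The point is only that the invariant answer
is computed once and kept separate from the value that changes later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; field names are illustrative only. */
struct bridge_cache_model {
	bool platform_allows;	/* stand-in input for the invariant checks */
	bool d3_possible;	/* extra bit: cached invariant answer */
	bool d3_possible_valid;	/* extra bit: set once the answer is cached */
	bool bridge_d3;		/* current decision, may change over time */
};

/* Counts how often the invariant checks actually run. */
static unsigned int invariant_check_calls;

static bool compute_d3_possible(const struct bridge_cache_model *b)
{
	invariant_check_calls++;
	return b->platform_allows;
}

/* Run the invariant checks on first use, then reuse the cached bit. */
bool bridge_d3_possible_cached(struct bridge_cache_model *b)
{
	assert(b != NULL);

	if (!b->d3_possible_valid) {
		b->d3_possible = compute_d3_possible(b);
		b->d3_possible_valid = true;
	}
	return b->d3_possible;
}

/* Recompute the changing value without losing the invariant one. */
void bridge_d3_update(struct bridge_cache_model *b, bool children_allow_d3)
{
	b->bridge_d3 = bridge_d3_possible_cached(b) && children_allow_d3;
}

unsigned int invariant_check_count(void)
{
	return invariant_check_calls;
}
```

With this shape, repeated updates flip bridge_d3 as the other conditions
change, while the invariant checks run only once per bridge.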

Bjorn