Re: [PATCH] PCI/portdrv: Disallow runtime suspend when waekup is required but PME service isn't supported

From: Kai-Heng Feng
Date: Thu Aug 12 2021 - 01:20:22 EST


On Wed, Aug 11, 2021 at 3:11 PM Lukas Wunner <lukas@xxxxxxxxx> wrote:
>
> On Wed, Aug 11, 2021 at 01:06:27PM +0800, Kai-Heng Feng wrote:
> > On Wed, Aug 11, 2021 at 12:21 AM Lukas Wunner <lukas@xxxxxxxxx> wrote:
> > >
> > > On Tue, Aug 10, 2021 at 11:37:12PM +0800, Kai-Heng Feng wrote:
> > > I honestly don't know. I was just wondering whether it is okay
> > > to enable PME on devices if control is not granted by the firmware.
> > > The spec is fairly vague. But I guess the idea is that enabling PME
> > > on devices is correct, just handling the interrupts is done by firmware
> > > instead of the OS.
> >
> > Does this imply that current ACPI doesn't handle this part?
>
> Apparently not, according to the "lspci-bridge-after-hotplug" you've
> attached to the bugzilla, the PME Interrupt Enable bit wasn't set in
> the Root Control register. The kernel doesn't register an IRQ handler
> for PME because firmware doesn't grant it control, so it's firmware's
> job to enable and handle the IRQ (or poll the relevant register or
> whatever).
>
> RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
> ^^^^^^^^^^

OK, I'll send a patch that checks this flag for PME capability.

>
> > The Windows approach is to make the entire hierarchy stays at D0, I
> > think maybe it's a better way than relying on PME polling.
>
> Including the endpoint device, i.e. the NIC?

Yes, including the endpoint device.

>
>
> > > If you do want to change core code, I'd suggest modifying
> > > pci_dev_check_d3cold() so that it blocks runtime PM on upstream
> > > bridges if PME is not handled natively AND firmware failed to enable
> > > the PME interrupt at the root port. The rationale is that upstream
> > > bridges need to remain in D0 so that PME polling is possible.
> >
> > How do I know that firmware failed to enable PME IRQ?
>
> Check whether PCI_EXP_RTCTL_PMEIE was set by firmware in the Root Control
> register.

I originally thought there can be a special ACPI method to query this info.

>
>
> > > An alternative would be a quirk for this specific laptop which clears
> > > pdev->pme_support.
> >
> > This won't scale, because many models are affected.
>
> We already have quirks which clear pdev->pme_support, e.g.
> pci_fixup_no_d0_pme() and pci_fixup_no_msi_no_pme().
> Perhaps something like that would be appropriate here.

OK, I'll take this approach.

Kai-Heng

>
> Thanks,
>
> Lukas