Re: [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling

From: Lukas Wunner
Date: Mon Jan 22 2024 - 17:18:17 EST


On Thu, Jan 18, 2024 at 11:50:49AM -0700, Alex Williamson wrote:
> On Thu, 3 Aug 2023 11:12:33 -0600 Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
> > Testing that a device is not currently in a low power state provides no
> > guarantees that the device is not imminently transitioning to such a state.
> > We need to increment the PM usage counter before accessing the device.
> > Since we don't wish to wake the device for PME polling, do so only if the
> > device is already active by using pm_runtime_get_if_active().
> >
> > Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > ---
> > drivers/pci/pci.c | 23 ++++++++++++++++-------
> > 1 file changed, 16 insertions(+), 7 deletions(-)
>
> Resurrecting this patch (currently commit d3fcd7360338) for discussion
> as it's been identified as the source of a regression in:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=218360
>
> Copying Mika, Lukas, and Rafael as it's related to:
>
> 000dd5316e1c ("PCI: Do not poll for PME if the device is in D3cold")
>
> where we skip devices in D3cold when processing the PME list.
>
> I think the issue in the above bz is that the downstream TB3/USB4 port
> is in D3 (presumably D3hot) and I therefore infer the device is in state
> RPM_SUSPENDED. This commit is attempting to make sure the device power
> state is stable across the call such that it does not transition into
> D3cold while we're accessing it.
>
> To do that I used pm_runtime_get_if_active(), but in retrospect this
> requires the device to be in RPM_ACTIVE so we end up skipping anything
> suspended or transitioning.

How about dropping the calls to pm_runtime_get_if_active() and
pm_runtime_put() and instead simply do:

	if (pm_runtime_suspended(&pdev->dev) &&
	    pdev->current_state != PCI_D3cold)
		pci_pme_wakeup(pdev, NULL);
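
To illustrate the difference, here is a rough userspace sketch (not kernel
code) comparing which devices get polled under each check. Everything in it
is a hypothetical stand-in: struct mock_dev, the two polls_with_*() helpers
and print_poll_table() only mimic the names used in this thread, and the
first helper models the get_if_active() approach as described in the quoted
text (poll only RPM_ACTIVE devices), ignoring bridge state and the case
where runtime PM is disabled.

```c
/*
 * Userspace model, not kernel code: the types and helpers here are
 * hypothetical stand-ins that only mimic the names used in the thread.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

enum rpm_status { RPM_ACTIVE, RPM_RESUMING, RPM_SUSPENDED, RPM_SUSPENDING };
enum pci_power { PCI_D0, PCI_D3hot, PCI_D3cold };

struct mock_dev {
	enum rpm_status rpm;
	enum pci_power current_state;
};

/* Models the get_if_active() approach: poll only RPM_ACTIVE devices. */
static bool polls_with_get_if_active(const struct mock_dev *d)
{
	return d->rpm == RPM_ACTIVE && d->current_state != PCI_D3cold;
}

/* Models the check proposed above: suspended, but not in D3cold. */
static bool polls_with_suspended_check(const struct mock_dev *d)
{
	return d->rpm == RPM_SUSPENDED && d->current_state != PCI_D3cold;
}

/* Print, for a few sample devices, which of the two checks would poll. */
static void print_poll_table(void)
{
	static const struct {
		const char *name;
		struct mock_dev dev;
	} cases[] = {
		{ "active, D0",        { RPM_ACTIVE,    PCI_D0 } },
		{ "suspended, D3hot",  { RPM_SUSPENDED, PCI_D3hot } },
		{ "suspended, D3cold", { RPM_SUSPENDED, PCI_D3cold } },
	};

	for (size_t i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
		bool a = polls_with_get_if_active(&cases[i].dev);
		bool b = polls_with_suspended_check(&cases[i].dev);

		/* Neither model may touch a device in D3cold. */
		if (cases[i].dev.current_state == PCI_D3cold)
			assert(!a && !b);

		printf("%-18s get_if_active=%d suspended_check=%d\n",
		       cases[i].name, a, b);
	}
}
```

Calling print_poll_table() from a trivial main() shows the suspended D3hot
case from the bugzilla being skipped by the first check and polled by the
second, at least in this simplified model.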

Thanks,

Lukas