Re: [PATCH v2] PCI: Call _REG when saving/restoring PCI state

From: Bjorn Helgaas
Date: Tue Jun 06 2023 - 15:58:59 EST


On Tue, Jun 06, 2023 at 02:40:45PM -0500, Limonciello, Mario wrote:
> On 6/6/2023 2:23 PM, Bjorn Helgaas wrote:
> > On Tue, Jun 06, 2023 at 11:23:21AM -0500, Mario Limonciello wrote:
> > > ASMedia PCIe GPIO controllers fail functional tests after returning from
> > > suspend (S3 or s2idle). This is because the BIOS checks whether the
> > > OSPM has called the `_REG` method to determine whether it can interact with
> > > the OperationRegion assigned to the device.
> > >
> > > As described in 6.5.4 in the APCI spec, `_REG` is used to inform the AML
> > > code on the availability of an operation region.
> > >
> > > To fix this issue, call acpi_evaluate_reg() when saving and restoring the
> > > state of PCI devices.
> > >
> > > Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region
> > > Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx>
> > > ---
> > > v1->v2:
> > > * Handle case of no CONFIG_ACPI
> > > * Rename function
> > > * Update commit message
> > > * Move ACPI calling code into pci-acpi.c instead
> > > * Cite the ACPI spec
> >
> > Thanks for the spec reference (s/APCI/ACPI/ and add the revision if
> > you rev this (r6.5 is the latest, AFAIK) if you rev this).
> >
> > I don't see text in that section that connects S3 with _REG. If it's
> > there, you might have to quote the relevant sentence or two in the
> > commit log.
>
> I don't think there is anything the spec connecting this
> with S3.  At least from my perspective S3 is the reason
> this was exposed but there is a deficiency that exists
> that _REG is not being called by Linux.
>
> I intend to re-word the commit message something to the
> effect of explaining what _REG does and why _REG should be
> called, along with citations.
>
> Then in another paragraph "Fixing this resolves an issue ...".
>
> > You mentioned _REG being sort of a mutex to synchronize OSPM vs
> > platform access; if there's spec language to that effect, let's cite
> > it.
>
> That sentence I included was cited from the spec.

If it's necessary to justify the commit, include the citation in the
commit log.

> > Ideally we should have been able to read the PCI and ACPI specs and
> > implement this without tripping over problem on this particular
> > hardware. I'm looking for the text that enables that "clean-room"
> > implementation. If the spec doesn't have that text, it's either a
> > hole in the spec or a BIOS defect that depends on something the spec
> > doesn't require.
>
> IMO both the spec and BIOS are correct, it's a Linux
> issue that _REG wasn't used.

What tells Linux that _REG needs to be used here? If there's nothing
that tells Linux to use _REG here, I claim it's a BIOS defect. I'm
happy to be convinced otherwise; the way to convince me is to point to
the spec.

If it's a BIOS defect, it's fine to work around it, but we need to
understand that, own up to it, and make the exact requirements very
clear. Otherwise we're likely to break this in the future because
future developers and maintainers will rely on the specs.

> > Doing this in pci_save_state() still seems wrong to me. For example,
> > e1000_probe() calls pci_save_state(), but this is not part of suspend.
> > IIUC, this patch will disconnect the opregion when we probe an e1000
> > NIC. Is that what you intend?
>
> Thanks for pointing this one out.  I was narrowly focused
> on callers in PCI core.  This was a caller I wasn't
> aware of; I agree it doesn't make sense.
>
> I think pci_set_power_state() might be another good
> candidate to use.  What do you think of this?

I can't suggest a call site because (1) I'm not a power management
person, and (2) I don't think we have a clear statement of when it is
required. This must be expressed in terms of PCI power state
transitions, or at least something discoverable from a pci_dev, not
"s2idle" or even "S3" because those are meaningless in the PCI
context.

Bjorn