Re: [PATCH v9 7/8] PCI/DPC: Clear AER registers in EDR mode

From: Kuppuswamy Sathyanarayanan
Date: Tue Oct 29 2019 - 19:39:58 EST



On 10/29/19 3:48 PM, Bjorn Helgaas wrote:
On Tue, Oct 29, 2019 at 01:04:29PM -0700, Kuppuswamy Sathyanarayanan wrote:
On 10/28/19 4:27 PM, Bjorn Helgaas wrote:
On Thu, Oct 03, 2019 at 04:39:03PM -0700, sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx wrote:
From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>

As per PCI firmware specification r3.2 Downstream Port Containment
Related Enhancements ECN,
Specific reference, please, e.g., the section/table/figure of the PCI
Firmware Spec being modified by the ECN.
Ok. I will include it.
OS is responsible for clearing the AER
registers in EDR mode. So clear AER registers in dpc_process_error()
function.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
Acked-by: Keith Busch <keith.busch@xxxxxxxxx>
---
drivers/pci/pcie/dpc.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index fafc55c00fe0..de2d892bc7c4 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -275,6 +275,10 @@ static void dpc_process_error(struct dpc_dev *dpc)
pci_aer_clear_fatal_status(pdev);
}
+ /* In EDR mode, OS is responsible for clearing AER registers */
+ if (dpc->firmware_dpc)
I guess "EDR mode" is effectively the same as "firmware-first mode"?
Yes, EDR mode is an upgrade to FF mode in which firmware allows OS
to share some of it job by sending ACPI notification. If you don't
get ACPI notification, EDR mode is effectively same as FF mode.
May be I can add some documentation in code to explain the EDR mode better.
Hmm, somehow the connection between FF and EDR needs to be clear from
the code, so people who weren't involved in the development of EDR and
don't even have access to the specs/ECNs can make sense out of this.

At least, the only place we set "firmware_dpc = 1" is:

+ if (pcie_aer_get_firmware_first(pdev))
+ dpc->firmware_dpc = 1;

If they're the same, why do we need two different names for it?
For better readability and performance, I tried to cache the value of
pcie_aer_get_firmware_first() result in DPC driver.
pcie_aer_get_firmware_first() already caches the value, so I don't
think you're gaining any useful performance here, and having two
different names *decreases* readability.
Ok. I can replace "firmware_dpc" with pcie_aer_get_firmware_first() calls.

I do agree that pcie_aer_get_firmware_first() is not optimally
implemented. I think we should probably look up the firmware-first
indication explicitly during enumeration so we don't have to bother
with the dev->__aer_firmware_first_valid thing. And if we got rid of
all those leading underscores, it would probably run faster, too ;)
I agree that pcie_aer_get_firmware_first() can be optimized. I can submit a
patch for it once this patch set is merged.

+ pci_cleanup_aer_error_status_regs(pdev);
+
/*
* Irrespective of whether the DPC event is triggered by
* ERR_FATAL or ERR_NONFATAL, since the link is already down,
--
2.21.0

--
Sathyanarayanan Kuppuswamy
Linux kernel developer

--
Sathyanarayanan Kuppuswamy
Linux kernel developer