Re: [PATCH V3] AER: aer_root_reset() non-native handling

From: Bjorn Helgaas
Date: Sat Nov 14 2020 - 10:37:22 EST


On Wed, Nov 04, 2020 at 03:22:44PM -0800, Sean V Kelley wrote:
> If an OS has not been granted AER control via _OSC, then
> the OS should not make changes to PCI_ERR_ROOT_COMMAND and
> PCI_ERR_ROOT_STATUS related registers. Per section 4.5.1 of
> the System Firmware Intermediary (SFI) _OSC and DPC Updates
> ECN [1], this bit also covers these aspects of the PCI
> Express Advanced Error Reporting. Based on the above and
> earlier discussion [2], make the following changes:
>
> Add a check for the native case (i.e., AER control via _OSC)
>
> Note that the current "clear, reset, enable" order suggests that the
> reset might cause errors that we should ignore. Lacking historical
> context, these will be retained.
>
> [1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
> 2020, affecting PCI Firmware Specification, Rev. 3.2
> https://members.pcisig.com/wg/PCI-SIG/document/14076
> [2] https://lore.kernel.org/linux-pci/20201020162820.GA370938@bjorn-Precision-5520/
>
> Signed-off-by: Sean V Kelley <sean.v.kelley@xxxxxxxxx>

What do I do with this patch in combination with "[PATCH v10 00/16]
Add RCEC handling to PCI/AER"? I tried applying this and the RCEC
series on top, but they conflict.

I was thinking it would be easiest to include this as the first patch
in the series so I wouldn't have to resolve the conflicts, but maybe
you had something else in mind?

> ---
> Changes since V2 :
>
> Fixed an unfortunate copy/paste error.
>
> Changes since V1 [1]:
>
> Noted lack of historical context on isolation of both the
> pci_bus_error_reset() and the clearing of Root Error Status. In fact,
> the call to aer_enable_rootport() likewise disables system error generation
> in response to error messages around the clearing of the error status. So
> retained the wrapping of the "clear, reset, enable".
>
> [1] https://lore.kernel.org/linux-pci/20201030223443.165783-1-sean.v.kelley@xxxxxxxxx/
>
> Thanks,
>
> Sean
> ---
> drivers/pci/pcie/aer.c | 31 +++++++++++++++++--------------
> 1 file changed, 17 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 65dff5f3457a..4ab379fa1506 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1358,26 +1358,29 @@ static int aer_probe(struct pcie_device *dev)
> static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
> {
> int aer = dev->aer_cap;
> + int rc = 0;

Unnecessary init, I think.

> u32 reg32;
> - int rc;
> -
>
> - /* Disable Root's interrupt in response to error messages */
> - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
> - reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
> - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> + if (pcie_aer_is_native(dev)) {
> + /* Disable Root's interrupt in response to error messages */
> + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
> + reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
> + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> + }
>
> rc = pci_bus_error_reset(dev);
> - pci_info(dev, "Root Port link has been reset\n");
> + pci_info(dev, "Root Port link has been reset (%d)\n", rc);
>
> - /* Clear Root Error Status */
> - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, &reg32);
> - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);
> + if (pcie_aer_is_native(dev)) {
> + /* Clear Root Error Status */
> + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, &reg32);
> + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_STATUS, reg32);
>
> - /* Enable Root Port's interrupt in response to error messages */
> - pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
> - reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
> - pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> + /* Enable Root Port's interrupt in response to error messages */
> + pci_read_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, &reg32);
> + reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
> + pci_write_config_dword(dev, aer + PCI_ERR_ROOT_COMMAND, reg32);
> + }
>
> return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
> }
> --
> 2.29.2
>