Re: [PATCH v2 0/5] Parse the PCIe AER and set to relevant registers

From: LeoLiuoc
Date: Wed Apr 12 2023 - 05:26:08 EST




在 2023/4/8 7:18, Bjorn Helgaas 写道:
[+cc Sathy, Ming, since they commented on the previous version]

On Tue, Nov 15, 2022 at 11:11:15AM +0800, LeoLiu-oc wrote:
From: leoliu-oc <leoliu-oc@xxxxxxxxxxx>

According to the sec 18.3.2.4, 18.3.2.5 and 18.3.2.6 in ACPI r6.5, the
register values form HEST PCI Express AER Structure should be written to
relevant PCIe Device's AER Capabilities. So the purpose of the patch set
is to extract register values from HEST PCI Express AER structures and
program them into AER Capabilities. Refer to the ACPI Spec r6.5 for a more
detailed description.

I wasn't involved in this part of the ACPI spec, and I don't
understand how this is intended to work.

I see that this series extracts AER mask, severity, and control
information from the ACPI HEST table and uses it to configure PCIe
devices as they are enumerated.

What I don't understand is how this relates to ownership of the AER
capability as negotiated by the _OSC method. Firmware can configure
the AER capability itself, and if it retains control of the AER
capability, the OS can't write to it (with the exception of clearing
EDR error status), so this wouldn't be necessary.

There is no relationship between the ownership of the AER related register and the ownership of the AER capability in the OS or Firmware. The processing here is to initialize the AER related register, not the AER event. If Firmware is configured with AER register, it will not be able to handle the runtime hot reset and link retrain cases in addition to the hotplug case you mentioned below.


If the OS owns the AER capability, I assume it gets to decide for
itself how to configure AER, no matter what the ACPI HEST says.


What information does the OS use to decide how to configure AER? The ACPI Spec has the following description: PCI Express (PCIe) root ports may implement PCIe Advanced Error Reporting (AER) support. This table(HEST) contains information platform firmware supplies to OSPM for configuring AER support on a given root port. We understand that HEST stands for user to express expectations.

In the current implementation, the OS already configures a PCIE device based on _HPP/_HPX method when configuring a PCI device inserted into a hot-plug slot or initial configuration of a PCI device at system boot. HEST is just another way to express the desired configuration of the user.

Yours sincerely,
Leoliu-oc

Maybe this is intended for the case where firmware retains AER
ownership but the OS uses native hotplug (pciehp), and this is a way
for the OS to configure new devices as the firmware expects? But in
that case, we still have the problem that the OS can't write to the
AER capability to do this configuration.

Bjorn