Re: [PATCH v2 1/3] PCI/AER: Fix incorrect return from aer_hest_parse()

From: Bjorn Helgaas
Date: Tue Jun 04 2013 - 13:39:28 EST


On Mon, Jun 3, 2013 at 3:18 PM, Betty Dall <betty.dall@xxxxxx> wrote:
> On Sat, 2013-06-01 at 18:38 -0600, Bjorn Helgaas wrote:
>> [+cc Bob for ACPI HEST spec questions]
>>
>> On Thu, May 30, 2013 at 8:39 AM, Betty Dall <betty.dall@xxxxxx> wrote:
>> > The function aer_hest_parse() is called to determine if the given
>> > PCI device is firmware first or not. The code loops through each
>> > section of the HEST table to look for a match. The bug is that
>> > the function always returns whether the last HEST section is firmware
>> > first. The fix stops the iteration once the info.firmware_first
>> > variable is set. This is similar to how the function aer_hest_parse_aff()
>> > stops the iteration.
>> >
>> > Signed-off-by: Betty Dall <betty.dall@xxxxxx>
>> > ---
>> >
>> > drivers/pci/pcie/aer/aerdrv_acpi.c | 3 +++
>> > 1 files changed, 3 insertions(+), 0 deletions(-)
>> >
>> >
>> > diff --git a/drivers/pci/pcie/aer/aerdrv_acpi.c b/drivers/pci/pcie/aer/aerdrv_acpi.c
>> > index 5194a7d..39b8671 100644
>> > --- a/drivers/pci/pcie/aer/aerdrv_acpi.c
>> > +++ b/drivers/pci/pcie/aer/aerdrv_acpi.c
>> > @@ -42,6 +42,9 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data)
>> >  	u8 bridge = 0;
>> >  	int ff = 0;
>> >
>> > +	if (info->firmware_first)
>> > +		return 0;
>> > +
>> >  	switch (hest_hdr->type) {
>> >  	case ACPI_HEST_TYPE_AER_ROOT_PORT:
>> >  		pcie_type = PCI_EXP_TYPE_ROOT_PORT;
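
For anyone following the changelog above, here is a minimal user-space
sketch of the "last section wins" behaviour and of the early return that
fixes it. This is not the kernel code: struct hest_entry, struct
parse_info, parse_last_wins(), parse_sticky() and lookup() are all
hypothetical names invented for illustration, and matching is reduced to
a plain bus/device/function comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one HEST PCIe section. */
struct hest_entry {
	unsigned char bus, dev, fn;
	int firmware_first;
};

/* Hypothetical lookup state, loosely modelled on the "info" argument. */
struct parse_info {
	unsigned char bus, dev, fn;	/* device being looked up */
	int firmware_first;		/* result of the walk */
};

static int entry_matches(const struct hest_entry *e,
			 const struct parse_info *info)
{
	return e->bus == info->bus && e->dev == info->dev &&
	       e->fn == info->fn;
}

/* Old behaviour: every section overwrites the result, so the last wins. */
static int parse_last_wins(const struct hest_entry *e, struct parse_info *info)
{
	int ff = 0;

	if (entry_matches(e, info))
		ff = e->firmware_first;
	info->firmware_first = ff;
	return 0;
}

/* Patched behaviour: once the flag is set, later sections are ignored. */
static int parse_sticky(const struct hest_entry *e, struct parse_info *info)
{
	if (info->firmware_first)
		return 0;
	return parse_last_wins(e, info);
}

typedef int (*parse_fn)(const struct hest_entry *, struct parse_info *);

/* Call the callback on every section, as a table iterator would. */
static int lookup(const struct hest_entry *tbl, size_t n, parse_fn fn,
		  unsigned char bus, unsigned char dev, unsigned char func)
{
	struct parse_info info = { bus, dev, func, 0 };
	size_t i;

	assert(tbl != NULL);
	for (i = 0; i < n; i++)
		fn(&tbl[i], &info);
	return info.firmware_first;
}
```

With a two-section table whose first section marks a device
firmware-first and whose second section describes a different device,
parse_last_wins() loses the match while parse_sticky() keeps it.
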
>>
>> Not related directly to your patch, Betty, but I can't figure out why
>> the ACPI spec defines the HEST structures for PCIe as it does. I'm
>> looking at ACPI 5.0, sec 18.3.2.3 - 18.3.2.5.
>
>> 1) The PCIe Root Port, PCIe Device, and PCIe/PCI-X Bridge structures
>> all include Bus, Device, and Function fields. But there's no Segment.
>> The current Linux code (hest_match_pci()) assumes HEST records can
>> only apply to PCI domain 0. Is Linux missing something, or is the
>> HEST really this limited?
> You are right that the HEST table does not have the Segment for the PCIe
> sources. The Linux code uses the Generic Source type that points to a
> UEFI CPER record. Those do have the Segment. The code in
> acpi/apei/ghes.c that parses the HEST and invokes the
> aer_recover_queue() is using the segment from the CPER record.

My question has nothing to do with CPER. Drivers can enable error
reporting for their devices with pci_enable_pcie_error_reporting().
If the device is marked "firmware-first" in the HEST, this call fails
without enabling reporting. The HEST can only mark devices in domain
0 as "firmware-first." Therefore pci_enable_pcie_error_reporting()
will enable reporting for:

- domain 0 devices not marked "firmware-first" and
- all devices in other domains.

This inconsistency seems like a hole in the spec, but maybe I'm
missing something.
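
To make the gap concrete, here is a small user-space model of the
behaviour described above. It is only a sketch under stated assumptions,
not kernel code: struct dev_addr, struct hest_src, src_matches(),
is_firmware_first() and enable_reporting() are hypothetical names, and
the -1 return merely stands in for "the enable call fails".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device address, including the PCI domain (segment). */
struct dev_addr {
	unsigned short domain;
	unsigned char bus, dev, fn;
};

/* Hypothetical HEST PCIe source: note that it has no segment field. */
struct hest_src {
	unsigned char bus, dev, fn;
	int firmware_first;
};

/* Mirrors the domain-0 assumption described for hest_match_pci(). */
static int src_matches(const struct hest_src *s, const struct dev_addr *d)
{
	return d->domain == 0 && s->bus == d->bus &&
	       s->dev == d->dev && s->fn == d->fn;
}

/* A device is firmware-first if any matching source marks it so. */
static int is_firmware_first(const struct hest_src *tbl, size_t n,
			     const struct dev_addr *d)
{
	size_t i;

	assert(tbl != NULL && d != NULL);
	for (i = 0; i < n; i++)
		if (src_matches(&tbl[i], d) && tbl[i].firmware_first)
			return 1;
	return 0;
}

/* Stand-in for the enable call: 0 on success, -1 if firmware owns it. */
static int enable_reporting(const struct hest_src *tbl, size_t n,
			    const struct dev_addr *d)
{
	return is_firmware_first(tbl, n, d) ? -1 : 0;
}
```

In this model a source for 00:01.0 blocks enabling on 0000:00:01.0 but
not on 0001:00:01.0, even though both have the same bus/device/function.
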

Bjorn