Re: [PATCH 1/3] PCI/AER: Option to leave System Error Interrupts as-is

From: Keith Busch
Date: Fri Nov 02 2018 - 12:37:05 EST


On Fri, Nov 02, 2018 at 05:26:23PM +0100, Borislav Petkov wrote:
> On Fri, Nov 02, 2018 at 10:17:30AM -0600, Keith Busch wrote:
> > VMD acts a bit like a host-bus adapter. The firmware knows about the
> > adapter, but not about anything on the bus that it attaches to.
> >
> > This "hybrid" approach is basically saying that the firmware knows about
> > the HBA, and it wants a chance to be notified of errors on the bus it
> > attaches to, but the firmware can't do anything about such errors.
> >
> > The bus in this case is PCIe, where we have capable error handling in the
> > kernel driver, so we ultimately want the AER driver handling the errors.
>
> Not a problem - GHES already knows about AER and calls into it for
> CPER_SEC_PCIE errors:
>
> ghes_do_proc
> -> ghes_handle_aer
> |-> aer_recover_queue

That requires that firmware know about the PCIe domain that experienced
an error so that it can provide an appropriate CPER. That wouldn't be
possible for errors occurring within a VMD domain.
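
To make that concrete, here is a rough user-space sketch of the
constraint, not kernel code. Every name in it (toy_cper_pcie,
fw_known_domains, fw_knows_domain, fw_build_cper) is hypothetical, and
the domain value 0x10000 is only an assumed example of a domain number
that firmware never enumerated. The point is just that the firmware-first
path can hand over a record only for a domain it can name:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are kernel or firmware APIs. */
struct toy_cper_pcie {
	int domain;		/* PCI segment the firmware would report */
	unsigned int bus;
	unsigned int devfn;
};

/* Segments the imaginary firmware enumerated at boot (assumption). */
static const int fw_known_domains[] = { 0x0000 };

static bool fw_knows_domain(int domain)
{
	size_t n = sizeof(fw_known_domains) / sizeof(fw_known_domains[0]);

	for (size_t i = 0; i < n; i++)
		if (fw_known_domains[i] == domain)
			return true;
	return false;
}

/*
 * Firmware-first path in this model: fill *rec and return true only when
 * the firmware can identify the domain of the failing device. Otherwise
 * there is no record to pass on, so nothing downstream ever sees it.
 */
static bool fw_build_cper(int domain, unsigned int bus, unsigned int devfn,
			  struct toy_cper_pcie *rec)
{
	if (!fw_knows_domain(domain))
		return false;

	rec->domain = domain;
	rec->bus = bus;
	rec->devfn = devfn;
	return true;
}
```

In this model, an error on segment 0 yields a record that a GHES-style
consumer could forward, while an error in the unenumerated domain yields
nothing, which is why the AER driver would need to handle that case
natively.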