RE: PCI Error Recovery API Proposal. (WAS:: [PATCH/RFC]PCIErrorRecovery)

From: Paul Mackerras
Date: Thu Mar 17 2005 - 23:10:58 EST


Nguyen, Tom L writes:

> We decided to implement PCI Express error handling based on the PCI
> Express specification in a platform independent manner. This allows any
> platform that implements PCI Express AER per the PCI SIG specification
> can take advantage of the advanced features, much like SHPC hot-plug or
> PCI Express hot-plug implementations.

Does the PCI Express AER specification define an API for drivers?

> For PCI Express the endpoint device driver can take recovery action on
> its own, depending on the nature of the error so long as it does not
> affect the upstream device. This can include endpoint device resets.

Likewise, with EEH the device driver could take recovery action on its
own. But we don't want to end up with multiple sets of recovery code
in drivers, if possible. Also we want the recovery code to be as
simple as possible, otherwise driver authors will get it wrong.

> To support the AER driver calling an upstream device to initiate a reset
> of the link we need a specific callback since the driver doing the reset
> is not the driver who got the error. In the case of general PCI this

I would see the AER driver as being included in the "platform" code.
The AER driver would be be closely involved in the recovery process.

What is the state of a link during the time between when an error is
detected and when a link reset is done? Is the link usable? What
happens if you try to do a MMIO read from a device downstream of the
link?

Regards,
Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/