Re: [PATCH/RFC] I/O-check interface for driver's error handling

From: Linas Vepstas
Date: Wed Mar 02 2005 - 18:43:30 EST


On Thu, Mar 03, 2005 at 09:46:12AM +1100, Benjamin Herrenschmidt was heard to remark:
> On Wed, 2005-03-02 at 14:02 -0600, Linas Vepstas wrote:
> > On Wed, Mar 02, 2005 at 09:27:27AM +1100, Benjamin Herrenschmidt was heard to remark:
>
> > That's a style issue. Propose an API, I'll code it. We can have
> > the master recovery thread be a state machine, and so every device
> > driver gets notified of state changes:
> >
> > typedef enum pci_bus_state {
> > DEVICE_IO_FROZEN=1,
> > DEVICE_IO_THAWED,
> > DEVICE_PERM_FAILURE,
> > };
> >
> > struct pci_driver {
> > ....
> > void (*io_state_change) (struct pci_dev *dev, pci_bus_state);
> > };
> >
> > would that work?
>
> Too much ppc64-centric.

Ah Ben, you are hard to make happy.

> Also, we want to use the re-enable IOs facility of EEH to give the
> driver a chance to extract diagnostic infos from the HW.

Yes, of course. Recovery would have to happen through multiple steps,
one of which is re-enabling i/o. (Another one would be to ask all of
the device drivers affected by the outage if any of them need a
device reset.)

Right now, my goal was not to specify the final interface with all the
bits correct, but to get agreement that this particular approach, of
modifying struct pci_driver, would make everyone happy.

--linas


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/