Re: libata error handling

From: Mike Anderson
Date: Fri Aug 19 2005 - 15:31:23 EST


Luben Tuikov <luben_tuikov@xxxxxxxxxxx> wrote:
> On 08/19/05 15:38, Patrick Mansfield wrote:
> The eh_timed_out + eh_strategy_handler is actually pretty perfect,
> and _complete_, for any application and purpose in recovering a
> LU/device/host (in that order ;-) ).
>
> > The two problems I see with the hook are:
> >
> > It calls the driver in interrupt context, so the called function can't
> > sleep.
>
> Consider this: When SCSI Core told you that the command timed out,
> A) it has already finished,
> B) it hasn't already finished.
>
> In case A, you can return EH_HANDLED. In case B, you return
> EH_NOT_HANDLED, and deal with it in the eh_strategy_handler.
> (Hint: you can still "finish" it from there.)
>

But dealing with it in the eh_strategy_handler means that you may be
stopping all IO on the host instance as the first lun returns
EH_NOT_HANDLED for LUN based canceling.

I still think we can do better here for an LLDD that cannot execute a
cancel in interrupt context.

Having a error handler that works is a plus, I would hope that
some factoring would happen over time from the eh_strategy_handler to
some transport (or other factor point) error handler. I would think from a
testing, support, and block level multipath predictability sharing code
would be a good goal.

-andmike
--
Michael Anderson
andmike@xxxxxxxxxx
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/