Re: scsi_reset requirement questions

Doug Ledford (dledford@redhat.com)
Thu, 12 Aug 1999 23:50:49 -0400


"WANG,YIDING (HP-SanJose,ex1)" wrote:
>
> I am not clear on scsi_reset requirement. Here are the questions:
>
> 1, When scsi_reset called lower layer driver with SCSI_RESET_SYNCHRONOUS
> flag, should lower driver discard all commands since mid level
> driver will send all of them again (according to the comment in
> scsi_obsolete.c)? Or all command should be returned with DID_ERROR
> status through scsi_done?
>
> 2, What happens if the flag is SCSI_RESET_ASYNCHRONOUS? Should
> lower driver process all command as usual after reset?

OK, first off, the ASYNC and SYNC flags on the scsi reset call don't mean
what you are implying. The real intent of those two flags is that in the case
of 1 above, the command is triggering a reset condition as a result of the
return value passed to old_scsi_done() and this reset is being done as part of
the command completion processing. This is a hint to the low level driver's
reset routine that it shouldn't find the command down there anywhere, but that
it needs to take some sort of appropriate corrective action. It's up to the
low level driver to know what state its card is in and what kind of
corrective action is needed. For example, in my driver I flag the device and
the bus as both being subject to a reset action when I do the first reset. If
I get another reset action and I haven't had any successful command
completions, then I perform stronger reset actions such as a full bus reset
(this is a gross oversimplification, but it serves a purpose). The ASYNC
flag just indicates to you that the command should be somewhere on your card
and that the timeout timer is responsible for this reset action, not a command
completion.
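
As a rough illustration of the escalation described above, here is a
minimal user-space sketch in C. It is not code from any real driver. The
flag names and values, struct dev_state, choose_reset_action() and the other
helpers are all hypothetical stand-ins, and a real driver would track far
more state than one boolean per device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins for the mid-layer flags. The names and values
 * are assumptions for this sketch, NOT copied from the kernel headers. */
enum {
    RESET_SYNCHRONOUS  = 1 << 0, /* reset raised during command completion */
    RESET_ASYNCHRONOUS = 1 << 1  /* reset raised by the timeout timer */
};

/* Hypothetical corrective actions, weakest first. */
enum reset_action { ACTION_DEVICE_RESET, ACTION_BUS_RESET };

/* Hypothetical per-device bookkeeping kept by the low-level driver. */
struct dev_state {
    bool reset_pending; /* a reset was issued and nothing has completed since */
};

/* With the ASYNC flag the command should still be somewhere on the card;
 * with the SYNC flag the reset routine should not expect to find it. */
bool cmd_expected_on_card(unsigned int flags)
{
    return (flags & RESET_ASYNCHRONOUS) != 0;
}

/* First reset: take the mild action and remember that a reset is pending.
 * A further reset with no successful completion in between: escalate. */
enum reset_action choose_reset_action(struct dev_state *dev)
{
    enum reset_action action;

    assert(dev != NULL);
    action = dev->reset_pending ? ACTION_BUS_RESET : ACTION_DEVICE_RESET;
    dev->reset_pending = true;
    return action;
}

/* Any successful command completion clears the escalation state. */
void note_successful_completion(struct dev_state *dev)
{
    assert(dev != NULL);
    dev->reset_pending = false;
}
```

In this sketch a driver's reset routine would call choose_reset_action()
each time it is entered and note_successful_completion() from its completion
path, so two resets in a row with nothing completing in between lead to the
stronger action.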

> 3, Only SCSI_RESET_SUGGEST_HOST_RESET and SCSI_RESET_SUGGEST_BUS_RESET
> are available. What if device reset is required?

OK, the reset processing in the mid layer sucks. We all know this. You need
to implement your own heuristics to control these issues. About the best help
you'll get from the mid layer is this: the first time a command times out,
your driver's abort() function will get called, the second time it times out
your driver's reset() function will get called without either of the two flags
above, and the third time it times out your driver's reset() function will get
called with the BUS_RESET flag, and the fourth and final time the command
times out your driver's reset() routine will get called with the HOST_RESET
flag. That all sounds very simple. The harder parts are dealing with things
like conditions where the first command to time out is to device A but device
B is currently wedging the scsi bus, or where 40 commands all time out at
roughly the same time, none of the aborts work, they all time out at the same
time again, and all of a sudden you throw 30 bus resets in an amazingly short
period of time (I turned a 2/4GB DAT drive into a paperweight this way). You gotta
handle all that stuff.
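
The timeout sequence above can be written down as a small table, and the
"30 bus resets" problem suggests collapsing simultaneous requests into one.
The following user-space C sketch shows both ideas. Every name in it
(step_for_timeout, struct eh_step, request_bus_reset, and the flag
constants and their values) is hypothetical, and the coalescing logic is
only one possible approach, not something the mid layer or any particular
driver is known to do.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the "suggest" flags; the values are
 * assumptions for this sketch, not taken from the kernel headers. */
enum {
    SUGGEST_NONE       = 0,
    SUGGEST_BUS_RESET  = 1 << 2,
    SUGGEST_HOST_RESET = 1 << 3
};

/* Which driver entry point is called for a given timeout. */
enum handler { CALL_NONE, CALL_ABORT, CALL_RESET };

struct eh_step {
    enum handler handler;
    unsigned int suggest; /* suggestion flag passed along, if any */
};

/* Map the Nth timeout of one command (1-based) to the step described
 * in the text: abort, plain reset, bus reset, host reset. */
struct eh_step step_for_timeout(int timeout_count)
{
    struct eh_step step = { CALL_NONE, SUGGEST_NONE };

    assert(timeout_count >= 1);
    switch (timeout_count) {
    case 1: step.handler = CALL_ABORT; break;
    case 2: step.handler = CALL_RESET; break;
    case 3: step.handler = CALL_RESET; step.suggest = SUGGEST_BUS_RESET; break;
    case 4: step.handler = CALL_RESET; step.suggest = SUGGEST_HOST_RESET; break;
    default: break; /* the fourth timeout is described as the final one */
    }
    return step;
}

/* Hypothetical per-bus state used to coalesce reset requests. */
struct bus_state {
    int reset_in_progress;
};

/* Returns 1 if the caller should really reset the bus, 0 if a reset is
 * already under way and this request can ride along with it. */
int request_bus_reset(struct bus_state *bus)
{
    assert(bus != NULL);
    if (bus->reset_in_progress)
        return 0;
    bus->reset_in_progress = 1;
    return 1;
}

/* Called once the bus has settled after a reset. */
void bus_reset_done(struct bus_state *bus)
{
    assert(bus != NULL);
    bus->reset_in_progress = 0;
}
```

With something like request_bus_reset() in place, 40 commands timing out
together would produce one bus reset per settle period rather than one per
command, at the cost of the driver having to decide when the bus has
actually settled.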

-- 
  Doug Ledford   <dledford@redhat.com>
   Opinions expressed are my own, but
      they should be everybody's.
