SCSI mlqueue related bug

Chris Loveland (cwl@fclinux-e.iol.unh.edu)
Wed, 14 Apr 1999 13:57:53 -0400 (EDT)


i was seeing a panic coming from the SCSI_SLEEP line in the following
chunk of code from scsi_do_cmd in scsi.c.

while (SCSI_BLOCK((Scsi_Device *) NULL, host)) {
spin_unlock(&io_request_lock); /* FIXME!!! */
SCSI_SLEEP(&host->host_wait, SCSI_BLOCK((Scsi_Device *) NULL, host));
spin_lock_irq(&io_request_lock); /* FIXME!!! */
}

the error was because SCSI_SLEEP was being called from within an irq
handler. this problem appeared to come about because rw_intr from sd.c
called requeue_sd_request which in turn called scsi_do_cmd. if this
sequence takes place while host->host_blocked is set, however, the code
will enter the loop and try to sleep which causes the error. when using
the new_eh_code host_blocked gets set when scsi_mlqueue_insert is called
so the following sequence can lead to a hang.
1) queuecommand function returns 1
2) scsi_mlqueue_insert is called and host_blocked is set
3) some other scsi command enters rw_intr
4) rw_intr tries to queue up another command but it tries to block

included below is a patch which just checks for this error condition and
if it is found inserts the command in the mlqueue instead of trying to
block.

chris

- - - Cut Here - - -
--- scsi.c~ Tue Apr 13 15:29:57 1999
+++ scsi.c Wed Apr 14 13:45:45 1999
@@ -1417,6 +1417,7 @@
{
struct Scsi_Host * host = SCpnt->host;
Scsi_Device * device = SCpnt->device;
+ int use_mlqueue = 0;

SCpnt->owner = SCSI_OWNER_MIDLEVEL;

@@ -1451,6 +1452,10 @@
SCpnt->pid = scsi_pid++;

while (SCSI_BLOCK((Scsi_Device *) NULL, host)) {
+ if (in_interrupt() && host->hostt->use_new_eh_code){
+ use_mlqueue = 1;
+ break;
+ }
spin_unlock(&io_request_lock); /* FIXME!!! */
SCSI_SLEEP(&host->host_wait, SCSI_BLOCK((Scsi_Device *) NULL, host));
spin_lock_irq(&io_request_lock); /* FIXME!!! */
@@ -1497,7 +1502,15 @@
SCpnt->internal_timeout = NORMAL_TIMEOUT;
SCpnt->abort_reason = 0;
SCpnt->result = 0;
- internal_cmnd (SCpnt);
+
+ if (use_mlqueue){
+ /* Assign a unique nonzero serial_number. */
+ if (++serial_number == 0) serial_number = 1;
+ SCpnt->serial_number = serial_number;
+ scsi_mlqueue_insert(SCpnt, SCSI_MLQUEUE_HOST_BUSY);
+ }
+ else
+ internal_cmnd (SCpnt);

SCSI_LOG_MLQUEUE(3,printk ("Leaving scsi_do_cmd()\n"));
}

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/