Re: SCSI issues

Shane Jensen (shane@sumus.com)
Wed, 25 Mar 1998 16:13:51 -0700


I patched the kernel (2.1.90 with the latest aic7xxx drivers). The machine still locks up, but an error is logged, to the console only. Here is part of the error; the rest scrolls off the screen (excuse any typos,
since this is only logged on the screen).

CPU: 0
EIP: 0010:[<c0109f5e>]
EFLAGS: 000100082
eax: 00000010 ebx: 00000000 ecx: ffffffff edx: 00000010
esi: 00000000 edi: c0108000 ebp: c0107ec4 esp: c0107e50
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, process nr: 0, stackpage=c0107000)
Stack: c01c0018 c0107ec4 ffffffff 00000000 00000000 c8800000 c9000000 c0000000
c01f0018 c0109fd8 c0107ec4 c01c1543 c01c2375 00000000 c0107ec4 c010efe9
c01c2375 c0107ec4 00000000 c0106000 00000297 00000000 c000c000 00000006
Call Trace: [<c01c0018>] [<c0107ec4>] [<c88000000>] [<c9000000>] [<c8800000>] [<c0109fd8>] [<c0107ec4>]
[<c01c1543>] [<c01c2375>] [<c0107ec4>] [<c010efe9>] [<c01c2375>] [<c0107ec4>] [<c0105000>] [<c0106000>]
[<c0109c2e>] [<c0107ec4>] [<c01a0018>] [<c01a636c>] [<c01a5ad4>] [<c0107f40>] [<c01a5b20>] [<c0111665>]
[<c0107f58>] [<c0116281>] [<c010aae2>] [<c0109bb4>] [<c0106000>] [<c01004bc>] [<c0106000>] [<c0107fdc>]
[<c0109af2>] [<c0108150>] [<c0107fdc>] [<c0108073>] [<c0106000>] [<c0100263>]
Code: 64 8a 04 0e 0f a1 88 c2 81 e2 ff 00 00 00 89 54 24 10 52 68
Aiee, killing interrupt handler
Kernel panic: Attempted to kill the idle task!
In swapper task - not syncing
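
The raw addresses in the call trace only become useful once they are matched against the System.map of the kernel that produced them. As a minimal sketch of the first step, the helper below pulls the `[<address>]` entries out of one trace line. The function name `extract_trace_addresses` is made up for illustration and is not part of any kernel tool. It uses `unsigned long long` so that an over-long entry typed in by hand does not overflow on a 32-bit machine.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Collect the addresses from one line of an oops call trace, where each
 * entry is printed as [<hexaddr>].  Stores at most max addresses in out[]
 * and returns how many were stored.  Entries without hex digits, or
 * without the closing ">]", are skipped.
 */
size_t extract_trace_addresses(const char *line,
                               unsigned long long *out, size_t max)
{
    size_t n = 0;
    const char *p = line;

    while (n < max && (p = strstr(p, "[<")) != NULL) {
        char *end;
        unsigned long long addr;

        p += 2;                         /* step past "[<" */
        addr = strtoull(p, &end, 16);   /* parse the hex digits */
        if (end != p && end[0] == '>' && end[1] == ']')
            out[n++] = addr;
        p = end;                        /* resume the search after this entry */
    }
    return n;
}
```

Each address returned can then be compared with the sorted symbol addresses in System.map; the symbol with the largest address not above the trace address is the function the entry falls in.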

Andreas Schwab wrote:

> Shane Jensen <shane@sumus.com> writes:
>
> |> Problem:
> |> A faulty cable connection was causing intermittent problems with the SCSI bus. With the 2.0.x kernels, the command would time out, the bus would reset, an error would be logged, and the machine would come back to life and
> |> work for an undetermined amount of time, and then the cycle would repeat.
>
> |> With the 2.1.x kernels that I have tried so far (>2.1.80), the system locks up solid. I've tried the standard aic7xxx drivers and the updated drivers. In either case, the lockup doesn't log any
> |> information to a file or the console.
>
> Please try this patch:
>
> --- drivers/scsi/scsi_obsolete.c.~1~ Wed Mar 18 11:00:35 1998
> +++ drivers/scsi/scsi_obsolete.c Wed Mar 18 10:59:15 1998
> @@ -152,6 +152,7 @@
>
>  void scsi_old_times_out (Scsi_Cmnd * SCpnt)
>  {
> +    SCpnt->serial_number_at_timeout = SCpnt->serial_number;
>
>      switch (SCpnt->internal_timeout & (IN_ABORT | IN_RESET | IN_RESET2 | IN_RESET3))
>      {
> @@ -163,12 +164,12 @@
>          }
>
>          if (!scsi_abort (SCpnt, DID_TIME_OUT))
> -            return;
> +            break;
>      case IN_ABORT:
>          printk("SCSI host %d abort (pid %ld) timed out - resetting\n",
>                 SCpnt->host->host_no, SCpnt->pid);
>          if (!scsi_reset (SCpnt, SCSI_RESET_ASYNCHRONOUS))
> -            return;
> +            break;
>      case IN_RESET:
>      case (IN_ABORT | IN_RESET):
>          /* This might be controversial, but if there is a bus hang,
> @@ -182,7 +183,7 @@
>          SCpnt->internal_timeout |= IN_RESET2;
>          scsi_reset (SCpnt,
>                      SCSI_RESET_ASYNCHRONOUS | SCSI_RESET_SUGGEST_BUS_RESET);
> -        return;
> +        break;
>      case (IN_ABORT | IN_RESET | IN_RESET2):
>          /* Obviously the bus reset didn't work.
>           * Let's try even harder and call for an HBA reset.
> @@ -194,16 +195,17 @@
>          SCpnt->internal_timeout |= IN_RESET3;
>          scsi_reset (SCpnt,
>                      SCSI_RESET_ASYNCHRONOUS | SCSI_RESET_SUGGEST_HOST_RESET);
> -        return;
> +        break;
>
>      default:
>          printk("SCSI host %d reset (pid %ld) timed out again -\n",
>                 SCpnt->host->host_no, SCpnt->pid);
>          printk("probably an unrecoverable SCSI bus or device hang.\n");
> -        return;
> +        break;
>
>      }
>
> +    SCpnt->serial_number_at_timeout = 0;
>  }
>
>
> --
> Andreas Schwab "And now for something
> schwab@issan.informatik.uni-dortmund.de completely different"
> schwab@gnu.org
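
The shape of the change in the quoted patch can be sketched in isolation: a marker is recorded on entry to the timeout handler, every `return` inside the switch becomes a `break`, and the marker is cleared at a single exit point. The code below is only an illustration under assumptions. `struct cmd`, `try_abort`, `times_out`, the `abort_ok` field, and the two-step escalation are hypothetical stand-ins, not the kernel's definitions; only the field names `serial_number`, `serial_number_at_timeout`, and `internal_timeout` are taken from the patch.

```c
/* Hypothetical stand-ins for the kernel's flags and Scsi_Cmnd fields. */
enum { IN_ABORT = 1, IN_RESET = 2 };

struct cmd {
    unsigned long serial_number;
    unsigned long serial_number_at_timeout;
    int internal_timeout;
    int abort_ok;              /* hypothetical: nonzero if the abort succeeds */
};

/* Returns 0 when the abort took care of the command, nonzero otherwise. */
static int try_abort(const struct cmd *c)
{
    return c->abort_ok ? 0 : 1;
}

void times_out(struct cmd *c)
{
    /* Record on entry which command this timeout is being handled for. */
    c->serial_number_at_timeout = c->serial_number;

    switch (c->internal_timeout & (IN_ABORT | IN_RESET)) {
    case 0:
        c->internal_timeout |= IN_ABORT;
        if (try_abort(c) == 0)
            break;             /* a "return" here would skip the clear below */
        /* abort failed: fall through and escalate */
    case IN_ABORT:
        c->internal_timeout |= IN_RESET;
        break;
    default:
        break;
    }

    /* Single exit point: the marker is cleared on every path. */
    c->serial_number_at_timeout = 0;
}
```

With `return` in place of `break`, the early-exit paths would leave `serial_number_at_timeout` set after the handler finished; routing every path through the end of the function is what lets the final assignment always run.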
