Re: Kernel panic: 2.1.121 with SCSI DAT drive

Kai Mäkisara (makisara@metla.fi)
Tue, 15 Sep 1998 10:23:07 +0300 (EET DST)


On 14 Sep 1998, Philippe Troin wrote:
...
> I also have a lot of scsi tape weirdnesses on 2.1.121. Specifically,
> stupid mt tricks don't work anymore (mt bsfm gives I/O error
> sometimes). No panics though using vanilla 2.1.121 on AIC78xxx with
> Archive Python DAT drive.
>
The patch did not touch the bsfm command. If you change the line
'#define DEBUG 0' in linux/drivers/scsi/st.c to '#define DEBUG 1', the
driver writes more information about the problems it encounters to the
console/log. Enabling the verbose SCSI messages in the kernel
configuration also helps.

> Plus if I try to dump some filesystems, the dump process hangs on
> down_failed forever:
>
> 100 0 367 366 0 0 1028 608 wait4 S p2 0:02 dump
> 140 0 368 367 0 0 1052 660 unix_data_w S p2 0:00 dump
> 44 0 369 368 0 0 0 0 do_exit Z p2 0:00 dump
> 44 0 370 368 0 0 0 0 down_failed DW p2 0:00 (dump)
> 40 0 371 368 0 0 1028 616 down_failed D p2 0:00 dump
>
This sounds like the problem some people have run into but that I have
never been able to reproduce (I will try again tonight with dump). The
process is blocked in down(), which probably means that the tape driver is
waiting for the previously sent SCSI command to finish. There are at least
the following two possibilities:
1. There is a bug in the tape driver so that it never calls up(), or
the SCSI interrupt is lost, or
2. The SCSI bus is hung.

The timeout in the tape driver is very long (900 seconds), so it takes a
lot of patience to find out whether the system is waiting for a timeout or
is really hung. You can shorten the timeout either by editing the driver
(change ST_TIMEOUT) or by using mt (mt sttimeout xxx, where xxx is the
timeout in seconds). A timeout of 60 seconds would probably be enough for
a DAT.
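
For those who would rather set the timeout from a program than through mt,
the following is a minimal sketch of what such a call could look like. It
assumes that <linux/mtio.h> on your system defines MTSETDRVBUFFER and
MT_ST_SET_TIMEOUT and that the st driver takes the timeout in seconds in
the low bits of mt_count. The helper names st_timeout_arg() and
st_set_timeout() are made up for this example and are not part of the
driver or of mt.

```c
/* Sketch only: shorten the st driver's normal command timeout. */
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/mtio.h>  /* struct mtop, MTIOCTOP, MTSETDRVBUFFER, MT_ST_SET_TIMEOUT */

/*
 * Hypothetical helper: build the mt_count argument for a timeout given in
 * seconds. Returns -1 if the value is not positive or would collide with
 * the bit assumed to select the "long" timeout (0x100000).
 */
int st_timeout_arg(int seconds)
{
    if (seconds <= 0 || seconds >= 0x100000)
        return -1;
    return MT_ST_SET_TIMEOUT | seconds;
}

/*
 * Hypothetical helper: open the tape device and ask the driver to use the
 * new timeout. Returns 0 on success and -1 on any failure (errno is left
 * as set by open() or ioctl()). Changing driver options normally requires
 * root privileges.
 */
int st_set_timeout(const char *device, int seconds)
{
    struct mtop op;
    int arg = st_timeout_arg(seconds);
    int fd, rc;

    if (arg < 0)
        return -1;

    fd = open(device, O_RDONLY);
    if (fd < 0)
        return -1;

    op.mt_op = MTSETDRVBUFFER;
    op.mt_count = arg;
    rc = ioctl(fd, MTIOCTOP, &op);

    close(fd);
    return rc < 0 ? -1 : 0;
}
```

Calling st_set_timeout("/dev/nst0", 60) would then request the 60 second
timeout suggested above; the device name is only an example and depends on
your setup.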

Kai
