Re: ncr53c8xx SCSI driver locks up computer after timeout message

Bradley M. Kuhn (bkuhn@ebb.org)
Wed, 24 Feb 1999 21:54:22 -0500


Gérard, I very much appreciate the time you are spending answering my
queries and posts. You are truly a kind-hearted person to assist me so
much! Thank you!

See below for comments...

Thus spoke Gerard Roudier:

> 1) - Use a setup with all features enabled: 80MB/s, disconnections, tagged
> commmads, etc ...
> - Only connect the most trustable SCSI device to the SCSI BUS.
> Undubitably it is the SEAGATE hard drive. Connect it using the Ultra2
> cable with the LVD/SE terminator at its end.
> - Remove all other SCSI cables.
> - Boot the system.
> - Stress the system using heavy IO load.

I tried this. In the middle of the stress test, I got some SCSI errors, and
then the root partition was completely and utterly corrupted.

So, I switched back to "safe:y,settle:2". I am stress testing this now (of
course, only the SEAGATE is one the chain). It seems to be surviving the
stress test.

I do get occasional timeouts and resets, but you had mentioned that this was
normal.

Now, can you give me any suggestions on what feature it might have been that
caused the ext2fs corruption, etc.

> If the system is not stable, then the device configuration that has the
> problem is now a lot simpler than the initial one.

Agreed.

> Let us assume that the system is stable. :)

It is not. :) Well, at least when I am not in safe mode.

BTW, The cabling all brand-new as is the terminator. I checked the
connections, etc. to make sure everything was tight and correct.

It appears that not in safe mode, there are real problems on the bus.

Do you have any idea what could be causing them?

> Example (hypothesis)
> - The SEAGATE is ok when it is alone on the BUS.

I was able to stop right here, as you can see above. The SEAGATE is not ok
on the bus, unless it is in safe mode.

> > > > > your system, give a try with the latest sym53c8xx-1.2a driver that has
> > > > > been recently enhanced for SCSI error recovery. Changes haven't yet been
> > > > > backported to stock ncr53c8xx driver.

> > I have a newer 895. I will try your new driver tommorrow evening.

> Yes, do it.

I got slowed down because of the ext2fs crash on the root partition :). My
plan now is to see if the system stays stable under "safe:y,settle:2" for a
while, and then move on to the newer drivers then.

> Nothing suspicious. But I would suggest you to use a normal driver set-up,
> if narrowing the features seem not to help.

The normal setup, even with just the SEAGATE on the bus made things very
worse.

> With your set-up, the driver suspends command processing for 10 seconds.
> This may perhaps confuse timeouts processing of the SCSI layer if a
> command times out within this delay or may trigger some new bug. 2 seconds
> is enough. Most O/Ses donnot use more than 3 seconds of such a delay after
> a SCSI reset.

Ah, I see. As I said, "safe:y,settle:2" works pretty well, but it is slow.
:)

-- 
      Bradley M. Kuhn   |     bkuhn@ebb.org    |   http://www.ebb.org/bkuhn

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/