Re: ncr53c8xx problem

From: Gerard Roudier (groudier@club-internet.fr)
Date: Sat Jan 29 2000 - 08:21:07 EST


Hello,

On Fri, 28 Jan 2000, Andrzej Krzysztofowicz wrote:

> Hi,
> I have a problem trying to use on old NCR53810 based controller with
> EPoX SMP Intel-LX based MoBo (2x PII 300). No matter UP or SMP kernel
> is tried. It works fine with same MoBo in W95 (so it doesn't seem to be
> a hardware conflict). And it works fine in Linux with another MoBo
> (Intel-HX based Asus, P166), when there's no shared interrupts.

I use an ASUS PL97 LX chipset based MB since 2 years and didn't experience
any problem using numerous kinds of 538XX chipsets on this board. But I
donnot have a 53C810 rev. 1.

> I've tested different kernel versions in range: 2.1.120 - 2.2.14.
> Same kernels with another MoBo work ...
>
> The driver enters an infinite(?) loop during initialization (when there is
> no shared PCI interrupt):
>
> ncr53c8xx: at PCI bus 0, device 10, function 0
> ncr53c8xx: 53c810 detected
> ncr53c810-0: rev=0x01, base=0xe8000000, io_port=0x6800, irq=17
> ncr53c810-0: ID 7, Fast-10, Parity Checking
> ncr53c810-0: restart (scsi reset).
> scsi1 : ncr53c8xx - version 3.2a-2
> scsi : 2 hosts.
> ncr53c810-0:0: ERROR (a0:0) (8-0-0) (0/3) @ (script 10:721a0000).

DSTAT=0xa0 (0x20=PCI BUS FAULT)
Bit 0x20 means that a PCI transaction issued by the 53C810 chip has been
terminated with abort condition by the PCI target (Host bridge here).
The problem seems to occur when the 53C810 fetches the next SCRIPTS
instruction from memory (using a PCI read transaction).

> ncr53c810-0: script cmd = 80080000
> ncr53c810-0: regdump: ca 00 00 03 47 00 00 1f 71 08 02 00 80 00 0f 02.
> ncr53c810-0: have to clear fifos.
> ncr53c810-0: restart (scsi reset).
> ncr53c810-0:0: ERROR (a0:0) (8-0-0) (0/3) @ (script 10:721a0000).
> ncr53c810-0: script cmd = 80080000
> ncr53c810-0: regdump: ca 00 00 03 47 00 00 1f 71 08 02 00 80 00 0f 02.
> ncr53c810-0: have to clear fifos.
> ncr53c810-0: restart (scsi reset).
> ncr53c810-0:0: ERROR (a0:0) (8-0-0) (0/3) @ (script 10:721a0000).
> ncr53c810-0: script cmd = 80080000
> ncr53c810-0: regdump: ca 00 00 03 47 00 00 1f 71 08 02 00 80 00 0f 02.
> ncr53c810-0: have to clear fifos.
> ncr53c810-0: restart (scsi reset).
> ncr53c810-0:0: ERROR (a0:0) (8-0-0) (0/3) @ (script 408:50000000).
> ncr53c810-0: script cmd = 80000000
> ncr53c810-0: regdump: ca 00 00 03 47 00 00 1f 71 08 02 00 80 00 0f 02.
> ncr53c810-0: have to clear fifos.
> scsi : aborting command due to timeout : pid 2834, scsi1, channel 0, id 0,
> lun 0 0x00 00 00 00 00 00
> ncr53c8xx_abort: pid=2834 serial_number=2853 serial_number_at_timeout=2853
> ncr53c810-0: abort ccb=c685cda0 (cancel)
> SCSI host 1 abort (pid 2834) timed out - resetting
> SCSI bus is being reset for host 1 channel 0.ncr53c8xx_reset: pid=2834
> reset_flags=2 serial_number=2853 serial_number_at_timeout=2853
> ncr53c810-0: restart (scsi reset).
>
> .. etc.
>
> When its iterrupt is shared with another device (AIC, 3C905, S3Virge),
> the behaviour is different:
>
> ncr53c8xx: at PCI bus 0, device 13, function 0
> ncr53c8xx: 53c810 detected
> ncr53c810-0: rev=0x01, base=0xe8001000, io_port=0x6c00, irq=16
> ncr53c810-0: ID 7, Fast-10, Parity Checking
> CACHE TEST FAILED: timeout.
> CACHE INCORRECTLY CONFIGURED.

The error is weird there. The SCRIPTS that probes against cache
configuration is pretty simple and sharing the interrupt should not
make differences since no interrupt should occur at this time.

W95 seeming to work does not prove anything. A hardware problem may only
affect a subset of the possible O/Ses since different O/Ses and drivers
behave differently. Btw, I only have some understanding of Linux and
FreeBSD. Comparisons against these 2 ones may help a bit more.

> ncr53c810-0: detaching...
> scsi : 1 host. <---- This is on-board AIC, which works...
>
> Same behaviour with SMP/UP kernel, module/compiled into kernel, compiled for
> i586/i686, different driver config settings, etc...

May-be it is a coincidence, but the 3 BUS FAULTs occur after the SCRIPTS
having switched the controller LED. My immediate suggestion is to check
that the MB configuration is correct, especially the PCI BUS clock that
shall be 33 MHz or less, and to make sure that the power supply is good,
then to give a try with the following as possible:

1) Tell the driver at boot to not drive the controller LED.
   Use the following boot options: ncr53c8xx=led:n

2) Remove physically the second processor and see if this makes
   differences.

3) Give a try with the minimum of PCI boards installed.

4) Move PCI boards to different connectors as possible.

5) Give a try with minimum of memory installed.
   If you system has several DIM, move them or try successively with
   only one of them installed.

> This is the device
>
> Non-VGA device: NCR 53c810 (rev 1).
> Medium devsel. IRQ 17. Master Capable. Latency=32.
> I/O at 0x6800 [0x6801].
> Non-prefetchable 32 bit memory at 0xe8000000 [0xe8000000].
>
> Is there any chance to get it working ?
> Could you, please, help me to find where the problem is ?

Regards,
   Gérard.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Jan 31 2000 - 21:00:23 EST