Re: qlogic 2312 problems on 22.214.171.124, 2.6.18rc4
From: Andrew Vasquez
Date: Wed Aug 16 2006 - 18:15:43 EST
On Wed, 16 Aug 2006, Arkadiusz Miskiewicz wrote:
> On Wednesday 16 August 2006 20:14, Andrew Vasquez wrote:
> > > Everything works fine on 32-bit dual xeon machine with card inserted into
> > > PCI slot. Here in new dual core opteron machine the card sits in PCI-X
> > > slot. It's Thunder K8SRE S2891 mainboard, Transport GT24 B2891 1U
> > > barebone, Tyan M2075 riser card, 2 x Opteron 270, 6GB ram.
> > Have you ruled out motherboard/memory issues? We've run 23xx and 24xx
> > boards on a variety of AMD64 motherboards with more than 4GB of
> > memory.
> Not exactly. I have three such servers (the same board, cpu, ram).
> One of these has PCI-E scsi raid controller and works without any problem.
> Second one (which I'm using for testing) has Adaptec SCSI controller
> in Tyan specific TARRO slot (which afaik sits on the same PCI-X bus as PCI-X slot).
> I had an array on that adaptec for some time and there were no problems.
> The second one also got own qlogic 2312 card.
> Third machine has other qlogic 2312 card and I tried it two times,
> too - the same issues observed.
> So mainboards and memory should be fine but I guess there is possibility
> that PCI-X slot is somehow broken there. I'll test that in several weeks
> (unfortunately I have no physical access to these now) by putting some
> typical scsi controller into that pci-x slot.
> > > Note that booting with 32-bit kernel (which works fine on Xeon system)
> > > doesn't cure the problem on Opteron system. Booting 64bit 2.6.18rc4
> > > kernel with mem=3G also doesn't fix anything.
> > Hmm... Can you send your dmesg output post boot and driver load?
> dmesg and lspci -vvv attached.
> There is one thing that worries me:
> QLogic QLA2340 -
> ISP2312: PCI-X (133 MHz) @ 0000:09:08.0 hdma+, host#=0, fw=3.03.20 IPX
> Why it's showing 133MHz here?
> Tyan manual ftp://ftp.tyan.com/manuals/m_s2891_100.pdf says on page 4
> ,,One PCI-X 100MHz slot (PCI-X B)''.
> Hm, while looking at that manual I see that on page 13 it says ,,one pci-x 100MHz device''.
> Maybe that's the problem since I didn't remove Adaptec TARO controller. I should be able to
> get this tested again without adaptec sooner than several weeks. Maybe I misunderstand
> what documentation says here.
> Anyway still don't know why driver displays 133Mhz while manual says about
> 100MHz one pci-x slot.
that is odd... I'm just dumping the bits from the control-register
which are set by hardware to indicate which type of slot the HBA is
> > > /t is on tmpfs, /dev/sda2 in on FC array. Reading the same data several
> > > times and I get different md5sum results each time, see below.
> > >
> > > How I can track where corruption occurs?
> I started dd'ing text file onto /dev/sda2, then back to tmpfs and diffing these. Corruption
> is one byte, several not corrupted bytes and again one corrupted byte. Some regular pattern.
Could you send me the a snippet (2 to 4KB) of the good and bad data?
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/