HPT372 DMA corruption

From: Chuck Berg
Date: Fri Jan 09 2004 - 15:08:27 EST


I have an HPT372 onboard a Soyo Dragon KT400 board. I get corruption when
reading from drives on this controller. (I don't dare write, so I don't
know if writes are affected as well). I'm using 2.6.1, but have experienced
this problem with older kernels (at least it's not crashing anymore).

Disabling DMA on these drives stops the corruption.

It only happens when I heavily exercise certain other devices on the system
at the same time. This includes both a network and a SCSI PCI card that
share an irq with the HPT372, but also a firewire card that does not. Load
on the other onboard IDE interfaces (which do not demonstrate the
corruption) does not trigger it.

I have four identical drives in the system. I've switched cables around to
verify that the problem follows the HPT372. I have no problems when I move
the drives to a Promise card. I have run memtest86 for over 24 hours with
no errors.

It happens whether I read the /dev/hd* devices, a file on the filesystem,
or using /dev/raw.

Here's an example of the corruption. (cmp -l output, left is bad right is
good). It always consists of bytes being replaced with 0. It's usually 4,
but sometimes 64 bytes at a time. On average, about 256 bytes per gigabyte
read are corrupt.

51642365 0 211
51642366 0 154
51642367 0 163
51642368 0 120
63700989 0 100
63700990 0 153
63700991 0 216
63700992 0 2
89260029 0 31
89260030 0 327
89260031 0 200
89260032 0 13

More information (.config, /proc/{interrupts,cpuinfo,ioports,iomem,ide}, bootup
messages, lspci -vvv) is at:
http://www.encinc.com/~chuck/kt400/2.6.1/

Thanks for any help.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/