oopsless lockup. need idedma and soundcard gurus.

David Mansfield (david@cobite.com)
Mon, 13 Sep 1999 11:09:20 -0400 (EDT)


I can reproduce at will a complete system lockup on kernels 2.2.12 and
2.3.17. The ingredients are:

Hard drive: Model=WDC AC34300L, FwRev=21.10N21
IDE interface: Intel 82371SB PIIX3 IDE (rev 0).

combined with two soundcards, an MSS (ad1848 driver) and an sb16. The crash
takes place when I am reading from the sb16, writing to the ad1848 and
trying to read and write data to the disk (in one multithreaded app). The
disk access is via mmap, and the soundcard access has been via mmap and
read/write, both produce lockups. The soundcard fragments are very small
and thus there are a LOT of audio interrupts.
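
To put a rough number on that interrupt load, here is a minimal Python
sketch. The sample rate, channel count, sample width and fragment size are
assumed values for illustration only, not the settings used in the app, and
the function name is hypothetical. It assumes one interrupt per completed
fragment.

```python
def fragment_interrupts_per_second(sample_rate_hz, channels,
                                   bytes_per_sample, fragment_bytes):
    """Approximate interrupts per second for one audio stream.

    Assumes the driver raises one interrupt each time a fragment
    of `fragment_bytes` bytes has been played or recorded.
    """
    bytes_per_second = sample_rate_hz * channels * bytes_per_sample
    return bytes_per_second / fragment_bytes


if __name__ == "__main__":
    # Assumed example: 44.1 kHz, stereo, 16-bit samples, 128-byte fragments.
    rate = fragment_interrupts_per_second(44100, 2, 2, 128)
    print(f"{rate:.0f} interrupts/s per card")
```

With two cards running at once the totals add, so halving the fragment
size doubles the combined interrupt rate.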

The lockup WILL NOT HAPPEN if the dma flag is not set on /dev/hdd,
however, I have used DMA on this drive for months with no problem. Without
the DMA flag, of course, disk I/O takes a lot of CPU cycles, making my app
break (it's a cpu hog).

Other HW info:

PPro 200, 96 MB RAM, running for 2 years w/no hardware problems. Complete
hdparm output for /dev/hdd:

Model=WDC AC34300L, FwRev=21.10N21, SerialNo=WD-WT4733143765
Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
RawCHS=8896/15/63, TrkSize=57600, SectSize=600, ECCbytes=22
BuffType=3(DualPortCache), BuffSize=256kB, MaxMultSect=16, MultSect=16
DblWordIO=no, maxPIO=2(fast), DMA=yes, maxDMA=0(slow)
CurCHS=8896/15/63, CurSects=8406720, LBA=yes
LBA CHS=523/255/63 Remapping, LBA=yes, LBAsects=8406720
tDMA={min:120,rec:120}, DMA modes: mword0 mword1 *mword2
IORDY=on/off, tPIO={min:160,w/IORDY:120}, PIO modes: mode3 mode4
UDMA modes: mode0 mode1 mode2

3 other IDE disks (not active during lockup):

/dev/hda: Model=WDC AC2540F, FwRev=12.08R33
/dev/hdb: Model=WDC AC31600H, FwRev=17.11P19
/dev/hdc: Model=WDC AC2540H, FwRev=12.08R30

Sw Info: Kernels compiled with gcc 2.7.2.3, on RedHat 5.1 plus package
upgrades.

The system locks completely solid while running the direct-to-hard-drive
recording software I am working on, often with the red HD light on. SysRq
does not respond, and I don't have another computer to ping it from.

I tried putting a printk at the entry and exit of the dma-intr, adintr
and sbintr interrupt handlers, to see if it is deadlocking in one of
these routines, but it doesn't always do so (it USUALLY locks up in the
ad1848 handler, but sometimes immediately after it).
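
One way to read such a trace is to look for handlers that were entered but
never left. The sketch below is only an illustration: the "enter NAME" /
"leave NAME" line format and the function name are hypothetical, not what
the printk calls actually printed.

```python
def unfinished_handlers(log_lines):
    """Return names of handlers that were entered but never left.

    Each line is expected to look like "enter NAME" or "leave NAME"
    (a hypothetical format); any other line is ignored.
    """
    open_calls = []
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        event, name = parts
        if event == "enter":
            open_calls.append(name)
        elif event == "leave" and name in open_calls:
            # Close the most recent still-open entry for this handler.
            idx = len(open_calls) - 1 - open_calls[::-1].index(name)
            del open_calls[idx]
    return open_calls


if __name__ == "__main__":
    # Made-up trace for illustration: the last handler never returns.
    trace = ["enter sbintr", "leave sbintr", "enter adintr"]
    print(unfinished_handlers(trace))
```

An empty result for the last lines captured before a hang would suggest
the lockup happened outside the instrumented handlers.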

The sb is io=0x220 irq=5 dma=1 dma16=5 mpu_io=0x330
The ad1848 is io=0x530 irq=10 dma=3 dma2=-1

I'll do the debugging if you point me in the right direction...

Thanks,
David

-- 
/==============================\
| David Mansfield              |
| david@cobite.com             |
\==============================/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/