Re: CMD680, kernel 2.4.21, and heartache

From: jimbleferret
Date: Fri Oct 03 2003 - 20:58:35 EST


On Friday 03 October 2003 07:23 am, Erik Bourget wrote:
> Day 0: 8 new NFS servers go online, they are P4-2.4GHz boxes
> with two each 120GB Samsung drives attached to CMD680/SiI680
> IDE controllers. They run Debian stable on a 2.4.21 kernel,
> with SMP enabled though they are uniproc boxes, running
> NFSv3-via-TCP and reiserfs. CMD680/siimage support compiled
> in, obviously. Software RAID, mirroring drives.
>
> Out of 8 boxes:
>
> *) One has crashed hard. I'm about to drive to the datacenter
> to plug in a KVM and take a picture.
> *) Three have had DMA turned off and have given extremely
> spooky errors. Read below.

I've been having very similar problems with a box here. Just put
Gentoo on it, and started having weird errors almost immediately
- libraries not being found, all the way to gcc being unable to
make executables. Occasionally, I'd get a full lock - network
was dead from the outside, and locally everything was frozen.
The logs pointed to the same things - 'hda: status error:
status=0x58 { DriveReady SeekComplete DataRequest },' 'hda:
timeout waiting for DMA' and then a 'reset: success,' but it
didn't seem that way. Turning off DMA didn't seem to have any
effect on gcc problems or lockups.

The Maxtor utility said everything was ok, and I had been using
FreeBSD on it for months with no problems there. Heat isn't a
problem.

I then took out SiS5513 support (CONFIG_BLK_DEV_SIS5513), and I
haven't had any problems since. DMA is back on, and I've had a
couple of timeouts and resets, but only 3 or 4 in ~10 hours, and
no lockups or other noticeable weirdness.

Kernel version is gentoo-sources 2.4.20-r7, but I think that has
patches from >=2.4.21. I haven't tried any other sources.

Drive is a Maxtor 8 Gig, 90871U2.

Board is a PCChips M571.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/