Re: 2.6 more picky about IDE drives than 2.4 ?

From: Sven Köhler
Date: Sun Jan 23 2005 - 12:32:22 EST


i have many problems with kernel 2.6.10 since it won't run stable with
an IDE-device. It's an internal IDE-RAID subsystem. The DMA is
frequently disabled, and even writes/reads fail and the kernel reports
I/O-Errors for many sectors. The RAID-device doesn't report any errors
it it's own event-log. You can have a closer look at the error-messages
below.

I'm mailing to the LKML, since i haven't been abled to reproduce the
problem with a kernel 2.4 bases system, but it randomly happens with 2.6
kernels. Let's take the latest Knoppix as an example (it comes with both
kernels):
- if i boot kernel 2.4, i can stress test the harddisk as much as i
want. the kernel does report any problem and it doesn't disable DMA well
- if i boot kernel 2.6, after a while, there are the error-message below
in the log. "hdparm -k1" doesn't help, the kernel will disable DMA mode.
There was a also a bigger problems for two times now, where the kernel
refused to write to the devide, due to the I/O-Errors below. I'm very
sad, that i haven't the log-lines prior to the I/O-Errors.

You didn't give any information about your hardware (controller type,
drives used etc). Please read REPORTING-BUGS in the kernel source
directory. Also please find last working kernel version (2.5 or 2.6).

The hardware used was a Intel 440GX Chipset (PIIX southbridge, max UDMA33), and a Sis 755 (SiS964 soutbridge, max UDMA133). The drive is an EasyRAID R5A.

http://www.easyraid.com/index.php?Products:easyRAID_R5A

I testes the RAID-subsystem with two different PC-systems. Always the
same result: 2.4 works, 2.6 does not. It's hard for me to reproduce the
Errors through. I'm still writing an application to reliably reproduce
them :-( Does anybody know a good stress-test perhaps? Sequential
reading doesn't seem to do the trick.

What changes have been applied to the IDE subsystem from kernel 2.4 to
kernel 2.6? What may cause this different behaviour? What does
"status=0x51" mean? And why is "error=0x00" although the Error-Bit in
the status-byte has been set. (i guess this is what status=0x51 means).

How can the behaviour of kernel 2.6 be reverted to the behaviour of
kernel 2.4? I already tried "hda=nowerr" in the append-line, but it
doesn't help either. Is it a Bug of kernel 2.6, or should i smash the
manufactures doors, to make them release a firmware-update of the
RAID-subsystem since it reports strange values to the OS?

Dunno, I don't have a magic ball... ;)

So i guess you will not know The R5A, and the Problems of Kernel 2.6 seems to be controller independant (Intel and SiS testen).
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/