RE: ext2fs or IDE bug in 2.0.31

Rodney Barnett (RBarnett@us.teltech.com)
Fri, 7 Nov 1997 07:39:29 -0600


Marko.Makela@HUT.FI wrote:
> I have found a reproducible bug in Linux 2.0.31. When copying a
> large file (140 megabytes) from an ext2fs partition on one hard
> disk to an ext2fs partition on another disk (tried it both as
> /dev/hdb3 and as /dev/hdc3), the system will hang. Usually it hung
> after 60-70 megabytes, but once it hung at 15-20 megabytes. The
> system also hung when I tried to fetch the file directly from the
> net to the /dev/hdb3 or /dev/hdc3 partition. Once I got the
> following message on the console:
>
> Kernel panic: skput:over: 00191fa8:246
> In swapper task - not syncing
[snip]

I'm having a problem with 2.0.31 which may be related. The command below
produced the oops that follows it. In my case, however, this command does
not always fail in any obvious manner.

/bin/sync;/usr/bin/sleep 2;/bin/dd if=/dev/hda of=/dev/hdc bs=64k

---------------------------------------------------------------------------

current->tss.cr3 = 00c4a000, %cr3 = 00c4a000
Unable to handle kernel paging request at virtual address c0c4ac0c
current->tss.cr3 = 00c4a000, %cr3 = 00c4a000
Unable to handle kernel paging request at virtual address c0c4ac0c
current->tss.cr3 = 00c4a000, %cr3 = 00c4a000
general protection: 0000
CPU: 0
EIP: 0010:[<00124e98>]
EFLAGS: 00010016
eax: 00000000 ebx: 01493918 ecx: 01493918 edx: 676e6972
esi: 001bb404 edi: 00003298 ebp: 001b2ea2 esp: 00b76e30
ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
Corrupted stack page
Process dd (pid: 1880, process nr: 25, stackpage=00b79000)
Stack: 00170255 01493918 00000001 000000f2 001bb404 001cc594 ffffff01 00175149
00000001 00003298 00003298 001cc594 001750dc 001b2e24 00172296 001cc594
002b10d8 20000000 0000000f 0010c91a 0000000f 00003298 00000000 00001200
Call Trace: [<00170255>] [<00175149>] [<001750dc>] [<00172296>] [<0010c9a1>]
[<0010c771>] [<00113a28>]
[<0011117d>] [<00110ee8>] [<0010a7a0>] [<00111184>] [<00110ee8>]
[<0010a7a0>] [<00111184>] [<00110ee8>]
Code: f6 42 14 01 74 31 8b 52 10 85 d2 74 04 39 ca 75 ef 8b 41 24
Aiee, killing interrupt handler
release: dd kernel stack corruption. Aiee

----------------------------------------------------------------------------
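
The bracketed addresses in the Call Trace above only become meaningful once
they are matched against the System.map of the kernel that produced them. A
minimal sketch of that lookup follows. It assumes map lines of the form
"address type symbol" with fixed-width lowercase hex addresses; the function
name nearest_symbol and the default path /boot/System.map are illustrative
assumptions, not something taken from this report.

```shell
#!/bin/sh
# Sketch: print the symbol with the greatest address <= the given address.
# Assumes System.map lines look like "00124e00 T some_symbol", with
# fixed-width lowercase hex addresses (same width as the oops addresses).
# nearest_symbol and the default map path are hypothetical names.

nearest_symbol() {
    addr=$1
    map=${2:-/boot/System.map}
    # Appending "" forces awk to compare as strings rather than numbers;
    # for equal-width lowercase hex, string order matches numeric order.
    LC_ALL=C awk -v a="$addr" '
        ($1 "") <= (a "") && ($1 "") >= (base "") { base = $1; sym = $3 }
        END { if (sym != "") print sym " (from " base ")" }
    ' "$map"
}

# Usage: pass one address from the trace at a time, for example
#   nearest_symbol 00170255 /boot/System.map
```

The map must come from the exact kernel build that crashed; a map from any
other build will give plausible-looking but wrong names.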

As a result of this Aiee, the system is currently in a half-dead state.
Pings to it work, but telnet connections just hang. An attempt to log in on
the console does not produce a password prompt, but does echo characters. One can switch
from one virtual console to another. Just hitting return at a login prompt
produces another login prompt. It appears that running processes can
continue to run, but new processes cannot be started. If there is additional
information I could collect while the system is in this state, let me know.

Prior to this particular instance, I've found that the command

time dd if=/dev/hda bs=512k | dd of=/dev/hdc bs=512k

will often, but not always, crash the system. With this command, I received
an oops once, but have no information about it except a vague, possibly
faulty, memory that ext2fs was mentioned somewhere. Most of the time, when
this command fails, it simply causes a reboot. Only once did I catch such a
failure while watching the console. That time, a lot of data streamed by
before the reboot began. It appeared to include a page-long hex dump, but it
all went by too fast to describe in any detail.

I've also seen the above command crash the system with bs=128k and bs=32k,
in case the block size is relevant.
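
Since most failures end in a reboot with nothing left on the console, one way
to narrow down which block size was in flight is to record each attempt on
disk before starting it. The following is only a sketch under that
assumption; the function name try_copy and the log path are hypothetical, and
the destination is overwritten just as with the dd commands above.

```shell
#!/bin/sh
# Sketch: log each attempt before running it, so the last line of the log
# shows which block size was being tested if the machine reboots.
# try_copy and the log file name are illustrative, not from the report.
# WARNING: the destination is overwritten.

try_copy() {
    src=$1
    dst=$2
    bs=$3
    log=$4
    echo "start bs=$bs" >> "$log"
    sync                      # push the log line to disk before the copy
    if dd if="$src" of="$dst" bs="$bs" 2>/dev/null; then
        echo "ok bs=$bs" >> "$log"
    else
        echo "fail bs=$bs" >> "$log"
        return 1
    fi
}

# Usage (destructive; same devices as in the commands above):
#   for bs in 32k 64k 128k 512k; do
#       try_copy /dev/hda /dev/hdc "$bs" /root/dd-test.log
#   done
```

A "start" line with no matching "ok" or "fail" line after a reboot points at
the run that took the system down.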

System details:
  motherboard: AOpen AP5VM-3
  CPU:         200 MHz Pentium
  memory:      64M
  IDE devices: hda: Seagate ST52520A
               hdb: Creative CDROM drive
               hdc: Seagate ST52520A

I haven't yet tried this with 2.0.30, but can do so if there's no useful
information to be obtained while the system is in its current state.

Thanks for any assistance anyone can provide.

Rodney
rbarnett@teltech.com