2.6.0: kjournald 100% cpu + block errors(?)

From: JG
Date: Fri Jan 23 2004 - 15:31:55 EST


hi,

i just checked some stuff on my server via vnc when suddenly the cpu load went to 100% because of kjournald. i could only reboot my server with sysrq keys, because no shell was working/accepting input.
this is the dmesg output, which also scrolled down very fast on all consoles (not on vnc, but i also looked at the monitor of my server).

[...]
block=4286578559, b_blocknr=18446744073701162879
b_state=0x00000010, b_size=4096
block=4286578559, b_blocknr=18446744073701162879
b_state=0x00000010, b_size=4096
block=4286578559, b_blocknr=18446744073701162879
b_state=0x00000010, b_size=4096
block=4286578559, b_blocknr=18446744073701162879
b_state=0x00000010, b_size=4096
block=4286578559, b_blocknr=18446744073701162879
b_state=0x00000010, b_size=4096
block=4286578559, b_blocknr=18446744073701162879
b_state=0x00000010, b_size=4096
block=4286578559, b_blocknr=18446744073701162879
[...]

i have no idea what could have caused this problem (the same problem, only other blocks, happend 11 days ago) and how i could debug this, since i have nothing else in the log files. when hitting the sysrq key to terminate the processes i got some other messages (which looked like oopses), but they were scrolling down so fast i couldn't read anything. is my OS hard disk (hda) dying (which is not older than 3 months)?

i also got this message on startup:
hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete DataRequest }
blk: queue e7dd1800, I/O limit 4095Mb (mask 0xffffffff)
hda: dma_timer_expiry: dma status == 0x60
hda: DMA timeout retry
hda: timeout waiting for DMA
blk: queue e7dd1400, I/O limit 4095Mb (mask 0xffffffff)
blk: queue e7c9b800, I/O limit 4095Mb (mask 0xffffffff)
blk: queue e7cbec00, I/O limit 4095Mb (mask 0xffffffff)
blk: queue e7cbe000, I/O limit 4095Mb (mask 0xffffffff)
blk: queue e7cafc00, I/O limit 4095Mb (mask 0xffffffff)
blk: queue e7caf400, I/O limit 4095Mb (mask 0xffffffff)
blk: queue e7caf000, I/O limit 4095Mb (mask 0xffffffff)
blk: queue e7ca6800, I/O limit 4095Mb (mask 0xffffffff)
blk: queue e7ca6400, I/O limit 4095Mb (mask 0xffffffff)
blk: queue e7c9bc00, I/O limit 4095Mb (mask 0xffffffff)
hda: status timeout: status=0xd0 { Busy }

hdb: DMA disabled
hda: drive not ready for command
ide0: reset: success
blk: queue e7dd1800, I/O limit 4095Mb (mask 0xffffffff)

i also attached the smartctl -a /dev/hda output, if it helps.

thanks very much for any info,
JG

Attachment: smartctl
Description: Binary data

Attachment: pgp00000.pgp
Description: PGP signature