- signalled errors like:
Nov 1 01:04:03 twinlark kernel: ks: Freeing blocks not in datazone - block = 4496384, count = 1
Nov 1 01:04:03 twinlark kernel: EXT2-fs error (device 16:02): ext2_free_blocks: Freeing blocks not in datazone - block = 12828672, count = 1
Nov 1 01:04:03 twinlark kernel: EXT2-fs error (device 16:02): ext2_free_blocks: Freeing blocks not in datazone - block = 4496384, count = 1
...
Nov 1 01:04:03 twinlark kernel: EXT2-fs warning (device 16:02): ext2_free_blocks: bit already cleared for block 606482
Nov 1 01:04:03 twinlark kernel: EXT2-fs warning (device 16:02): ext2_free_blocks: bit already cleared for block 606483
Nov 1 01:04:03 twinlark kernel: EXT2-fs warning (device 16:02): ext2_free_blocks: bit already cleared for block 606484
Nov 1 01:04:03 twinlark kernel: EXT2-fs warning (device 16:02): ext2_free_blocks: bit already cleared for block 606485
- then a task (part of a mirror) runs "rm -r" on a directory which
has been corrupted by some of the above errors, and
syslog records:
Nov 1 12:47:15 twinlark kernel: EXT2-fs warning (device 16:02): empty_dir: bad directory (dir #426006) - no `.' or `..'
- the rm task becomes unkillable, consuming as much CPU as it can
(it's stuck in the kernel somewhere I presume)
- when the system attempts to reboot it trips one of the deadlock
patch warnings:
NON-IRQ DEADLOCK DETECTED BY CPU n
As far as I can tell from following the traffic here, I don't think
this was dealt with in the subsequent 2.0.31 patches, but I do plan
to update the system anyhow.
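
For anyone wanting to catch this before a job walks into the corrupted
directory, here is a minimal sketch that tallies EXT2-fs messages from
syslog by device, severity, and reporting function. The regular
expression only assumes the message layout visible in the excerpts
above; the function name summarize() is made up for illustration, and
none of this is part of any kernel or syslog tooling.

```python
import re
from collections import Counter

# Assumed layout, taken from the log lines quoted in this report:
#   "... kernel: EXT2-fs <level> (device <dev>): <function>: <message>"
EXT2_RE = re.compile(
    r"EXT2-fs (?P<level>error|warning) \(device (?P<dev>[^)]+)\): "
    r"(?P<func>\w+): (?P<msg>.*)"
)


def summarize(lines):
    """Count EXT2-fs messages per (device, level, function).

    Lines that do not match the assumed layout are skipped.
    """
    counts = Counter()
    for line in lines:
        m = EXT2_RE.search(line)
        if m:
            counts[(m.group("dev"), m.group("level"), m.group("func"))] += 1
    return counts
```

Passing it the lines of the system log (wherever syslog writes kernel
messages on the machine in question) gives a quick per-device count,
which a cron job could use to hold off the mirror run once errors start
showing up for a device.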
Some comments about reliability:
- it'd be nice if it were possible to kill off that rm, or at
least suspend it, even if it is stuck in the kernel
- it'd be nice if, during a reboot attempt, the deadlock detection
didn't halt the machine; it should force a reboot instead
Dean