Re: same ext4 file system corruption on different machines

From: Theodore Ts'o
Date: Thu Jan 30 2014 - 22:00:46 EST


On Thu, Jan 30, 2014 at 08:59:09AM +0100, Luca Ognibene wrote:
> Yes, it's indeed very strange. I tend to rule out application errors
> because I don't write directly to the device, so I don't think I can
> break a filesystem from userspace. I've checked the previous and next
> blocks and they seem OK; only block 524320 is getting corrupted. Any
> idea what I should look for now?

Are you willing to try a 3.12.9 or 3.13.1 upstream kernel? Let's see if
changing the kernel makes any difference. I don't recall any ext4
problems like this, but maybe it's a device driver problem.

The other thing I'd ask is whether you can swap out the hard drive
interface --- can you use a USB 3.0 attached drive, or something like
that?

One final thing that you could try doing, depending on how
easily/quickly you can reproduce the problem, is to use blktrace and
see if you can catch who or what is writing to that specific block
which is getting corrupted.
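
As a rough sketch of that last idea, something like the following could
filter blkparse's default text output for writes that overlap the bad
block. The helper names (fs_block_to_sectors, writes_touching) are made
up for illustration, and the 4096-byte block size and the partition
start of 0 are assumptions; check the real values for your file system
and partition layout before trusting the sector numbers.

```python
import re

# Default blkparse line layout:
#   dev cpu seq time pid action rwbs sector + count [process]
LINE_RE = re.compile(
    r"^\s*(?P<dev>\d+,\d+)\s+(?P<cpu>\d+)\s+(?P<seq>\d+)\s+"
    r"(?P<time>\d+\.\d+)\s+(?P<pid>\d+)\s+(?P<action>\S+)\s+"
    r"(?P<rwbs>\S+)\s+(?P<sector>\d+)\s+\+\s+(?P<count>\d+)"
    r"(?:\s+\[(?P<proc>[^\]]*)\])?"
)


def fs_block_to_sectors(block, block_size=4096, part_start=0):
    """Map a file system block to (first_sector, sector_count).

    Assumes 512-byte sectors. block_size and part_start are
    assumptions; substitute the values for the real device.
    """
    per_block = block_size // 512
    return part_start + block * per_block, per_block


def writes_touching(lines, first, count):
    """Yield parsed write events that overlap [first, first + count)."""
    last = first + count
    for line in lines:
        m = LINE_RE.match(line)
        # Skip lines that are not I/O events, and anything not a write.
        if not m or "W" not in m.group("rwbs"):
            continue
        start = int(m.group("sector"))
        end = start + int(m.group("count"))
        if start < last and first < end:
            yield m.groupdict()
```

To use it, feed it the lines of blkparse output (for example by
iterating over sys.stdin) along with the sector range returned by
fs_block_to_sectors(524320), and print the time, pid and process name of
each event it yields.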

- Ted