Re: [2.3.99pre2] [CORRUPTION] Doh! Corruption problems again...

From: Simon Kirby (sim@stormix.com)
Date: Mon Mar 20 2000 - 21:48:51 EST


On Mon, Mar 20, 2000 at 05:23:30PM -0800, Andre Hedrick wrote:

> PLease repeat on SCSI-SCSI, SCSI-IDE and IDE-SCSI.
> There has been nothing in the driver that has change as it relates to IO.
> I want to know if it is true for the subsystem or of it is changes in the
> ./fs/* code.........
>
> ATA has 512-byte block sizes.

But hmm, cmp reports the files differ at char 15099, and nothing even
close to that factors nicely:

[sroot@oof:/tmp]# cmp corrupted-pulsar.c original-pulsar.c
corrupted-pulsar.c original-pulsar.c differ: char 15099, line 516

factoring:
15085: 5 7 431
15086: 2 19 397
15097: 31 487
15098: 2 7549
15099: 3 7 719
15100: 2 2 5 5 151

...Well, the good news is I've just reproduced it four times with a test
thrashing script. It seems to be so easy to reproduce under memory
pressure that a simple copy of the kernel usually shows between one and
three files corrupted. I've reproduced it copying from a 1K blocksize
EXT2 filesystem to a 2K blocksize EXT2 filesystem. It might be worth
noting that I copied approximately 5 times as much data between two 4K
blocksize EXT2 filesystems and saw no corruption in any of the copied
data, but I saw about 6 files corrupted on the filesystems that weren't
4K. This could just be a coincidence, though, as the 4K filesystems were
on a different drive (and on the same IDE bus, now that I think of it).
Hmm...

In the case I reported the file corruption started at offset (zero-based)
15098 and ended at 16384, and in this case the file corruption started at
offset 15456 and stopped at 16384.

In the second reproduction I see the end of this 9849 byte file
overwritten with NULLs starting at offset 9216 (1024-aligned).

In the third I see it at 17408 (-> 1024) and ending at 20480 (-> 4096).

Maybe the strange unaligned blocks are the result of buffers being
modified at the time same as they're being written?

Hmm...I'll go back and try the earlier kernels where I saw no corruption
and see if I can reproduce it there.

Simon-

[ Stormix Technologies Inc. ][ NetNation Communcations Inc. ]
[ sim@stormix.com ][ sim@netnation.com ]
[ Opinions expressed are not necessarily those of my employers. ]

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:31 EST