Re: Strange messages in my kernel log

From: Neil Brown (neilb@cse.unsw.edu.au)
Date: Wed Aug 30 2000 - 22:08:18 EST


On Wednesday August 30, avn@spylog.com wrote:
>
> I have strange messages in my kernel log (1-2 messages per day)
>
> raid5: bug: stripe->bh_new[2], sector 477640 exists
> raid5: bh 8ffd6da0, bh_new 8ffd63e0
> raid5: bug: stripe->bh_new[2], sector 6677320 exists
> raid5: bh aca1f060, bh_new aca1f720
> raid5: bug: stripe->bh_new[1], sector 6854048 exists
> raid5: bh b94c02c0, bh_new b94c0260
> raid5: bug: stripe->bh_new[2], sector 6889584 exists
> raid5: bh 833949c0, bh_new 83394d80
>
>
> What does it mean? Can this cause data corruption? Can I fix this problem?
> I use software RAID level 5, in kernel 2.2.16 with RH patches.

What does it mean?

 It means that raid5 received an IO request for a block of data for
 which it already had an outstanding IO request. In these cases the
 two IO requests had different buffer_head structures.
 raid5 reports a bug, but actually handles the situation fairly
 gracefully: it blocks the second request until the first has
 finished.
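
 The behaviour described above can be sketched as a toy model. This is
 not the raid5 code: the class and method names (Stripe, add_request,
 complete) are hypothetical, and only the bh_new[] slot name and the
 wording of the log message are taken from the report. It assumes one
 pending slot per disk in a stripe and shows a second, different
 request for a busy slot being logged and held back until the first
 one completes.

```python
from collections import deque


class Stripe:
    """Toy model of one stripe: a single pending-request slot per disk."""

    def __init__(self, sector, disks):
        self.sector = sector
        self.bh_new = [None] * disks  # pending request per disk (hypothetical layout)
        self.waiting = deque()        # requests held back behind a busy slot
        self.log = []                 # stands in for the kernel log

    def add_request(self, disk, bh):
        """Return True if the request was accepted, False if it must wait."""
        current = self.bh_new[disk]
        if current is not None and current is not bh:
            # A different buffer for the same block is already outstanding:
            # report it, then block the newcomer instead of overwriting the slot.
            self.log.append(
                "raid5: bug: stripe->bh_new[%d], sector %d exists"
                % (disk, self.sector)
            )
            self.waiting.append((disk, bh))
            return False
        self.bh_new[disk] = bh
        return True

    def complete(self, disk):
        """Finish the outstanding request on a disk and admit one waiter, if any."""
        self.bh_new[disk] = None
        for _ in range(len(self.waiting)):
            d, bh = self.waiting.popleft()
            if d == disk and self.bh_new[d] is None:
                self.bh_new[d] = bh            # the blocked request now proceeds
            else:
                self.waiting.append((d, bh))   # still blocked, keep its place
```

 In this model, calling add_request(2, a) and then add_request(2, b)
 with two distinct objects logs one message and leaves b waiting;
 complete(2) then moves b into the slot.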

Can this cause data corruption?
 
 Not directly, but it might be a symptom of something else that could
 cause corruption.

Can I fix this problem?
 
 Maybe, but first we need to understand it.
 The question is, how could there be two different buffer_heads for
 the same block on disk?
 My understanding of the 2.2 buffer cache is not very good, but I
 think that all filesystem and block device IO goes through the buffer
 cache, so none of these accesses should ever produce two buffer_heads
 that point to the same block.
 However, I believe that swapping doesn't go through the buffer cache.
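
 The reasoning above can be illustrated with a small sketch. It is
 only a model of the argument, not of the 2.2 buffer cache: the
 BufferHead and BufferCache classes and uncached_request are
 hypothetical, and getblk here is merely named after the kernel
 function. It assumes the cache is keyed by (device, block), so every
 cached lookup returns the same object, while a path that bypasses the
 cache builds its own.

```python
class BufferHead:
    """Hypothetical stand-in for a buffer_head: identifies one block on one device."""

    def __init__(self, dev, block):
        self.dev = dev
        self.block = block


class BufferCache:
    """Toy cache keyed by (dev, block); at most one BufferHead per key."""

    def __init__(self):
        self._heads = {}

    def getblk(self, dev, block):
        # Look up first, create only on a miss, so repeated requests for
        # the same block always share a single BufferHead.
        key = (dev, block)
        if key not in self._heads:
            self._heads[key] = BufferHead(dev, block)
        return self._heads[key]


def uncached_request(dev, block):
    # Assumed model of a path that does not consult the cache (the post
    # suggests swapping may be one): it always builds a fresh BufferHead.
    return BufferHead(dev, block)
```

 Two getblk calls for the same block return the identical object, but
 uncached_request for that block returns a second, distinct one, which
 is the situation raid5 is complaining about.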

 So: what are you doing with the raid5 partition?
   What sort of file system?
   Are you swapping to a file on the filesystem?
   Is there any chance that a parity reconstruction is happening when
   you get these messages?
   Is there anything at all about your usage of the raid5 device that
   could possibly be out of the ordinary?

>
> Or tell me who maintains the software raid code and may help me solve this
> problem.

Well, Ingo Molnar <mingo@redhat.com> is the official maintainer, and I
have been doing a fair bit of development lately.

>
> Also I have second question.
> How stable is software RAID in the latest 2.4.0 test series?

Quite stable, but not very fast. RAID5 in 2.4.0 in particular is much
slower than 2.2.xx with mingo's patches. I'm working on this from
time to time.

NeilBrown

>
> --
> With best regards
> Alexander V. Nikolaev
> System administrator of spylog.com
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Aug 31 2000 - 21:00:26 EST