RE: Software raid0 will crash the file-system, when each disk is 5TB

From: Jeff Zheng
Date: Thu May 17 2007 - 18:56:08 EST


Fix confirmed, filled the whole 11T hard disk, without crashing.
I presume this would go into 2.6.22

Thanks again.

Jeff

> -----Original Message-----
> From: linux-raid-owner@xxxxxxxxxxxxxxx
> [mailto:linux-raid-owner@xxxxxxxxxxxxxxx] On Behalf Of Jeff Zheng
> Sent: Thursday, 17 May 2007 5:39 p.m.
> To: Neil Brown; david@xxxxxxx; Michal Piotrowski; Ingo
> Molnar; linux-raid@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx
> Subject: RE: Software raid0 will crash the file-system, when
> each disk is 5TB
>
>
> Yeah, seems you've locked it down, :D. I've written 600GB of
> data now, and anything is still fine.
> Will let it run overnight, and fill the whole 11T. I'll post
> the result tomorrow
>
> Thanks a lot though.
>
> Jeff
>
> > -----Original Message-----
> > From: Neil Brown [mailto:neilb@xxxxxxx]
> > Sent: Thursday, 17 May 2007 5:31 p.m.
> > To: david@xxxxxxx; Jeff Zheng; Michal Piotrowski; Ingo Molnar;
> > linux-raid@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > linux-fsdevel@xxxxxxxxxxxxxxx
> > Subject: RE: Software raid0 will crash the file-system,
> when each disk
> > is 5TB
> >
> > On Thursday May 17, neilb@xxxxxxx wrote:
> > >
> > > Uhm, I just noticed something.
> > > 'chunk' is unsigned long, and when it gets shifted up, we
> > might lose
> > > bits. That could still happen with the 4*2.75T
> arrangement, but is
> > > much more likely in the 2*5.5T arrangement.
> >
> > Actually, it cannot be a problem with the 4*2.75T arrangement.
> > chuck << chunksize_bits
> >
> > will not exceed the size of the underlying device *in*kilobytes*.
> > In that case that is 0xAE9EC800 which will git in a 32bit long.
> > We don't double it to make sectors until after we add
> > zone->dev_offset, which is "sector_t" and so 64bit
> arithmetic is used.
> >
> > So I'm quite certain this bug will cause exactly the problems
> > experienced!!
> >
> > >
> > > Jeff, can you try this patch?
> >
> > Don't bother about the other tests I mentioned, just try this one.
> > Thanks.
> >
> > NeilBrown
> >
> > > Signed-off-by: Neil Brown <neilb@xxxxxxx>
> > >
> > > ### Diffstat output
> > > ./drivers/md/raid0.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff .prev/drivers/md/raid0.c ./drivers/md/raid0.c
> > > --- .prev/drivers/md/raid0.c 2007-05-17
> > 10:33:30.000000000 +1000
> > > +++ ./drivers/md/raid0.c 2007-05-17 15:02:15.000000000 +1000
> > > @@ -475,7 +475,7 @@ static int raid0_make_request (request_q
> > > x = block >> chunksize_bits;
> > > tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];
> > > }
> > > - rsect = (((chunk << chunksize_bits) + zone->dev_offset)<<1)
> > > + rsect = ((((sector_t)chunk << chunksize_bits) +
> > > +zone->dev_offset)<<1)
> > > + sect_in_chunk;
> > >
> > > bio->bi_bdev = tmp_dev->bdev;
> >
> -
> To unsubscribe from this list: send the line "unsubscribe
> linux-raid" in the body of a message to
> majordomo@xxxxxxxxxxxxxxx More majordomo info at
> http://vger.kernel.org/majordomo-info.html
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/