Re: oops with dual xeon 2.8ghz 4gb ram +smp, software raid, lvm,and xfs

From: Andrew Morton
Date: Thu Nov 25 2004 - 02:16:58 EST


Jens Axboe <axboe@xxxxxxx> wrote:
>
> On Thu, Nov 25 2004, Neil Brown wrote:
> > On Wednesday November 24, akpm@xxxxxxxx wrote:
> > > Neil Brown <neilb@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > > Would the following (untested-but-seems-to-compile -
> > > > explanation-of-concept) patch be at all reasonable to avoid stack
> > > > depth problems with stacked block devices, or is adding stuff to
> > > > task_struct frowned upon?
> > >
> > > It's always a tradeoff - we've put things in task_struct before to get
> > > around sticky situations. Certainly, removing potentially unbounded stack
> > > utilisation is a worthwhile thing to do.
> > >
> > > The patch bends my brain a bit.
> >
> > Recursion is like that (... like recursion, that is :-).
>
> Pardon my ignorance, but where is the bug that called for something like
> this?

Well there was an xfs-on-raid-on-lvm stack overrun reported, but the
general problem we're addressing here is that stacking drivers can cause
arbitrary amounts of kernel stack windup.

> I can't say I love the idea of adding a bio list structure to the
> tasklist, it feels pretty hacky. generic_make_request() doesn't really
> use that much stack, if you just kill the BDEVNAME_SIZE struct.

Looks like a sensible thing to do, although it would be tidier to move the
whole thing into a separate function, no?


--- 25/drivers/block/ll_rw_blk.c~generic_make_request-stack-savings 2004-11-24 23:03:06.347778648 -0800
+++ 25-akpm/drivers/block/ll_rw_blk.c 2004-11-24 23:07:39.798207864 -0800
@@ -2584,6 +2584,20 @@ static inline void block_wait_queue_runn
}
}

+static void handle_bad_sector(struct bio *bio)
+{
+ char b[BDEVNAME_SIZE];
+
+ printk(KERN_INFO "attempt to access beyond end of device\n");
+ printk(KERN_INFO "%s: rw=%ld, want=%Lu, limit=%Lu\n",
+ bdevname(bio->bi_bdev, b),
+ bio->bi_rw,
+ (unsigned long long)bio->bi_sector + bio_sectors(bio),
+ (long long)(bio->bi_bdev->bd_inode->i_size >> 9));
+
+ set_bit(BIO_EOF, &bio->bi_flags);
+}
+
/**
* generic_make_request: hand a buffer to its device driver for I/O
* @bio: The bio describing the location in memory and on the device.
@@ -2620,21 +2634,13 @@ void generic_make_request(struct bio *bi
if (maxsector) {
sector_t sector = bio->bi_sector;

- if (maxsector < nr_sectors ||
- maxsector - nr_sectors < sector) {
- char b[BDEVNAME_SIZE];
- /* This may well happen - the kernel calls
- * bread() without checking the size of the
- * device, e.g., when mounting a device. */
- printk(KERN_INFO
- "attempt to access beyond end of device\n");
- printk(KERN_INFO "%s: rw=%ld, want=%Lu, limit=%Lu\n",
- bdevname(bio->bi_bdev, b),
- bio->bi_rw,
- (unsigned long long) sector + nr_sectors,
- (long long) maxsector);
-
- set_bit(BIO_EOF, &bio->bi_flags);
+ if (maxsector < nr_sectors || maxsector - nr_sectors < sector) {
+ /*
+ * This may well happen - the kernel calls bread()
+ * without checking the size of the device, e.g., when
+ * mounting a device.
+ */
+ handle_bad_sector(bio);
goto end_io;
}
}
_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/