Re: [PATCH 1/2] Avoid bio_endio recursion

From: Mikulas Patocka
Date: Tue Jun 24 2008 - 10:28:40 EST




On Tue, 24 Jun 2008, Jens Axboe wrote:

On Tue, Jun 24 2008, Mikulas Patocka wrote:
Hi

bio_endio calls bi_end_io callback. In case of stacked devices (raid, dm),
bio_end_io may call bio_endio again, up to an unspecified length.

The crash because of stack overflow was really observed on sparc64. And
this recursion was one of the contributing factors (using 9 stack frames
--- that is 1728 bytes).

Looks good, I like the concept. Can you please make it a little less
goto driven, though? The next_bio and goto next_bio could just be a
while().

--
Jens Axboe


Hi.

This is the patch, slightly de-goto-ized. (it still contains one, I think that while (1) { ... break ... } is no better readable than goto).

I found another problem in my previous patch, I forgot about the "error" variable (it would cause misbehavior for example if disk fails, submits an error and raid driver turns this failure into success). We need to save the error variable somewhere in the bio, there is no other place where it could be placed. I temporarily saved it to bi_idx, because it's unused at this place.

Mikulas

--


Avoid recursion on bio_endio. bio_endio calls bio->bi_end_io which may in turn
call bio_endio again. When this recursion happens, put the new bio to the queue
and process it later, from the top-level bio_endio.

Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>

Index: linux-2.6.26-rc7-devel/fs/bio.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/fs/bio.c 2008-06-23 13:49:45.000000000 +0200
+++ linux-2.6.26-rc7-devel/fs/bio.c 2008-06-24 11:17:24.000000000 +0200
@@ -1168,13 +1168,44 @@
**/
void bio_endio(struct bio *bio, int error)
{
+ static DEFINE_PER_CPU(struct bio **, bio_end_queue) = { NULL };
+ struct bio ***bio_end_queue_ptr;
+ struct bio *bio_queue;
+
+ unsigned long flags;
+
+ bio->bi_idx = error;
if (error)
clear_bit(BIO_UPTODATE, &bio->bi_flags);
else if (!test_bit(BIO_UPTODATE, &bio->bi_flags))
- error = -EIO;
+ bio->bi_idx = -EIO;
+
+ local_irq_save(flags);
+ bio_end_queue_ptr = &__get_cpu_var(bio_end_queue);
+
+ if (*bio_end_queue_ptr) {
+ **bio_end_queue_ptr = bio;
+ *bio_end_queue_ptr = &bio->bi_next;
+ bio->bi_next = NULL;
+ } else {
+ bio_queue = NULL;
+ *bio_end_queue_ptr = &bio_queue;
+
+next_bio:
+ if (bio->bi_end_io)
+ bio->bi_end_io(bio, (short)bio->bi_idx);
+
+ if (bio_queue) {
+ bio = bio_queue;
+ bio_queue = bio->bi_next;
+ if (!bio_queue)
+ *bio_end_queue_ptr = &bio_queue;
+ goto next_bio;
+ }
+ *bio_end_queue_ptr = NULL;
+ }

- if (bio->bi_end_io)
- bio->bi_end_io(bio, error);
+ local_irq_restore(flags);
}

void bio_pair_release(struct bio_pair *bp)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/