Re: Copying to loop device hangs up everything

From: Momchil Velikov (velco@fadata.bg)
Date: Sun Dec 16 2001 - 16:52:40 EST

Next message: antirez: "Re: Booting a modular kernel through a multiple streams file"
Previous message: Ville Herva: "Re: malloc 1GB on a 2GB ia64 box fails - 17rc1 woes w/ qla1280 an d reiserfs"
In reply to: Momchil Velikov: "Re: Copying to loop device hangs up everything"
Next in thread: Marcelo Tosatti: "Re: Copying to loop device hangs up everything"
Reply: Marcelo Tosatti: "Re: Copying to loop device hangs up everything"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

>>>>> "Momchil" == Momchil Velikov <velco@fadata.bg> writes:

>>>>> "David" == David Gomez <davidge@jazzfree.com> writes:
David> On 16 Dec 2001, Momchil Velikov wrote:

>>> [...]
>>>
David> Thanks ;), this patch solves the problem and copying a lot of data to the
David> loop device now doesn't hang the computer.
>>>
David> Is this patch going to be applied to the stable kernel ? Marcelo ?
>>>
>>> I've had exactly the same hangups with or without the patch.

David> I've tested several times after applying the loop-deadlock patch and the
David> bug seems to be fixed. No more hangups while copying a lot of data to
David> loopback devices. Post more info about your hangups, maybe is another
David> different loop device deadlock.

Momchil> Maybe it's different I don't know. Looks like I've found a fix and in
Momchil> a minute I'll test _without_ the Andrea's patch and post whatever
Momchil> comes out of it.

It turned out that Andrea's patch is needed and it needs to be
augmented slightly. The loop_thread can do the following:

loop_thread
-> do_bh_filebacked
-> lo_send
-> ...
-> kmem_cache_alloc
-> ...
-> shrink_cache
-> try_to_release_page
-> try_to_free_buffers
-> sync_page_buffers
-> __wait_on_buffer

And if the buffer must be flushed to the loopback device we deadlock.

The following patch is the Andrea's one + one additional change -- we
don't allow the loop_thread to wait in sync_page_buffers.

Regards,
-velco

diff -Nru a/drivers/block/loop.c b/drivers/block/loop.c
--- a/drivers/block/loop.c Sun Dec 16 23:50:25 2001
+++ b/drivers/block/loop.c Sun Dec 16 23:50:25 2001
@@ -578,6 +578,8 @@
         atomic_inc(&lo->lo_pending);
         spin_unlock_irq(&lo->lo_lock);

+ current->flags |= PF_NOIO;
+
         /*
          * up sem, we are running
          */
diff -Nru a/fs/buffer.c b/fs/buffer.c
--- a/fs/buffer.c Sun Dec 16 23:50:25 2001
+++ b/fs/buffer.c Sun Dec 16 23:50:25 2001
@@ -1045,7 +1045,7 @@

         /* First, check for the "real" dirty limit. */
         if (dirty > soft_dirty_limit) {
- if (dirty > hard_dirty_limit)
+ if (dirty > hard_dirty_limit && !(current->flags & PF_NOIO))
                         return 1;
                 return 0;
         }
@@ -2448,6 +2448,8 @@
                 /* Second time through we start actively writing out.. */
                 if (test_and_set_bit(BH_Lock, &bh->b_state)) {
                         if (!test_bit(BH_launder, &bh->b_state))
+ continue;
+ if (current->flags & PF_NOIO)
                                 continue;
                         wait_on_buffer(bh);
                         tryagain = 1;
diff -Nru a/include/linux/sched.h b/include/linux/sched.h
--- a/include/linux/sched.h Sun Dec 16 23:50:25 2001
+++ b/include/linux/sched.h Sun Dec 16 23:50:25 2001
@@ -426,6 +426,7 @@
#define PF_MEMALLOC 0x00000800 /* Allocating memory */
#define PF_MEMDIE 0x00001000 /* Killed for out-of-memory */
#define PF_FREE_PAGES 0x00002000 /* per process page freeing */
+#define PF_NOIO 0x00004000 /* avoid generating further I/O */

#define PF_USEDFPU 0x00100000 /* task used FPU this quantum (SMP) */

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: antirez: "Re: Booting a modular kernel through a multiple streams file"
Previous message: Ville Herva: "Re: malloc 1GB on a 2GB ia64 box fails - 17rc1 woes w/ qla1280 an d reiserfs"
In reply to: Momchil Velikov: "Re: Copying to loop device hangs up everything"
Next in thread: Marcelo Tosatti: "Re: Copying to loop device hangs up everything"
Reply: Marcelo Tosatti: "Re: Copying to loop device hangs up everything"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2b29 : Sun Dec 23 2001 - 21:00:11 EST