Re: [Announce] BKL shifting into drivers and filesystems - beware

From: Andrea Arcangeli (andrea@suse.de)
Date: Fri Jul 14 2000 - 09:51:24 EST


On Fri, 14 Jul 2000, Andrea Arcangeli wrote:

>It should be fixed in VM-global-patch-2. Main issue was kpiod.

I just noticed that VM-global-patch-2 could do I/O even if __GFP_IO wasn't
set in the gfp_mask (this isn't a big deal because all the deadlocks are now
handled via the current->fs_locks information). However, I agreed to keep
the __GFP_IO semantics separate from the anti-deadlock logic.

GFP_BUFFER has __GFP_WAIT set, so it _can_ still reschedule (but it is
guaranteed not to do very slow I/O).
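
To make the flag relationship concrete, here is a minimal userspace sketch.
The SK_* names and the bit values are assumptions made for illustration, not
copied from the kernel headers; check include/linux/mm.h in your tree for the
real definitions. The only point is that a GFP_BUFFER-style mask carries the
"may wait" bit but not the "may do I/O" bit.

```c
#include <stdbool.h>

/* Illustrative bit values (assumptions, not taken from the kernel tree). */
#define SK_GFP_WAIT 0x01u /* caller may sleep / reschedule */
#define SK_GFP_LOW  0x02u /* low allocation priority */
#define SK_GFP_IO   0x10u /* caller allows starting and waiting on I/O */

/* GFP_BUFFER-style mask: may wait, but must not trigger slow I/O. */
#define SK_GFP_BUFFER  (SK_GFP_LOW | SK_GFP_WAIT)
/* Hypothetical mask for a caller that may also do I/O. */
#define SK_GFP_WITH_IO (SK_GFP_LOW | SK_GFP_WAIT | SK_GFP_IO)

/* True if the allocation context is allowed to reschedule. */
static bool sk_may_reschedule(unsigned int gfp_mask)
{
	return (gfp_mask & SK_GFP_WAIT) != 0;
}

/* True if the allocation context is allowed to start or wait on I/O. */
static bool sk_may_do_io(unsigned int gfp_mask)
{
	return (gfp_mask & SK_GFP_IO) != 0;
}
```

With these definitions, sk_may_reschedule(SK_GFP_BUFFER) is true while
sk_may_do_io(SK_GFP_BUFFER) is false.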

In 2.2.17pre11aa1 I also put a conditional reschedule after the
try_to_free_pages logic _if_ __GFP_WAIT is set (so also in the GFP_BUFFER
case). We can do that without risking running out of memory thanks to the
new per-process freelist. It's an extremely slow path, so a conditional
schedule there can't hurt performance. The reason I put it there is to try
to avoid this scenario: the task could reschedule in entry.S before
returning to userspace, before being able to make any progress in userspace,
and the page could be dropped again before the task gets to run again. If we
finish the page-in with current->need_resched not set, it's more
likely the task will have a chance to make some userspace progress during
thrashing. I also added the conditional reschedule in the copy-user stuff to
fix the hang with 2 GB of RAM and 2 GB of cache. Nothing else is related to
low latency, and all the reschedule patches, which should apply cleanly on
top of VM-global-patch-3, are here:

        ftp://ftp.*.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.2/2.2.17pre11aa2/7*

The 70_cond-sched-1 and 71_copy-user-reschedule-1 patches are extracted from
Ingo Molnar's lowlatency patch.

They're probably a good idea for 2.2.x too.
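
The conditional reschedule described above can be sketched in userspace
roughly as follows. This is not the code from the patches: struct sk_task,
sk_schedule() and SK_GFP_WAIT are hypothetical stand-ins for current,
schedule() and __GFP_WAIT, and the bit value is an assumption. The idea is
only that the slow path gives up the CPU voluntarily when a reschedule is
already pending and the caller is allowed to wait.

```c
#include <stdbool.h>

#define SK_GFP_WAIT 0x01u /* illustrative value: caller may sleep */

/* Hypothetical stand-in for the fields of current that matter here. */
struct sk_task {
	bool need_resched;            /* a reschedule has been requested */
	unsigned int times_scheduled; /* bookkeeping for this sketch only */
};

/* Stand-in for schedule(): running it clears the pending request. */
static void sk_schedule(struct sk_task *t)
{
	t->need_resched = false;
	t->times_scheduled++;
}

/*
 * Call at the end of the slow reclaim path. If the caller may wait and a
 * reschedule is pending, take it now, so the task goes back to userspace
 * with need_resched clear.
 */
static void sk_cond_resched_after_reclaim(struct sk_task *t,
					  unsigned int gfp_mask)
{
	if (!(gfp_mask & SK_GFP_WAIT))
		return; /* atomic-style context: must not sleep */
	if (t->need_resched)
		sk_schedule(t);
}
```

A caller without the wait bit never reschedules here, and a caller with it
reschedules at most once per pending request.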

The fix for the __GFP_IO thing was this:

diff -urN 2.2.17pre11aa1/fs/buffer.c 2.2.17pre11aa2/fs/buffer.c
--- 2.2.17pre11aa1/fs/buffer.c Fri Jul 14 03:02:39 2000
+++ 2.2.17pre11aa2/fs/buffer.c Fri Jul 14 16:27:31 2000
@@ -1761,7 +1761,7 @@
 #define BUFFER_BUSY_BITS ((1<<BH_Dirty) | (1<<BH_Lock) | (1<<BH_Protected))
 #define buffer_busy(bh) ((bh)->b_count || ((bh)->b_state & BUFFER_BUSY_BITS))
 
-static int sync_page_buffers(struct buffer_head *bh)
+static int sync_page_buffers(struct buffer_head *bh, int gfp_mask)
 {
         struct buffer_head * tmp = bh;
 
@@ -1769,7 +1769,7 @@
                 struct buffer_head *p = tmp;
                 tmp = tmp->b_this_page;
                 if (buffer_dirty(p) || buffer_locked(p)) {
-                        if (test_and_set_bit(BH_Wait_IO, &p->b_state)) {
+                        if (test_and_set_bit(BH_Wait_IO, &p->b_state) && (gfp_mask & __GFP_IO)) {
                                 if (buffer_dirty(p))
                                         ll_rw_block(WRITE, 1, &p);
                                 wait_on_buffer(p);
@@ -1797,7 +1797,7 @@
  * Wake up bdflush() if this fails - if we're running low on memory due
  * to dirty buffers, we need to flush them out as quickly as possible.
  */
-int try_to_free_buffers(struct page * page_map)
+int try_to_free_buffers(struct page * page_map, int gfp_mask)
 {
         struct buffer_head * tmp, * bh = page_map->buffers;
 
@@ -1828,7 +1828,7 @@
         return 1;
 
  busy:
-        if (!sync_page_buffers(bh))
+        if (!sync_page_buffers(bh, gfp_mask))
                 /*
                  * We can jump after the busy check because
                  * we rely on the kernel lock.
diff -urN 2.2.17pre11aa1/include/linux/fs.h 2.2.17pre11aa2/include/linux/fs.h
--- 2.2.17pre11aa1/include/linux/fs.h Fri Jul 14 03:02:39 2000
+++ 2.2.17pre11aa2/include/linux/fs.h Fri Jul 14 16:27:31 2000
@@ -777,7 +777,7 @@
 
 extern void refile_buffer(struct buffer_head * buf);
 extern void set_writetime(struct buffer_head * buf, int flag);
-extern int try_to_free_buffers(struct page *);
+extern int try_to_free_buffers(struct page *, int);
 
 extern int nr_buffers;
 extern long buffermem;
diff -urN 2.2.17pre11aa1/mm/filemap.c 2.2.17pre11aa2/mm/filemap.c
--- 2.2.17pre11aa1/mm/filemap.c Fri Jul 14 03:02:39 2000
+++ 2.2.17pre11aa2/mm/filemap.c Fri Jul 14 16:27:31 2000
@@ -197,7 +197,7 @@
                          * throttling.
                          */
 
-                        if (!try_to_free_buffers(page))
+                        if (!try_to_free_buffers(page, gfp_mask))
                                 goto refresh_clock;
                         return 1;
                 }
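
The effect of the changed condition can be modelled in userspace like this.
It is a simplified sketch, not the kernel function: struct sk_buffer,
sk_test_and_set() and the SK_GFP_IO value are hypothetical, the loop shape is
assumed, the "I/O" is simulated by clearing flags, and the return value (the
number of buffers synced) is an invention of this sketch. As in the patched
condition, the wait-I/O mark is set on every pass, but I/O happens only when
the mark was already set and the caller's mask allows I/O.

```c
#include <stdbool.h>
#include <stddef.h>

#define SK_GFP_IO 0x10u /* illustrative value: caller allows I/O */

/* Hypothetical, stripped-down stand-in for struct buffer_head. */
struct sk_buffer {
	bool dirty;
	bool locked;
	bool wait_io;                /* stands in for the BH_Wait_IO bit */
	struct sk_buffer *this_page; /* circular list of the page's buffers */
};

/* Non-atomic stand-in for test_and_set_bit(): set, return old value. */
static bool sk_test_and_set(bool *flag)
{
	bool old = *flag;

	*flag = true;
	return old;
}

/* Walk the page's buffers; return how many were "written and waited on". */
static int sk_sync_page_buffers(struct sk_buffer *bh, unsigned int gfp_mask)
{
	struct sk_buffer *tmp = bh;
	int synced = 0;

	do {
		struct sk_buffer *p = tmp;

		tmp = tmp->this_page;
		if (p->dirty || p->locked) {
			/* I/O only if already marked AND the mask allows it. */
			if (sk_test_and_set(&p->wait_io) &&
			    (gfp_mask & SK_GFP_IO)) {
				p->dirty = false;  /* pretend the write finished */
				p->locked = false; /* pretend the wait finished */
				synced++;
			}
		}
	} while (tmp != bh);

	return synced;
}
```

In this model a dirty buffer is only marked on the first pass, is left alone
on any pass whose mask lacks SK_GFP_IO, and is synced on a later pass that
has it.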

Andrea
