Re: [PATCH 10/10] mm: per device dirty threshold

From: Miklos Szeredi
Date: Sat Apr 21 2007 - 06:40:36 EST


> On Fri, 20 Apr 2007 17:52:04 +0200 Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
>
> > Scale writeback cache per backing device, proportional to its writeout speed.
> >
> > By decoupling the BDI dirty thresholds a number of problems we currently have
> > will go away, namely:
> >
> > - mutual interference starvation (for any number of BDIs);
> > - deadlocks with stacked BDIs (loop, FUSE and local NFS mounts).
> >
> > It might be that all dirty pages are for a single BDI while other BDIs are
> > idling. By giving each BDI a 'fair' share of the dirty limit, each one can have
> > dirty pages outstanding and make progress.
> >
> > A global threshold also creates a deadlock for stacked BDIs; when A writes to
> > B, and A generates enough dirty pages to get throttled, B will never start
> > writeback until the dirty pages go away. Again, by giving each BDI its own
> > 'independent' dirty limit, this problem is avoided.
> >
> > So the problem is to determine how to distribute the total dirty limit across
> > the BDIs fairly and efficiently. A BDI that has a large dirty limit but does
> > not have any dirty pages outstanding is a waste.
> >
> > What is done is to keep a floating proportion between the BDIs based on
> > writeback completions. This way faster/more active devices get a larger share
> > than slower/idle devices.
>
> This is a pretty major improvement to various nasty corner-cases, if it
> works.
>
> Does it work? Please describe the testing you did, and the results.
>
> Has this been confirmed to fix Miklos's FUSE and loopback problems?

I haven't tested it yet (will do), but I'm sure it does solve the
deadlock in balance_dirty_pages(), if for no other reason than that
when the queue is idle (no dirty or writeback pages), it allows the
caller to dirty some more pages.
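
To make that reasoning concrete, here is a rough userspace sketch. It is
not code from the patch: struct bdi_sketch, bdi_limit() and may_dirty()
are hypothetical names, and the proportional split is only my reading of
the description quoted above. It assumes each device gets a share of the
global limit proportional to its recent writeback completions, and that
a device with nothing outstanding is never throttled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device bookkeeping; not the structures used by the patch. */
struct bdi_sketch {
	uint64_t completions;	/* recent writeback completions on this device */
	uint64_t dirty;		/* pages currently dirty against this device */
	uint64_t writeback;	/* pages currently under writeback */
};

/*
 * Assumed policy: a device's share of the global dirty limit is
 * proportional to its share of all recent writeback completions.
 * The multiplication can overflow for very large inputs; a real
 * implementation would have to guard against that.
 */
static uint64_t bdi_limit(const struct bdi_sketch *bdi,
			  uint64_t total_completions, uint64_t global_limit)
{
	assert(bdi->completions <= total_completions);
	if (total_completions == 0)
		return 0;
	return global_limit * bdi->completions / total_completions;
}

/* Returns true if the caller may dirty more pages against this device. */
static bool may_dirty(const struct bdi_sketch *bdi,
		      uint64_t total_completions, uint64_t global_limit)
{
	uint64_t outstanding = bdi->dirty + bdi->writeback;

	/* Idle queue: always let the caller make progress. */
	if (outstanding == 0)
		return true;
	return outstanding < bdi_limit(bdi, total_completions, global_limit);
}
```

With this shape, a stacked device that has completed no writeback yet
still gets to dirty pages as long as its own queue is empty, which is
the property that matters for the balance_dirty_pages() case.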

The other deadlock, in throttle_vm_writeout(), is still to be solved.

Miklos