Re: msync() behaviour broken for MS_ASYNC, revert patch?

From: Andrew Morton
Date: Fri Feb 10 2006 - 15:10:22 EST


Linus Torvalds <torvalds@xxxxxxxx> wrote:
>
>
>
> On Fri, 10 Feb 2006, Oliver Neukum wrote:
> >
> > On Friday, 10 February 2006 20:05, Linus Torvalds wrote:
> > > So we may have different expectations, because we've seen different
> > > patterns. Me, I've seen the "events are huge, and you stagger them", so
> > > that the previous event has time to flow out to disk while you generate
> > > the next one. There, MS_ASYNC starting IO is _wrong_, because the scale of
> > > the event is just huge, so trying to push it through the IO subsystem asap
> > > just makes everything suck.
> >
> > Isn't the benefit of starting writing immediately greater the smaller
> > the area in question? If so, couldn't a heuristic be found to decide whether
> > to initiate IO at once?
>
> Quite possibly. I suspect you could/should take other issues into account
> too (like whether the queue to the device is busy or bdflush is already
> working).
>

Yes, it would make sense to run balance_dirty_pages_ratelimited() inside
msync_pte_range(). So pdflush will get poked if we hit the
dirty_background_ratio threshold, or we go into caller-initiated writeout
if we hit dirty_ratio.
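
As a rough illustration of that two-threshold decision, here is a userspace
sketch. It is not the kernel code: classify_dirty(), the enum and the
parameter names are hypothetical, and the ratios are simply treated as
percentages of a page count passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcomes; the names are illustrative, not kernel symbols. */
enum dirty_action {
    DIRTY_NONE,             /* below both thresholds: nothing to do */
    DIRTY_WAKE_BACKGROUND,  /* past the background threshold: poke the flusher */
    DIRTY_CALLER_WRITEOUT   /* past the hard threshold: caller does writeout */
};

/*
 * Decide what to do given the current number of dirty pages.
 * Both ratios are percentages of total_pages (an assumption of this sketch).
 */
enum dirty_action classify_dirty(size_t dirty_pages, size_t total_pages,
                                 unsigned background_ratio,
                                 unsigned dirty_ratio)
{
    assert(total_pages > 0);
    assert(background_ratio <= dirty_ratio && dirty_ratio <= 100);

    /* Multiply before dividing so small page counts do not round to zero. */
    size_t background_limit = total_pages * background_ratio / 100;
    size_t dirty_limit = total_pages * dirty_ratio / 100;

    if (dirty_pages > dirty_limit)
        return DIRTY_CALLER_WRITEOUT;
    if (dirty_pages > background_limit)
        return DIRTY_WAKE_BACKGROUND;
    return DIRTY_NONE;
}
```

With 1000 pages and ratios of 10 and 40, 150 dirty pages would only wake the
background flusher, while 500 would push the caller into writeout.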

But it's not completely trivial, because I don't think we want to be doing
blocking writeback with mmap_sem held.

The code under balance_dirty_pages() does pay attention to queue congestion
states, already-under-writeback pages and such things, but it could be
better, I guess. Starting some writeback earlier if the queue is deemed to
be idle could work.
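
One way such an "idle queue" heuristic might look, combined with the
range-size idea raised earlier in the thread, is sketched below. Everything
here is an assumption for illustration: struct queue_state,
should_start_writeback_early() and small_range_limit are made-up names, and
the real congestion and writeback state lives in kernel structures this
sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of a device queue; not a kernel structure. */
struct queue_state {
    bool congested;                /* queue already has plenty to do */
    size_t pages_under_writeback;  /* pages currently being written */
};

/*
 * Return true if it looks worthwhile to start IO right away:
 * the queue is idle and the dirtied range is small. Large ranges
 * are left for later writeback so they do not flood the queue.
 */
bool should_start_writeback_early(const struct queue_state *q,
                                  size_t range_pages,
                                  size_t small_range_limit)
{
    assert(q != NULL);

    if (q->congested)
        return false;              /* device is busy: do not add to it */
    if (q->pages_under_writeback > 0)
        return false;              /* writeback is already in progress */

    return range_pages <= small_range_limit;
}
```

The cut-off between "small" and "huge" is deliberately left to the caller,
since the thread does not settle on a value.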

(Hi, Stephen)