Re: RFC - how to balance Dirty+Writeback in the face of slow writeback.

From: Neil Brown
Date: Thu Aug 17 2006 - 00:01:05 EST

On Tuesday August 15, akpm@xxxxxxxx wrote:
> > When Dirty hits 0 (and Writeback is theoretically 80% of RAM)
> > balance_dirty_pages will no longer be able to flush the full
> > 'write_chunk' (1.5 times number of recent dirtied pages) and so will
> > spin in a loop calling blk_congestion_wait(WRITE, HZ/10), so it isn't
> > a busy loop, but it won't progress.
> This assumes that the queues are unbounded. They're not - they're limited
> to 128 requests, which is 60MB or so.

Ahhh... so the limit on the requests-per-queue is an important part of
write-throttling behaviour. I didn't know that, thanks.
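
As a rough illustration of why the request limit matters (a back-of-the-envelope
model, not kernel code): the "128 requests, which is 60MB or so" figure implies
about 480KB per request, and a bounded queue caps how much of the unwritten data
can be in Writeback at once, so the rest has to stay Dirty. The function names
below are made up for the sketch, and "MB" is assumed to mean 2^20 bytes.

```python
# Toy model only: names and the MB = 2**20 convention are assumptions,
# not anything taken from the kernel source.
MIB = 1024 * 1024


def bytes_per_request(queue_bytes, nr_requests):
    """Average request size implied by a queue of nr_requests holding queue_bytes."""
    return queue_bytes // nr_requests


def split_dirty_writeback(unwritten_bytes, queue_cap_bytes):
    """Return (dirty, writeback) when at most queue_cap_bytes can be in flight.

    Anything that does not fit in the bounded queue cannot be submitted yet,
    so it remains Dirty rather than moving to Writeback.
    """
    writeback = min(unwritten_bytes, queue_cap_bytes)
    dirty = unwritten_bytes - writeback
    return dirty, writeback


if __name__ == "__main__":
    per_req = bytes_per_request(60 * MIB, 128)
    print("bytes per request:", per_req)

    dirty, writeback = split_dirty_writeback(1000 * MIB, 60 * MIB)
    print("dirty MB:", dirty // MIB, "writeback MB:", writeback // MIB)
```

With a single bounded queue in this model, Dirty only reaches 0 once the total
unwritten data fits within the queue limit, which is why the "Dirty hits 0 with
Writeback at 80% of RAM" case needs an unbounded (or very large) queue.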

fs/nfs doesn't seem to impose a limit. It will just allocate as many
as you ask for until you start running out of memory. I've seen 60%
of memory (10 out of 16Gig) in writeback for NFS.

Maybe I should look there to address my current issue, though imposing
a system-wide writeback limit seems safer.

> Per queue. The scenario you identify can happen if it's spread across
> multiple disks simultaneously.
> CFQ used to have 1024 requests and we did have problems with excessive
> numbers of writeback pages. I fixed that in 2.6.early, but that seems to
> have got lost as well.

What would you say constitutes "excessive"? Is there any sense in
which some absolute number is excessive (as it takes too long to scan
some list) or is it just a percent-of-memory thing?

> Something like that - it'll be relatively simple.

Unfortunately I think it is also relatively simple to get it badly
wrong :-) Make one workload faster, and another slower.

But thanks, you've been very helpful (as usual). I'll ponder it a bit
longer and see what turns up.

