Re: 2.6.12-rc6-mm1 & 2K lun testing

From: Andrew Morton
Date: Thu Jun 16 2005 - 15:39:05 EST


Badari Pulavarty <pbadari@xxxxxxxxxx> wrote:
>
> >
> > We seem to be always ooming when allocating scsi command structures.
> > Perhaps the block-level request structures are being allocated with
> > __GFP_WAIT, but it's a bit odd. Which I/O scheduler? If cfq, does
> > reducing /sys/block/*/queue/nr_requests help?
>
> Yes. I am using CFQ scheduler. I changed nr_requests to 4 for all
> my devices. I also changed "min_free_kbytes" to 64M.

Yeah, that monster cfq queue depth continues to hurt in corner cases.

> Response time is still bad. Here is the vmstat, meminfo, slabinfo
> and profle output. I am not sure why profile output shows
> default_idle(), when vmstat shows 100% CPU sys.

(please inline text rather then using attachments)

> MemTotal: 7209056 kB
> ...
> Dirty: 5896240 kB

That's not going to help - we're way over 40% there, so the VM is getting
into some trouble.

Try reducing the dirty limits in /proc/sys/vm by a lot to confirm that it
helps.

There are various bits of slop and hysteresis and deliberate overshoot in
page-writeback.c which are there to enhance IO batching and to reduce CPU
consumption. A few megs here and there adds up when you multiply it by
2000...

Try this:

diff -puN mm/page-writeback.c~a mm/page-writeback.c
--- 25/mm/page-writeback.c~a Thu Jun 16 13:36:29 2005
+++ 25-akpm/mm/page-writeback.c Thu Jun 16 13:36:54 2005
@@ -501,6 +501,8 @@ void laptop_sync_completion(void)

static void set_ratelimit(void)
{
+ ratelimit_pages = 32;
+ return;
ratelimit_pages = total_pages / (num_online_cpus() * 32);
if (ratelimit_pages < 16)
ratelimit_pages = 16;
_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/