Re: [PATCH][2.6-mm] Readahead issues and AIO read speedup

From: Badari Pulavarty (pbadari@us.ibm.com)
Date: Thu Aug 07 2003 - 12:21:39 EST


On Thursday 07 August 2003 09:28 am, Andrew Morton wrote:
> Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > I noticed the exact same thing while testing a database benchmark
> > on filesystems (without AIO). I added instrumentation in the scsi
> > layer to record the IO pattern and found that we are doing lots
> > (4 million) of 4K reads in my benchmark run. I traced that and
> > found that all those reads are generated by the slow read path,
> > since the readahead window is maximally shrunk. When I forced the
> > readahead code to read 16k (my database pagesize) in case the ra
> > window is closed, I saw a 20% improvement in my benchmark. I asked
> > "Ramchandra Pai" (linuxram@us.ibm.com) to investigate it further.
>
> But if all the file's pages are already in pagecache (a common case)
> this patched kernel will consume extra CPU pointlessly poking away at
> pagecache. Reliably shrinking the window to zero is important for this
> reason.

Yes!! I hardcoded it to 16k since I know that all my reads will be at
least 16k. Doing readahead of exactly the pages required by the current
read would be the correct solution (as Suparna suggested); a sketch of
the idea follows.
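
Roughly something like this (an illustrative sketch only, not code from
any tree; ra_window and req_pages are made-up names for the readahead
window and the pages covered by the current read):

/*
 * Illustrative sketch only -- not code from any kernel tree.  Policy:
 * when the readahead window has shrunk to zero, still cover exactly
 * the pages the current read() needs in one batch, rather than falling
 * back to one page at a time.  Both parameter names are hypothetical.
 */
static unsigned long ra_pages_to_read(unsigned long ra_window,
                                      unsigned long req_pages)
{
        if (ra_window == 0)
                return req_pages;  /* just what the current read needs */
        return ra_window;          /* normal readahead behaviour */
}

Since the request's own pages get looked up in pagecache anyway, this
should not add the pointless pagecache poking you mention for the
fully-cached case.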

>
> If the database pagesize is 16k then the application should be submitting
> 16k reads, yes?

Yes. The database always does IO in chunks of at least 16k (in my case).

> If so then these should not be creating 4k requests at the
> device layer! So what we need to do is to ensure that at least those 16k
> worth of pages are submitted in a single chunk. Without blowing CPU if
> everything is cached. Tricky. I'll take a look at what's going on.

When the readahead window is closed, the slow read code will be
submitting IO in 4k chunks. In fact, it waits for each IO to finish
before reading the next page, doesn't it? How would you ensure that at
least 16k worth of pages are submitted in a single chunk here?
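
Schematically, what I see the slow path doing today (not actual
mm/filemap.c code; read_one_page() and wait_for_page() are invented
stand-ins for the submit and wait steps):

/*
 * Schematic of the slow read path described above -- not actual
 * mm/filemap.c code.  Each pass submits a single 4k page and sleeps
 * until it completes, so one 16k database read becomes four
 * serialized 4k requests at the device.  The helpers are hypothetical.
 */
unsigned long i;
struct page *page;

for (i = 0; i < req_pages; i++) {
        page = read_one_page(mapping, index + i); /* submit one 4k IO */
        wait_for_page(page);  /* block until it completes, then loop */
}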

I am hoping that forcing the readahead code to read the pages needed by
the current read would address this problem.

> Another relevant constraint here (and there are lots of subtle constraints
> in readahead) is that often database files are fragmented all over the
> disk, because they were laid out that way (depends on the database and
> how it was set up). In this case, any extra readahead is a disaster
> because it incurs extra seeks, needlessly.

Agreed. In my case, I made sure that all the files are almost contiguous
(I put one file per filesystem and verified through debugfs); see the
example below.
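
For reference, the check was along these lines (device and file name
are examples only; debugfs's "stat" prints the inode's block list, so
contiguity can be eyeballed):

# example only: dump the block list for a data file
debugfs -R "stat datafile" /dev/sdb1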

Thanks,
Badari
