Re: [PATCH/RFC] Simplified Readahead

From: Ram Pai
Date: Tue Oct 05 2004 - 12:59:12 EST


On Fri, 2004-10-01 at 14:02, Steven Pratt wrote:
> Ram Pai wrote:
>
> snip...
>
> >>>>>To summarize you noticed 3 problems:
> >>>>>
> >>>>>1. page cache hits not handled properly.
> >>>>>2. readahead thrashing not accounted.
> >>>>>3. read congestion not accounted.
> >>>>>
> >>>>>
> >
> >
> >I have enclosed 5 patches that address each of the issues.
> >
> >1. Code is obtuse and hard to maintain.
> >
> > The best I could do is update the comments to reflect the
> > current code. Hopefully that should help.
> >
> > attached patch 1_comment.patch takes care of that part to
> > some extent.
> >
> >
> >2. page cache hits not handled properly.
> >
> > I fixed this by decrementing the size of the next readahead window
> > by the number of pages hit in the page cache. Now it slowly
> > accommodates the page cache hits.
> >
> > attached patch 2_cachehits.patch takes care of this issue.
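> >
> > (A minimal illustrative sketch of the idea, not the patch itself;
> > struct ra_window and ra_account_cache_hits() below are simplified
> > stand-ins for the kernel's struct file_ra_state and the real
> > accounting code:)
> >
> >     struct ra_window {
> >             unsigned long size;        /* pages in the current window    */
> >             unsigned long next_size;   /* pages planned for the next one */
> >             unsigned long ahead_start; /* first page of the ahead window */
> >             unsigned long ahead_size;  /* pages in the ahead window      */
> >     };
> >
> >     /* shrink the next window by however many pages were already
> >      * found in the page cache, so repeated cache hits gradually
> >      * disable readahead instead of being ignored */
> >     static void ra_account_cache_hits(struct ra_window *ra,
> >                                       unsigned long cache_hits)
> >     {
> >             if (cache_hits >= ra->next_size)
> >                     ra->next_size = 0;   /* all hits: shut it down */
> >             else
> >                     ra->next_size -= cache_hits;
> >     }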
> >
> >3. queue congestion not handled.
> >
> > The fix: call force_page_cache_readahead() if we are
> > populating pages in the current window,
> > and call do_page_cache_readahead() if we are populating
> > pages in the ahead window. However, if do_page_cache_readahead()
> > returns with congestion, the ahead window is collapsed back
> > to size zero. This ensures that we attempt to populate the
> > ahead window again on the next read.
> >
> > attached patch 3_queuecongestion.patch handles this issue.
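> >
> > (Again an illustrative sketch, reusing struct ra_window from the
> > sketch under item 2. force_page_cache_readahead() and
> > do_page_cache_readahead() are the real mm/readahead.c helpers;
> > try_ra(), forced_ra(), ra_populate() and the return convention
> > here are invented for the sketch:)
> >
> >     /* try_ra() returns nonzero when the request queue is
> >      * congested; forced_ra() always submits the reads */
> >     int try_ra(unsigned long start, unsigned long nr_pages);
> >     void forced_ra(unsigned long start, unsigned long nr_pages);
> >
> >     static void ra_populate(struct ra_window *ra, int in_current,
> >                             unsigned long start, unsigned long nr_pages)
> >     {
> >             if (in_current) {
> >                     /* the caller is waiting on these pages, so
> >                      * read them regardless of congestion */
> >                     forced_ra(start, nr_pages);
> >             } else if (try_ra(start, nr_pages)) {
> >                     /* congested: collapse the ahead window so the
> >                      * next read attempts to populate it again */
> >                     ra->ahead_start = ra->ahead_size = 0;
> >             }
> >     }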
> >
> >4. page thrash handled ineffectively.
> >
> > The fix: shut down readahead when page thrashing is detected.
> >
> > attached patch 4_pagethrash.patch handles this issue.
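> >
> > (Same caveats; thrash here means a page we read ahead was
> > reclaimed before the reader reached it, so readahead is only
> > adding to the memory pressure:)
> >
> >     static void ra_on_thrash(struct ra_window *ra)
> >     {
> >             /* drop both windows, shutting readahead down */
> >             ra->size = ra->next_size = 0;
> >             ra->ahead_start = ra->ahead_size = 0;
> >     }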
> >
> >5. slow read path is too slow.
> >
> > I could not figure out a way to read at least the requested
> > number of pages when readahead is shut down without adding a
> > readsize parameter to page_cache_readahead(). So I had to
> > borrow some of your code in filemap.c to do that. Thanks!
> >
> > attached patch 5_fixedslowread.patch handles this issue.
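> >
> > (Sketch of the interface change only; page_cache_readahead() and
> > struct file_ra_state are the real 2.6 names, but the exact
> > prototype and the readsize parameter shown here are my
> > illustration:)
> >
> >     /* with a readsize argument the function can still submit at
> >      * least the pages the caller asked for, even when the
> >      * readahead window has been shut down */
> >     void page_cache_readahead(struct address_space *mapping,
> >                               struct file_ra_state *ra,
> >                               struct file *filp,
> >                               unsigned long offset,
> >                               unsigned long readsize);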
> >
> >
> >Apart from this, you have noticed other issues:
> >
> >6. cache lookup done unnecessarily twice for page cache hits.
> >
> > I have not handled this issue yet, but it should be doable
> > by introducing a flag that notes when readahead has been
> > shut down by page cache hits, so that the page lookup is
> > attempted only once.
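> >
> > (Rough sketch of the flag idea; the field name is invented:)
> >
> >     struct ra_window {
> >             unsigned long size, next_size;
> >             unsigned long ahead_start, ahead_size;
> >             unsigned int hits_shutdown:1; /* set when page cache
> >                                            * hits shut readahead
> >                                            * down; the read path
> >                                            * then skips the second
> >                                            * page-cache lookup */
> >     };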
> >
> >
> >And you have other features in your patch which will be the real
> >differentiating factors.
> >
> >7. exponential expand and shrink of window sizes.
> >
> >8. overlapped read of current window and ahead window.
> >
> > (I think both are desirable features.)
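> >
> > (A sketch of how I read feature 7; the doubling/halving policy
> > and the clamp are my illustration, not necessarily your patch's
> > exact factors:)
> >
> >     /* grow the window exponentially while the access pattern
> >      * stays sequential, shrink it on a seek, clamp to the max */
> >     static unsigned long ra_resize(unsigned long size, int sequential,
> >                                    unsigned long max)
> >     {
> >             if (sequential)
> >                     size = size ? size * 2 : 1;  /* expand */
> >             else
> >                     size /= 2;                   /* shrink */
> >             return size < max ? size : max;
> >     }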
> >
> >I did run some preliminary tests using your patch and the above
> >patches, and found that your patch was doing slightly better on
> >iozone and sysbench, while the above patches were doing slightly
> >better with the DSS workload.
> >
> >
>
> Ok, I have re-run the Tiobench tests. On a single-CPU IDE-based system
> your new patches have no noticeable effect on sequential read performance
> (a good thing); but on random I/O things went bad :-(.
>
> Here are the random read results for 16k I/O with a 4GB fileset on
> 256MB of memory, single-CPU IDE:
>
> Stock w/ patches
>
> Threads MBs/sec MBs/sec %diff diff
> ---------- ------------ ------------ -------- ------------
> 1 1.73 1.72 -0.58 -0.01
> 4 1.70 1.56 -8.24 -0.14
> 16 1.66 0.81 -51.20 -0.85
> 64 1.49 0.68 -54.36 -0.81
>
> As you can see, somewhere after 4 threads the new patches cause performance to tank.
>
> With 512k I/Os the problem kicks in with fewer than 4 threads.
>
> Stock w/ patches
> Threads MBs/sec MBs/sec %diff diff
> ---------- ------------ ------------ -------- ------------
> 1 18.50 18.55 0.27 0.05
> 4 8.55 6.59 -22.92 -1.96
> 16 8.40 5.18 -38.33 -3.22
> 64 7.34 4.76 -35.15 -2.58
>
>
> Unfortunately, this is the _good_ news. The bad news is that things are much worse on SCSI.
> We lose a few percent on sequential reads for all block sizes, and random I/O is just totally screwed.
>
> Here is the same 16k I/O request size with a 4GB fileset on 1GB of memory, on an 8-way system with a single SCSI disk:
>
> Stock w/ patches
> Threads MBs/sec MBs/sec %diff diff
> ---------- ------------ ------------ -------- ------------
> 1 3.43 3.03 -11.66 -0.40
> 4 4.51 1.06 -76.50 -3.45
> 16 5.86 1.43 -75.60 -4.43
> 64 6.13 1.66 -72.92 -4.47
>
> An 11% degradation even with 1 thread, and roughly 75% degradation for 4 threads and above! This is horribly broken.
>
>
Sorry for the late response. Was out yesterday.

Yes, something is horribly broken. I will look into what it is.

RP

