Re: IO scheduler based IO controller V10

From: Jens Axboe
Date: Fri Oct 02 2009 - 14:19:10 EST


On Fri, Oct 02 2009, Mike Galbraith wrote:
> On Fri, 2009-10-02 at 19:37 +0200, Jens Axboe wrote:
> > On Fri, Oct 02 2009, Ingo Molnar wrote:
> > >
> > > * Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> > >
> > > > On Fri, Oct 02 2009, Ingo Molnar wrote:
> > > > >
> > > > > * Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> > > > >
> > > > > > It's not _that_ easy, it depends a lot on the access patterns. A
> > > > > > good example of that is actually the idling that we already do.
> > > > > > Say you have two applications, each starting up. If you start them
> > > > > > both at the same time and just care for the dumb low latency, then
> > > > > > you'll do one IO from each of them in turn. Latency will be good,
> > > > > > > but throughput will be awful. And this means that in 20s they are
> > > > > > both started, while with the slice idling and priority disk access
> > > > > > that CFQ does, you'd hopefully have both up and running in 2s.
> > > > > >
> > > > > > So latency is good, definitely, but sometimes you have to worry
> > > > > > about the bigger picture too. Latency is more than single IOs,
> > > > > > it's often for complete operation which may involve lots of IOs.
> > > > > > Single IO latency is a benchmark thing, it's not a real life
> > > > > > issue. And that's where it becomes complex and not so black and
> > > > > > white. Mike's test is a really good example of that.
> > > > >
> > > > > To the extent of you arguing that Mike's test is artificial (i'm not
> > > > > sure you are arguing that) - Mike certainly did not do an artificial
> > > > > test - he tested 'konsole' cache-cold startup latency, such as:
> > > >
> > > > [snip]
> > > >
> > > > I was saying the exact opposite, that Mike's test is a good example of
> > > > a valid test. It's not measuring single IO latencies, it's doing a
> > > > sequence of valid events and looking at the latency for those. It's
> > > > benchmarking the bigger picture, not a microbenchmark.
> > >
> > > Good, so we are in violent agreement :-)
> >
> > Yes, perhaps that last sentence didn't provide enough evidence of which
> > category I put Mike's test into :-)
> >
> > So to kick things off, I added an 'interactive' knob to CFQ and
> > defaulted it to on, along with re-enabling slice idling for hardware
> > that does tagged command queuing. This is almost completely identical to
> > what Vivek Goyal originally posted, it's just combined into one and uses
> > the term 'interactive' instead of 'fairness'. I think the former is a
> > better umbrella under which to add further tweaks that may sacrifice
> > throughput slightly, in the quest for better latency.
> >
> > It's queued up in the for-linus branch.
>
> FWIW, I did a matrix of Vivek's patch combined with my hack. Seems we
> do lose a bit of dd throughput over stock with either or both.
>
>             run1   run2   run3   run4   run5    avg
> dd pre      65.1   65.4   67.5   64.8   65.1   65.5  fairness=1 overload_delay=1
> perf stat   1.70   1.94   1.32   1.89   1.87    1.7
> dd post     69.4   62.3   69.7   70.3   69.6   68.2
>
> dd pre      67.0   67.8   64.7   64.7   64.9   65.8  fairness=1 overload_delay=0
> perf stat   4.89   3.13   2.98   2.71   2.17    3.1
> dd post     67.2   63.3   62.6   62.8   63.1   63.8
>
> dd pre      65.0   66.0   66.9   64.6   67.0   65.9  fairness=0 overload_delay=1
> perf stat   4.66   3.81   4.23   2.98   4.23    3.9
> dd post     62.0   60.8   62.4   61.4   62.2   61.7
>
> dd pre      65.3   65.6   64.9   69.5   65.8   66.2  fairness=0 overload_delay=0
> perf stat  14.79   9.11  14.16   8.44  13.67   12.0
> dd post     64.1   66.5   64.0   66.5   64.4   65.1

I'm not too worried about the "single IO producer" scenarios, and from a
quick look, most of your numbers appear to be within expected noise
levels. It's the more complex mixes that are likely to cause a bit of a
stink, but let's worry about that later. One quick thing would be to read
e.g. 2 or more files sequentially from disk and see how that performs.
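
As a rough sketch of that kind of test (not something posted in this
thread; the helper names, chunk size, and file sizes below are made up
for illustration), something like this starts one sequential reader per
file and reports the aggregate throughput:

```python
import os
import tempfile
import threading
import time

CHUNK = 1 << 20  # bytes per read() call; an arbitrary choice for this sketch


def make_test_files(directory, count=2, size=8 * CHUNK):
    """Create `count` files of `size` random bytes each; return their paths.

    Hypothetical helper, only here so the sketch runs without real data.
    """
    paths = []
    for i in range(count):
        path = os.path.join(directory, "seqread-%d.bin" % i)
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        paths.append(path)
    return paths


def read_sequentially(path, results, index):
    """Read one file front to back and record how many bytes were read."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(CHUNK)
            if not block:
                break
            total += len(block)
    results[index] = total


def concurrent_read_throughput(paths):
    """Run one sequential reader per path; return (bytes, seconds, MB/s)."""
    results = [0] * len(paths)
    threads = [
        threading.Thread(target=read_sequentially, args=(p, results, i))
        for i, p in enumerate(paths)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    total = sum(results)
    mb_per_s = total / elapsed / 1e6 if elapsed > 0 else float("inf")
    return total, elapsed, mb_per_s


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        files = make_test_files(d)
        total, elapsed, mb_per_s = concurrent_read_throughput(files)
        print("%d bytes in %.3fs (%.1f MB/s)" % (total, elapsed, mb_per_s))
```

Note that freshly written files will mostly come out of the page cache,
so for a number that says anything about the IO scheduler you'd point
this at large existing files and drop the caches first (as root,
echo 3 > /proc/sys/vm/drop_caches).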

If you could do a cleaned up version of your overload patch based on
this:

http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=1d2235152dc745c6d94bedb550fea84cffdbf768

then let's take it from there.

--
Jens Axboe
