Re: [RFC]cfq-iosched: quantum check tweak

From: Vivek Goyal
Date: Thu Jan 14 2010 - 06:31:32 EST


On Thu, Jan 14, 2010 at 12:16:24PM +0800, Shaohua Li wrote:
> On Wed, Jan 13, 2010 at 07:18:07PM +0800, Vivek Goyal wrote:
> > On Wed, Jan 13, 2010 at 04:17:35PM +0800, Shaohua Li wrote:
> > [..]
> > > > > static bool cfq_may_dispatch(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> > > > > {
> > > > > unsigned int max_dispatch;
> > > > > @@ -2258,7 +2273,10 @@ static bool cfq_may_dispatch(struct cfq_
> > > > > if (cfqd->sync_flight && !cfq_cfqq_sync(cfqq))
> > > > > return false;
> > > > >
> > > > > - max_dispatch = cfqd->cfq_quantum;
> > > > > + max_dispatch = cfqd->cfq_quantum / 2;
> > > > > + if (max_dispatch < CFQ_SOFT_QUANTUM)
> > > >
> > > > We don't have to hardcode CFQ_SOFT_QUANTUM or in fact we don't need it. We can
> > > > derive the soft limit from hard limit (cfq_quantum). Say soft limit will be
> > > > 50% of cfq_quantum value.
> > > I'm hoping this doesn't give user a surprise. Say cfq_quantum sets to 7, then we
> > > start doing throttling from 3 requests. Adding the CFQ_SOFT_QUANTUM gives a compatibility
> > > against old behavior at least. Am I over thinking?
> > >
> >
> > I would not worry too much about that. If you are really worried about
> > that, then create one Documentation/block/cfq-iosched.txt and document
> > how cfq_quantum works so that users know that cfq_quantum is upper hard
> > limit and internal soft limit is cfq_quantum/2.
> Good idea. It looks like we don't document the cfq tunables; I'll try to do it later.
>
> Currently a queue can only dispatch up to 4 requests if there are other queues.
> This isn't optimal: the device can handle more requests; for example, AHCI can
> handle 31 requests. I understand the limit is there for fairness, but we could
> make a tweak: if the queue still has a lot of slice left, it seems we could
> ignore the limit.

Hi Shaohua,

This looks much better, though I find the use of "slice_idle" as a measure
of service time a little unintuitive. In particular, I do some testing with
slice_idle=0; in that case we will allow dispatch of 8 requests from each
queue even if the slice is about to expire.
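
To make the slice_idle=0 case concrete, here is a rough userspace sketch of
the estimate made by cfq_slice_used_soon() in the patch quoted below. The
function name slice_used_soon() and its flat parameter list are hypothetical
stand-ins for the kernel structures, and jiffies wraparound (which
time_after() handles in the kernel) is ignored.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the patch's cfq_slice_used_soon().
 * All times are in jiffies-like ticks. Assumption: no counter wraparound,
 * so time_after(a, b) is modelled as a plain "a > b".
 */
static bool slice_used_soon(unsigned long now, unsigned long slice_end,
                            unsigned long slice_idle, unsigned int dispatched,
                            bool slice_new)
{
    /* The queue has not completed any request yet: no basis for an estimate. */
    if (slice_new)
        return true;

    /* Treat slice_idle as the assumed service time per in-flight request. */
    return now + slice_idle * dispatched > slice_end;
}
```

With slice_idle = 8, four requests in flight and 10 ticks left, the estimate
is 32 ticks of pending work, so the queue is throttled at the soft limit.
With slice_idle = 0 the product is always zero, so the check only fires once
the slice has already expired, which is the behaviour described above.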

But I guess that's fine for the time being, as the upper limit is still
controlled by cfq_quantum.
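
As a simplified illustration of how the soft and hard limits interact in the
patch quoted below, the effective per-queue cap could be modelled like this.
dispatch_limit() is a hypothetical helper, not a kernel function, and it
leaves out the idle-class case and the other checks in cfq_may_dispatch().

```c
#include <stdbool.h>
#include <limits.h>

/*
 * Hypothetical model of the cap on in-flight requests for one queue.
 *   quantum          models cfqd->cfq_quantum (hard limit)
 *   busy_queues      models cfqd->busy_queues
 *   slice_used_soon  models the result of cfq_slice_used_soon()
 * Assumption: idle-class queues and all other dispatch checks are ignored.
 */
static unsigned int dispatch_limit(unsigned int quantum,
                                   unsigned int busy_queues,
                                   bool slice_used_soon)
{
    /* Soft limit is half the hard limit, but never below one request. */
    unsigned int soft = quantum / 2 > 1 ? quantum / 2 : 1;

    /* Other queues are waiting and the slice is nearly used: stay at soft. */
    if (busy_queues > 1 && slice_used_soon)
        return soft;

    /* Sole queue user: no limit (the patch uses max_dispatch = -1). */
    if (busy_queues == 1)
        return UINT_MAX;

    /* Plenty of slice left: allow up to the hard limit. */
    return quantum;
}
```

With the new default of cfq_quantum = 8, this gives a cap of 4 when the slice
is nearly used and 8 otherwise, as long as other queues are busy.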

> Tests show this boosts my workload (two-thread random reads on an SSD) from 78m/s
> to 100m/s.

Are these deep queue random reads (with higher iodepths, using libaio)?

Have you done a similar test on some slower NCQ rotational hardware as well,
and seen the impact on throughput and *max latency* of readers, especially in
the presence of buffered writers?

Thanks
Vivek

>
> Signed-off-by: Shaohua Li <shaohua.li@xxxxxxxxx>
> ---
> block/cfq-iosched.c | 30 ++++++++++++++++++++++++++----
> 1 file changed, 26 insertions(+), 4 deletions(-)
>
> Index: linux-2.6/block/cfq-iosched.c
> ===================================================================
> --- linux-2.6.orig/block/cfq-iosched.c
> +++ linux-2.6/block/cfq-iosched.c
> @@ -19,7 +19,7 @@
> * tunables
> */
> /* max queue in one round of service */
> -static const int cfq_quantum = 4;
> +static const int cfq_quantum = 8;
> static const int cfq_fifo_expire[2] = { HZ / 4, HZ / 8 };
> /* maximum backwards seek, in KiB */
> static const int cfq_back_max = 16 * 1024;
> @@ -2215,6 +2215,19 @@ static int cfq_forced_dispatch(struct cf
> return dispatched;
> }
>
> +static inline bool cfq_slice_used_soon(struct cfq_data *cfqd,
> + struct cfq_queue *cfqq)
> +{
> + /* the queue hasn't finished any request, can't estimate */
> + if (cfq_cfqq_slice_new(cfqq))
> + return 1;
> + if (time_after(jiffies + cfqd->cfq_slice_idle * cfqq->dispatched,
> + cfqq->slice_end))
> + return 1;
> +
> + return 0;
> +}
> +
> static bool cfq_may_dispatch(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> {
> unsigned int max_dispatch;
> @@ -2231,7 +2244,7 @@ static bool cfq_may_dispatch(struct cfq_
> if (cfqd->sync_flight && !cfq_cfqq_sync(cfqq))
> return false;
>
> - max_dispatch = cfqd->cfq_quantum;
> + max_dispatch = max_t(unsigned int, cfqd->cfq_quantum / 2, 1);
> if (cfq_class_idle(cfqq))
> max_dispatch = 1;
>
> @@ -2248,13 +2261,22 @@ static bool cfq_may_dispatch(struct cfq_
> /*
> * We have other queues, don't allow more IO from this one
> */
> - if (cfqd->busy_queues > 1)
> + if (cfqd->busy_queues > 1 && cfq_slice_used_soon(cfqd, cfqq))
> return false;
>
> /*
> * Sole queue user, no limit
> */
> - max_dispatch = -1;
> + if (cfqd->busy_queues == 1)
> + max_dispatch = -1;
> + else
> + /*
> + * Normally we start throttling cfqq when cfq_quantum/2
> + * requests have been dispatched. But we can drive
> + * deeper queue depths at the beginning of slice
> + * subjected to upper limit of cfq_quantum.
> + * */
> + max_dispatch = cfqd->cfq_quantum;
> }
>
> /*