Re: [PATCH RFC - TAKE TWO - 10/12] block, bfq: add Early Queue Merge (EQM)

From: Tejun Heo
Date: Tue Jun 03 2014 - 12:28:54 EST


Hello,

On Mon, Jun 02, 2014 at 11:46:45AM +0200, Paolo Valente wrote:
> > I don't really follow the last part. So, the difference is that
> > cooperating queue setup also takes place during bio merge too, right?
>
> Not only that: in bfq an actual queue merge is performed in the bio-merge hook.

I think I'm a bit confused because it's named "early" queue merge
while it actually moves queue merging later than cfq does:
set_request() happens before bio/rq merging. So, what it tries to do
is compensate for the lack of cfq_rq_close() preemption at request
issue time, right?

> > cfq does it once when allocating the request. That seems a lot more
> > reasonable to me. It's doing that once for one start sector. I mean,
> > plugging is usually extremely short compared to actual IO service
> > time. It's there to mask the latencies between bio issues that the
> > same CPU is doing. I can't see how this earliness can be actually
> > useful. Do you have results to back this one up? Or is this just
> > born out of thin air?
>
> Arianna added the early-queue-merge part in the allow_merge_fn hook
> about one year ago, as a consequence of a throughput loss of about
> 30% with KVM/QEMU workloads. In particular, we ran most of the tests
> on a WDC WD60000HLHX-0 Velociraptor. That HDD might not be available
> for testing any more, but we can reproduce our results for you on
> other HDDs, with and without early queue merge. And, maybe through
> traces, we can show you that the reason for the throughput loss is
> exactly that described (in a wordy way) in this patch. Of course
> unless we have missed something.

Oh, as long as it makes a measurable difference, I have no objection;
however, I do think more explanation and comments would be nice. I
still can't quite understand why retrying on each merge attempt would
make so much difference. Maybe I just failed to understand what you
wrote in the commit message. Is it because the cooperating tasks
issue IOs which grow large and close enough after merges but not on
the first bio issuance? If so, why isn't doing it at rq merge time
enough? Is the timing sensitive enough for certain workloads that
waiting till unplug time misses the opportunity? But plugging should
be relatively short compared to the time actual IOs take, so why would
it be that sensitive? What am I missing here?
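
To make the first question concrete, here is a toy userspace sketch (not
kernel code, and not taken from the patch) of a request that is not
"close" to a cooperating queue's position when its first bio is issued
but becomes close after a few back merges. Everything in it is an
assumption for illustration: the toy_* names, merges_until_close(), the
TOY_CLOSE_THR value and the idea of measuring distance from the end of
the request are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical closeness threshold in sectors; an assumed value. */
#define TOY_CLOSE_THR ((sector_t)(8 * 1024))

/* Toy request: a contiguous run of sectors. */
struct toy_rq {
	sector_t start;
	sector_t nr_sectors;
};

/* Sector just past the end of the request. */
static sector_t toy_rq_end(const struct toy_rq *rq)
{
	return rq->start + rq->nr_sectors;
}

/* True if two positions are within TOY_CLOSE_THR sectors of each other. */
static bool toy_close(sector_t a, sector_t b)
{
	sector_t dist = a > b ? a - b : b - a;

	return dist <= TOY_CLOSE_THR;
}

/* Back-merge a bio of bio_sectors sectors onto the request. */
static void toy_back_merge(struct toy_rq *rq, sector_t bio_sectors)
{
	rq->nr_sectors += bio_sectors;
}

/*
 * Re-check closeness before every merge attempt, as a per-merge hook
 * would.  Returns how many merges had already happened when the end of
 * the request first came close to other_pos, or -1 if that never
 * happened within max_merges merges.  A return value of 0 means the
 * check at first bio issuance would already have succeeded.
 */
static int merges_until_close(struct toy_rq rq, sector_t other_pos,
			      sector_t bio_sectors, int max_merges)
{
	for (int i = 0; i <= max_merges; i++) {
		if (toy_close(toy_rq_end(&rq), other_pos))
			return i;
		toy_back_merge(&rq, bio_sectors);
	}
	return -1;
}
```

With a request starting at sector 0 with 256 sectors, 256-sector bios and
the other queue positioned at sector 10000, the check fails on the first
bio and first succeeds after 7 merges in this model. Whether real
cooperating workloads behave like this is exactly what I'm asking.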

Thanks.

--
tejun