Re: [PATCHSET v4] blk-mq-scheduling framework

From: Bart Van Assche
Date: Thu Dec 22 2016 - 12:40:12 EST


On Thu, 2016-12-22 at 09:12 -0800, Omar Sandoval wrote:
> On Thu, Dec 22, 2016 at 04:57:36PM +0000, Bart Van Assche wrote:
> > On Thu, 2016-12-22 at 08:52 -0800, Omar Sandoval wrote:
> > > This approach occurred to us, but we couldn't figure out a way to make
> > > blk_mq_tag_to_rq() work with it. From skimming over the patches, I
> > > didn't see a solution to that problem.
> >
> > Can you clarify your comment? Since my patches initialize both tags->rqs[]
> > and sched_tags->rqs[] the function blk_mq_tag_to_rq() should still work.
>
> Sorry, you're right, it does work, but tags->rqs[] ends up being the
> extra lookup table. I suspect that the runtime overhead of keeping that
> up to date could be worse than copying the rq fields if you have lots of
> CPUs but only one hardware queue.

Hello Omar,

I'm not sure that anything can be done if the number of CPUs submitting I/O
is large compared to the queue depth, so I don't think we should spend our
time on that case. If the queue depth is large enough, the sbitmap code will
allocate tags such that different CPUs use different rqs[] elements.
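
To illustrate the point about tag spreading, here is a minimal userspace
sketch. It is a toy model and not the sbitmap code: the toy_* names, the
depth of 64, the four CPUs and the evenly spaced starting hints are all
assumptions made for illustration. Each CPU starts searching for a free tag
at its own hint, so as long as the depth is large compared to the number of
submitting CPUs, each CPU tends to land on its own rqs[] index.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_DEPTH   64	/* hypothetical queue depth */
#define TOY_NR_CPUS 4	/* hypothetical number of submitting CPUs */

/* Toy tag map: one busy flag per tag plus a per-CPU search hint. */
struct toy_tags {
	bool busy[TOY_DEPTH];
	unsigned int hint[TOY_NR_CPUS];
};

static void toy_init(struct toy_tags *t)
{
	for (unsigned int i = 0; i < TOY_DEPTH; i++)
		t->busy[i] = false;
	/* Assumption: spread the starting hints evenly over the tag space. */
	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		t->hint[cpu] = cpu * (TOY_DEPTH / TOY_NR_CPUS);
}

/* Return a free tag for @cpu, searching from its hint; -1 if none is free. */
static int toy_get_tag(struct toy_tags *t, unsigned int cpu)
{
	for (unsigned int i = 0; i < TOY_DEPTH; i++) {
		unsigned int tag = (t->hint[cpu] + i) % TOY_DEPTH;

		if (!t->busy[tag]) {
			t->busy[tag] = true;
			/* The next search on this CPU starts after this tag. */
			t->hint[cpu] = (tag + 1) % TOY_DEPTH;
			return (int)tag;
		}
	}
	return -1;
}

/* Free @tag and let @cpu try to reuse it on its next allocation. */
static void toy_put_tag(struct toy_tags *t, unsigned int cpu, int tag)
{
	assert(tag >= 0 && tag < TOY_DEPTH);
	t->busy[tag] = false;
	t->hint[cpu] = (unsigned int)tag;
}
```

In this model the first allocations from CPUs 0 to 3 return tags 0, 16, 32
and 48, so the CPUs index different rqs[] elements. Once the depth is small
compared to the number of CPUs, the search ranges overlap and the CPUs start
sharing elements, which is the case described above.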

The advantages of the approach I proposed are such that I am convinced it is
what we should start from. We can address contention on the tags->rqs[] array
later if measurements show that this is necessary.

Bart.