Re: [PATCH 3/4] blk-mq: establish new mapping before cpu starts handling requests

From: Ming Lei
Date: Thu Jun 25 2015 - 04:07:41 EST


On Thu, Jun 25, 2015 at 10:56 AM, Akinobu Mita <akinobu.mita@xxxxxxxxx> wrote:
> 2015-06-25 1:24 GMT+09:00 Ming Lei <tom.leiming@xxxxxxxxx>:
>> On Wed, Jun 24, 2015 at 10:34 PM, Akinobu Mita <akinobu.mita@xxxxxxxxx> wrote:
>>> Hi Ming,
>>>
>>> 2015-06-24 18:46 GMT+09:00 Ming Lei <tom.leiming@xxxxxxxxx>:
>>>> On Sun, Jun 21, 2015 at 9:52 PM, Akinobu Mita <akinobu.mita@xxxxxxxxx> wrote:
>>>>> ctx->index_hw is zero for the CPUs which have never been onlined since
>>>>> the block queue was initialized. If one of those CPUs is hotadded and
>>>>> starts handling request before new mappings are established, pending
>>>>
>>>> Could you explain a bit what the handling request is? The fact is that
>>>> blk_mq_queue_reinit() is run after all queues are put into freezing.
>>>
>>> Notifier callbacks for CPU_ONLINE action can be run on the other CPU
>>> than the CPU which was just onlined. So it is possible for the
>>> process running on the just onlined CPU to insert request and run
>>> hw queue before blk_mq_queue_reinit_notify() is actually called with
>>> action=CPU_ONLINE.
>>
>> You are right, because blk_mq_queue_reinit_notify() is always run after
>> the CPU becomes UP, so there is a tiny window in which the CPU is up
>> but the mapping has not been updated yet. Per the current design, the
>> CPU just onlined is still mapped to hw queue 0 until the mapping is
>> updated by blk_mq_queue_reinit_notify().
>>
>> But I am wondering why it is a problem and why you think flush_busy_ctxs
>> can't find the requests on the software queue in this situation?
>
> The problem happens when the CPU has just been onlined for the first
> time since the request queue was initialized. At this time ctx->index_hw
> for the CPU is still zero before blk_mq_queue_reinit_notify is called.
>
> The request can be inserted to ctx->rq_list, but blk_mq_hctx_mark_pending()
> marks busy for wrong bit position as ctx->index_hw is zero.

It isn't the wrong bit, since the CPU that was just onlined is still
mapped to hctx 0 at that time.

>
> flush_busy_ctxs() only retrieves the requests from software queues
> which are marked busy. So the request just inserted is ignored as
> the corresponding bit position is not busy.

Before blk_mq_queue_reinit() does the remap for the CPU topology change,
the request queue is put into freezing first, and all requests inserted
to hctx 0 should be retrieved and scheduled out. So can the request
really be ignored by flush_busy_ctxs()?
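
To make the scenario under discussion concrete, here is a minimal
userspace sketch of the pending-bit bookkeeping. It is not kernel code:
the model_* names are hypothetical, the request list is reduced to a
counter, and it assumes that the ctx of a never-onlined CPU has
index_hw == 0 but was never added to the hctx's ctxs[] array. Under
those assumptions, the flush looks up slot 0 and drains a different
ctx than the one that queued the request.

```c
#include <assert.h>

#define MODEL_MAX_CTX 8

/* Hypothetical stand-in for a software queue (blk_mq_ctx). */
struct model_ctx {
	unsigned int index_hw;	/* slot in hctx->ctxs[]; 0 if never mapped */
	int nr_queued;		/* stands in for ctx->rq_list */
};

/* Hypothetical stand-in for a hardware queue (blk_mq_hw_ctx). */
struct model_hctx {
	unsigned long ctx_map;	/* one pending bit per ctxs[] slot */
	struct model_ctx *ctxs[MODEL_MAX_CTX];
	unsigned int nr_ctx;
};

/* Establish the mapping: give the ctx a slot in the hctx. */
static void model_map_ctx(struct model_hctx *h, struct model_ctx *c)
{
	assert(h->nr_ctx < MODEL_MAX_CTX);
	c->index_hw = h->nr_ctx;
	h->ctxs[h->nr_ctx++] = c;
}

/* Queue one request on the ctx and mark its slot pending. */
static void model_insert_request(struct model_hctx *h, struct model_ctx *c)
{
	c->nr_queued++;
	h->ctx_map |= 1UL << c->index_hw;
}

/* Drain only the ctxs whose pending bit is set; return requests found. */
static int model_flush_busy_ctxs(struct model_hctx *h)
{
	int flushed = 0;
	unsigned int bit;

	for (bit = 0; bit < h->nr_ctx; bit++) {
		if (!(h->ctx_map & (1UL << bit)))
			continue;
		flushed += h->ctxs[bit]->nr_queued;
		h->ctxs[bit]->nr_queued = 0;
		h->ctx_map &= ~(1UL << bit);
	}
	return flushed;
}

/*
 * CPU0 and CPU1 are mapped; a third CPU is onlined and queues a request
 * before any remap. Returns how many requests remain on its ctx after
 * the flush.
 */
static int model_stranded_requests(void)
{
	struct model_hctx hctx = { 0 };
	struct model_ctx cpu0 = { 0 }, cpu1 = { 0 }, hotadded = { 0 };

	model_map_ctx(&hctx, &cpu0);
	model_map_ctx(&hctx, &cpu1);

	model_insert_request(&hctx, &hotadded);	/* sets bit 0 */
	model_flush_busy_ctxs(&hctx);		/* drains ctxs[0] == &cpu0 */

	return hotadded.nr_queued;
}
```

In this model model_stranded_requests() returns 1: bit 0 is set, but
slot 0 belongs to CPU0's ctx, so the hot-added CPU's request stays
queued. Whether the real code can reach this state depends on the
assumption about ctxs[] stated above.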

--
Ming Lei