Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs

From: Ming Lei
Date: Thu Oct 22 2015 - 20:58:14 EST


On Thu, Oct 22, 2015 at 5:15 PM, jason <zhangqing.luo@xxxxxxxxxx> wrote:
>
>
> On Thursday, October 22, 2015 04:47 PM, Tejun Heo wrote:
>>
>> Hello,
>>
>> On Mon, Oct 19, 2015 at 07:40:13AM -0700, Zhangqing Luo wrote:
>> ....
>> > So every time blk_mq_freeze_queue_start, it runs in this way
>> >
>> > blk_mq_freeze_queue_start
>> > ->percpu_ref_kill->percpu_ref_kill_and_confirm
>> > ->__percpu_ref_switch_to_atomic
>> > ->call_rcu_sched(&ref->rcu,percpu_ref_switch_to_atomic_rcu)
>> >
>> > and blk_mq_freeze_queue_wait blocks on queue->mq_usage_counter
>> > as it is not zero, and wake up by percpu_ref_switch_to_atomic_rcu
>> > after a grace period
>> >
>> >
>> > My question here is why should we change ref to PERCPU at
>> > blk_mq_finish_init?
>> > because of this changing, delay appears.
>>
>> Because percpu operation is way cheaper than atomic ones and we want
>> to optimize hot paths (request issue and completion) over cold paths
>> (init and config changes). That's the whole point of percpu
>> refcnting.
>>
>> The reason why percpu ref starts in atomic mode is to avoid expensive
>> percpu freezing if the queue is created and abandoned in quick
>> succession as SCSI does during LUN scanning. If percpu freezing is
>> happening during that, the right solution is moving finish_init to
>> late enough point so that percpu switching happens only after it's
>> known that the queue won't be abandoned.
>>
>> Thanks.
>>
> I agree with the optimizing hot paths by cheaper percpu operation,
> but how much does it affect the performance?
>
> as you know the switching causes delay, when the the LUN number is
> increasing
> the delay is becoming higher, so do you have any idea
> about the problem?

For a normal SCSI disk, add_disk() is called in sd probe path, during that
time, the reuquest queue should have been initialized completely, that
said blk_mq_add_queue_tag_set() should be run in atomic mode
of the usage ref counter.

Not sure how can you observe the ref counter is running at percpu mode
during queue initialization.

Care to share what your disk/driver is? I doubt it might be one SCSI LLD's
issue.

Thanks,

>
> /Thanks
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/



--
Ming Lei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/