Re: [PATCH] blk-mq: Fix blk_mq_tagset_busy_iter() for shared tags

From: Ming Lei
Date: Mon Oct 18 2021 - 05:08:27 EST


On Mon, Oct 18, 2021 at 09:08:57AM +0100, John Garry wrote:
> On 13/10/2021 16:13, John Garry wrote:
> > > diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> > > index 72a2724a4eee..2a2ad6dfcc33 100644
> > > --- a/block/blk-mq-tag.c
> > > +++ b/block/blk-mq-tag.c
> > > @@ -232,8 +232,9 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
> > >       if (!rq)
> > >           return true;
> > > -    if (rq->q == hctx->queue && rq->mq_hctx == hctx)
> > > -        ret = iter_data->fn(hctx, rq, iter_data->data, reserved);
> > > +    if (rq->q == hctx->queue && (rq->mq_hctx == hctx ||
> > > +                blk_mq_is_shared_tags(hctx->flags)))
> > > +        ret = iter_data->fn(rq->mq_hctx, rq, iter_data->data, reserved);
> > >       blk_mq_put_rq_ref(rq);
> > >       return ret;
> > >   }
> > > @@ -460,6 +461,9 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
> > >           if (tags->nr_reserved_tags)
> > >               bt_for_each(hctx, &tags->breserved_tags, fn, priv, true);
> > >           bt_for_each(hctx, &tags->bitmap_tags, fn, priv, false);
> > > +
> > > +        if (blk_mq_is_shared_tags(hctx->flags))
> > > +            break;
> > >       }
> > >       blk_queue_exit(q);
> > >   }
> > >
> >
> > I suppose that is ok, and means that we iterate once.
> >
> > However, I have to ask, where is the big user of
> > blk_mq_queue_tag_busy_iter() coming from? I saw this from Kashyap's
> > mail:
> >
> > > 1.31%     1.31%  kworker/57:1H-k  [kernel.vmlinux]
> > >       native_queued_spin_lock_slowpath
> > >       ret_from_fork
> > >       kthread
> > >       worker_thread
> > >       process_one_work
> > >       blk_mq_timeout_work
> > >       blk_mq_queue_tag_busy_iter
> > >       bt_iter
> > >       blk_mq_find_and_get_req
> > >       _raw_spin_lock_irqsave
> > >       native_queued_spin_lock_slowpath
> >
> > How or why blk_mq_timeout_work()?
>
> Just an update: I tried hisi_sas with 10x SAS SSDs, megaraid_sas with 1x
> SATA HDD (that's all I have), and null_blk with lots of devices, and I still
> can't see high usage of blk_mq_queue_tag_busy_iter().

It should be easy to trigger under heavy I/O accounting, for example by
reading the disk stats in a tight loop:

while true; do cat /proc/diskstats; done
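
A bounded variant of that loop can make a perf capture easier to repeat.
This is only a rough sketch: read_stats_loop and its arguments are
hypothetical names, not something from the patch. The idea, as far as I
understand the accounting path, is that on blk-mq devices each read of
/proc/diskstats computes the in-flight count by walking the tags, which is
where blk_mq_queue_tag_busy_iter() comes in.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch or the thread):
# read a stats file a fixed number of times instead of looping forever.
#   $1 = number of reads
#   $2 = file to read (assumed default: /proc/diskstats)
read_stats_loop() {
    count=$1
    file=${2:-/proc/diskstats}
    i=0
    while [ "$i" -lt "$count" ]; do
        # Discard the contents; only the read itself matters here.
        cat "$file" > /dev/null || return 1
        i=$((i + 1))
    done
    # Report how many reads completed.
    echo "$i"
}
```

Running a few of these in the background (for instance
"read_stats_loop 100000 &" several times) while I/O is in flight should put
pressure on the iterator; the counts are arbitrary.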


> So how about we get this patch processed (to fix blk_mq_tagset_busy_iter()),
> as it is independent of blk_mq_queue_tag_busy_iter()? And then wait for some
> update or some more info from Kashyap regarding blk_mq_queue_tag_busy_iter().

Looks fine:

Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>


Thanks,
Ming