Re: [RFC PATCH v2 2/2] blk-mq: Lockout tagset iter when freeing rqs

From: John Garry
Date: Fri Dec 18 2020 - 05:02:51 EST


On 18/12/2020 03:31, Ming Lei wrote:
3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index a6df2d5df88a..853ed5b889aa 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -358,10 +358,19 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
{
int i;
for (i = 0; i < tagset->nr_hw_queues; i++) {
- if (tagset->tags && tagset->tags[i])
- __blk_mq_all_tag_iter(tagset->tags[i], fn, priv,
+ if (tagset->tags && tagset->tags[i]) {
+ struct blk_mq_tags *tags = tagset->tags[i];
+
+ if (!atomic_inc_not_zero(&tags->iter_usage_counter))
+ continue;

Hi Ming,

When 'continue' is run, blk_mq_tagset_busy_iter()'s semantic may be
broken.

Yeah, I did consider this, and thought that we're ok since we iter only started tags here (and there would be none for the queue). But then that would not work for other queues with active tags associated.


+
+ __blk_mq_all_tag_iter(tags, fn, priv,
BT_TAG_ITER_STARTED);
+
+ atomic_dec(&tags->iter_usage_counter);
+ }
}
}
EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
@@ -435,9 +444,14 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
if (!blk_mq_hw_queue_mapped(hctx))
continue;
+ if (!atomic_inc_not_zero(&tags->iter_usage_counter))
+ continue;
Same with above comment.

Right, similar to above.


+
if (tags->nr_reserved_tags)
bt_for_each(hctx, tags->breserved_tags, fn, priv, true);
bt_for_each(hctx, tags->bitmap_tags, fn, priv, false);
+
+ atomic_dec(&tags->iter_usage_counter);
}
blk_queue_exit(q);
}
@@ -461,6 +475,8 @@ static int blk_mq_init_bitmap_tags(struct blk_mq_tags *tags,
round_robin, node))
goto free_bitmap_tags;
+ atomic_set(&tags->iter_usage_counter, 1);
+
tags->bitmap_tags = &tags->__bitmap_tags;
tags->breserved_tags = &tags->__breserved_tags;
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 7d3e6b333a4a..563019d60f05 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -11,6 +11,9 @@ struct blk_mq_tags {
atomic_t active_queues;
+ /* Only interesting for driver tags */
+ atomic_t iter_usage_counter;
+
struct sbitmap_queue *bitmap_tags;
struct sbitmap_queue *breserved_tags;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8465d7c5ebf0..a61279be0120 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2315,7 +2315,9 @@ void __blk_mq_free_rqs_ext(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
void blk_mq_free_rqs_ext(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
unsigned int hctx_idx, struct blk_mq_tags *ref_tags)
{
+ while (atomic_cmpxchg(&ref_tags->iter_usage_counter, 1, 0) != 1);
__blk_mq_free_rqs_ext(set, tags, hctx_idx, ref_tags);
+ atomic_set(&ref_tags->iter_usage_counter, 1);
}
I guess it is simpler to sync the two code paths by adding mutex to 'ref_tags' and
holding it in both __blk_mq_free_rqs_ext() and the above two iterator helpers.

But are we allowed to always sleep in the iter calling context?

And that will also lockout parallel iterations, which is less than ideal.

So I could look to address the issues you mention above with atomics still, but maybe Bart has some better idea regarding RCU.

Thanks,
John