Re: [PATCH v2] blk-mq: Fix stall due to recursive flush plug

From: Bart Van Assche
Date: Fri Jul 14 2023 - 12:04:11 EST


On 7/14/23 03:11, Ross Lagerwall wrote:
diff --git a/block/blk-core.c b/block/blk-core.c
index 99d8b9812b18..90de50082146 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1144,8 +1144,7 @@ void __blk_flush_plug(struct blk_plug *plug, bool from_schedule)
{
if (!list_empty(&plug->cb_list))
flush_plug_callbacks(plug, from_schedule);
- if (!rq_list_empty(plug->mq_list))
- blk_mq_flush_plug_list(plug, from_schedule);
+ blk_mq_flush_plug_list(plug, from_schedule);
/*
* Unconditionally flush out cached requests, even if the unplug
* event came from schedule. Since we know hold references to the
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5504719b970d..e6bd9c5f42bb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2742,7 +2742,14 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
{
struct request *rq;
- if (rq_list_empty(plug->mq_list))
+ /*
+ * We may have been called recursively midway through handling
+ * plug->mq_list via a schedule() in the driver's queue_rq() callback.
+ * To avoid mq_list changing under our feet, clear rq_count early and
+ * bail out specifically if rq_count is 0 rather than checking
+ * whether the mq_list is empty.
+ */
+ if (plug->rq_count == 0)
return;
plug->rq_count = 0;

Reviewed-by: Bart Van Assche <bvanassche@xxxxxxx>