[PATCH 1/2] blk-mq: fix race between timeout and queue_rq

From: Ming Lei
Date: Wed Sep 17 2014 - 05:48:19 EST


Whether a request comes from requeue or has just been allocated from
the tag pool, its REQ_ATOM_STARTED flag has already been cleared, so
there is no need to test it in blk_mq_start_request().

A memory barrier is needed between writing rq->deadline and setting
REQ_ATOM_STARTED. Without it the two writes may be reordered, so the
timeout handler could observe REQ_ATOM_STARTED while still reading a
stale rq->deadline, and the request would time out too early.

Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxxxxx>
---
block/blk-mq.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
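
The ordering requirement described in the changelog can be sketched in
userspace with C11 atomics. This is only an illustration, not kernel code:
struct fake_rq, REQ_STARTED, fake_start_request() and fake_rq_timed_out()
are hypothetical names, the seq_cst fence stands in for
smp_mb__before_atomic(), and the acquire fence on the reader side is an
assumption of the sketch (this patch only touches the writer side).

```c
#include <stdatomic.h>
#include <stdbool.h>

#define REQ_STARTED (1UL << 0)	/* hypothetical stand-in for REQ_ATOM_STARTED */

/* Hypothetical, stripped-down model of the two fields involved. */
struct fake_rq {
	_Atomic unsigned long deadline;
	atomic_ulong flags;
};

/* Writer side: publish the deadline, then mark the request started. */
static void fake_start_request(struct fake_rq *rq, unsigned long deadline)
{
	atomic_store_explicit(&rq->deadline, deadline, memory_order_relaxed);

	/*
	 * Full fence between the two writes, playing the role of
	 * smp_mb__before_atomic() in the patch: the deadline store
	 * cannot be reordered after the flag update.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	atomic_fetch_or_explicit(&rq->flags, REQ_STARTED, memory_order_relaxed);
}

/* Reader side: only trust the deadline once the started flag is seen. */
static bool fake_rq_timed_out(struct fake_rq *rq, unsigned long now)
{
	unsigned long flags;

	flags = atomic_load_explicit(&rq->flags, memory_order_relaxed);
	if (!(flags & REQ_STARTED))
		return false;

	/* Assumed pairing fence so the deadline load is not hoisted. */
	atomic_thread_fence(memory_order_acquire);

	/* Simplified comparison; ignores jiffies-style wraparound. */
	return now >= atomic_load_explicit(&rq->deadline, memory_order_relaxed);
}
```

With the writer-side fence removed, a reader could in principle see
REQ_STARTED set and still load the previous deadline, which is the early
timeout the changelog describes.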

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 47f3938..957815e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -377,13 +377,20 @@ static void blk_mq_start_request(struct request *rq, bool last)
 	blk_add_timer(rq);

 	/*
+	 * Order writing rq->deadline and setting the rq's
+	 * REQ_ATOM_STARTED flag so that timeout handler can see
+	 * correct rq->deadline once the rq's REQ_ATOM_STARTED flag
+	 * is observed.
+	 */
+	smp_mb__before_atomic();
+
+	/*
 	 * Mark us as started and clear complete. Complete might have been
 	 * set if requeue raced with timeout, which then marked it as
 	 * complete. So be sure to clear complete again when we start
 	 * the request, otherwise we'll ignore the completion event.
 	 */
-	if (!test_bit(REQ_ATOM_STARTED, &rq->atomic_flags))
-		set_bit(REQ_ATOM_STARTED, &rq->atomic_flags);
+	set_bit(REQ_ATOM_STARTED, &rq->atomic_flags);
 	if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags))
 		clear_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);

--
1.7.9.5
