Re: [PATCH v2 2/2] nbd: convert to use blk_mq_get_rq_by_tag()

From: yukuai (C)
Date: Mon Aug 09 2021 - 10:04:40 EST


On 2021/08/09 17:46, Ming Lei wrote:
On Mon, Aug 09, 2021 at 03:08:26PM +0800, yukuai (C) wrote:
On 2021/08/09 14:28, Ming Lei wrote:
On Mon, Aug 09, 2021 at 11:09:27AM +0800, Yu Kuai wrote:
blk_mq_tag_to_rq() might return a freed request; use
blk_mq_get_rq_by_tag() instead.

Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
---
drivers/block/nbd.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index c38317979f74..9e56975a8eee 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -713,11 +713,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
tag = nbd_handle_to_tag(handle);
hwq = blk_mq_unique_tag_to_hwq(tag);
if (hwq < nbd->tag_set.nr_hw_queues)
- req = blk_mq_tag_to_rq(nbd->tag_set.tags[hwq],
- blk_mq_unique_tag_to_tag(tag));
- if (!req || !blk_mq_request_started(req)) {
- dev_err(disk_to_dev(nbd->disk), "Unexpected reply (%d) %p\n",
- tag, req);
+ req = blk_mq_get_rq_by_tag(nbd->tag_set.tags[hwq],
+ blk_mq_unique_tag_to_tag(tag));
+ if (!req) {
+ dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d\n", tag);
return ERR_PTR(-ENOENT);
}
trace_nbd_header_received(req, handle);
@@ -779,6 +778,8 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
}
out:
trace_nbd_payload_received(req, handle);
+ if (req)
+ blk_mq_put_rq_ref(req);
mutex_unlock(&cmd->lock);
return ret ? ERR_PTR(ret) : cmd;

After blk_mq_put_rq_ref() returns, this request may have been freed,
so the returned 'cmd' may have been freed too.

As I replied in your other thread, it is the driver's responsibility to
cover the race between normal completion and timeout/error handling.
That means the caller of blk_mq_tag_to_rq() needs to make sure that the
request represented by the passed 'tag' can't be freed.
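
The reason 'cmd' shares the request's lifetime is that nbd obtains it
with blk_mq_rq_to_pdu(req), i.e. the per-command data lives in the same
allocation, directly behind the request. A minimal userspace sketch of
that layout follows; struct request here is a one-field stand-in, and
struct nbd_cmd_model, alloc_rq() and pdu_inside_allocation() are
hypothetical names used only for illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the kernel's struct request (assumption: one field only). */
struct request {
	int tag;
};

/* Hypothetical stand-in for the driver's per-command data (struct nbd_cmd). */
struct nbd_cmd_model {
	int status;
};

/* Same idea as blk_mq_rq_to_pdu(): driver data starts right after the request. */
static void *rq_to_pdu(struct request *rq)
{
	return rq + 1;
}

/* One allocation holds both the request and the driver data behind it. */
static struct request *alloc_rq(size_t cmd_size)
{
	return calloc(1, sizeof(struct request) + cmd_size);
}

/* Returns 1 if the driver data lies entirely inside the request allocation. */
static int pdu_inside_allocation(struct request *rq, size_t cmd_size)
{
	char *start, *end, *pdu;

	assert(rq != NULL);
	start = (char *)rq;
	end = start + sizeof(*rq) + cmd_size;
	pdu = rq_to_pdu(rq);
	return pdu >= start && pdu + cmd_size <= end;
}
```

Since the command is carved out of the request's memory, freeing the
request frees the command as well, so a 'cmd' pointer returned after the
last reference is dropped can dangle.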

Hi, Ming

There are two problems here in nbd, both reported by our syzkaller.

The first is that blk_mq_tag_to_rq() returned a freed request, because
tags->static_rq[] is freed without clearing tags->rq[]. The syzkaller
log shows that a reply packet was sent to the client without a matching
request packet from the client. This patch tries to solve that
problem.

It is still the driver's problem:

->static_rq is freed in blk_mq_free_tag_set(), which is called after
blk_cleanup_disk() returns. Once blk_cleanup_disk() returns, there
shouldn't be any driver activity, including calling blk_mq_tag_to_rq()
with an invalid tag.


Hi, Ming

I understand that if static_rq is freed through blk_mq_free_tag_set(),
drivers should not use static_rq anymore.

By the way, I was thinking about another path:

blk_mq_update_nr_requests
 if (!hctx->sched_tags) -> if this is true
  ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr, false)
   blk_mq_free_rqs -> static_rq is freed here

If this path runs concurrently with nbd_read_stat(), nbd_read_stat() can
get a freed request from blk_mq_tag_to_rq(), since tags->lock is not
held.

t1: nbd_read_stat              t2: blk_mq_update_nr_requests
 rq = blk_mq_tag_to_rq()
                                blk_mq_free_rqs

By holding tags->lock, we can check whether the rq state is idle and its
ref is 0.
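
To make the locking argument concrete, here is a minimal userspace
sketch of the lookup pattern, not the actual patch. All names (struct
tag_table, get_rq_by_tag(), put_rq_ref(), clear_tag()) are hypothetical
stand-ins: a pthread mutex plays the role of tags->lock and a C11 atomic
plays the role of the request reference count. The lookup only hands out
a request if its reference count is still non-zero, and the free path
clears the tag mapping under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TAGS 4

/* Hypothetical stand-in for a request; ref == 0 means idle/unowned. */
struct request {
	atomic_int ref;
};

/* Hypothetical stand-in for the tag table; 'lock' models tags->lock. */
struct tag_table {
	pthread_mutex_t lock;
	struct request *rqs[NR_TAGS];	/* tag -> request mapping */
};

/* Take a reference only if somebody else already holds one. */
static bool rq_ref_inc_not_zero(struct request *rq)
{
	int old = atomic_load(&rq->ref);

	while (old != 0) {
		/* On failure 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&rq->ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Look up a request by tag; returns NULL for idle, cleared or bad tags. */
static struct request *get_rq_by_tag(struct tag_table *t, unsigned int tag)
{
	struct request *rq;

	if (tag >= NR_TAGS)
		return NULL;

	pthread_mutex_lock(&t->lock);
	rq = t->rqs[tag];
	if (rq && !rq_ref_inc_not_zero(rq))
		rq = NULL;
	pthread_mutex_unlock(&t->lock);
	return rq;
}

/* Drop a reference taken by get_rq_by_tag(). */
static void put_rq_ref(struct request *rq)
{
	int old = atomic_fetch_sub(&rq->ref, 1);

	assert(old > 0);
}

/* Free path: clear the mapping under the lock before releasing memory. */
static void clear_tag(struct tag_table *t, unsigned int tag)
{
	pthread_mutex_lock(&t->lock);
	t->rqs[tag] = NULL;
	pthread_mutex_unlock(&t->lock);
}
```

In this model a reader that loses the race simply gets NULL back: either
the mapping was already cleared, or the request is idle with ref 0 and
the increment is refused.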

Thanks
Kuai