Re: [PATCH v2 2/2] nbd: convert to use blk_mq_get_rq_by_tag()

From: Ming Lei
Date: Mon Aug 09 2021 - 21:49:10 EST


On Mon, Aug 09, 2021 at 10:04:32PM +0800, yukuai (C) wrote:
> On 2021/08/09 17:46, Ming Lei wrote:
> > On Mon, Aug 09, 2021 at 03:08:26PM +0800, yukuai (C) wrote:
> > > On 2021/08/09 14:28, Ming Lei wrote:
> > > > On Mon, Aug 09, 2021 at 11:09:27AM +0800, Yu Kuai wrote:
> > > > > blk_mq_tag_to_rq() might return freed request, use
> > > > > blk_mq_get_rq_by_tag() instead.
> > > > >
> > > > > Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
> > > > > ---
> > > > > drivers/block/nbd.c | 11 ++++++-----
> > > > > 1 file changed, 6 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > > > index c38317979f74..9e56975a8eee 100644
> > > > > --- a/drivers/block/nbd.c
> > > > > +++ b/drivers/block/nbd.c
> > > > > @@ -713,11 +713,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> > > > > tag = nbd_handle_to_tag(handle);
> > > > > hwq = blk_mq_unique_tag_to_hwq(tag);
> > > > > if (hwq < nbd->tag_set.nr_hw_queues)
> > > > > - req = blk_mq_tag_to_rq(nbd->tag_set.tags[hwq],
> > > > > - blk_mq_unique_tag_to_tag(tag));
> > > > > - if (!req || !blk_mq_request_started(req)) {
> > > > > - dev_err(disk_to_dev(nbd->disk), "Unexpected reply (%d) %p\n",
> > > > > - tag, req);
> > > > > + req = blk_mq_get_rq_by_tag(nbd->tag_set.tags[hwq],
> > > > > + blk_mq_unique_tag_to_tag(tag));
> > > > > + if (!req) {
> > > > > + dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d\n", tag);
> > > > > return ERR_PTR(-ENOENT);
> > > > > }
> > > > > trace_nbd_header_received(req, handle);
> > > > > @@ -779,6 +778,8 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> > > > > }
> > > > > out:
> > > > > trace_nbd_payload_received(req, handle);
> > > > > + if (req)
> > > > > + blk_mq_put_rq_ref(req);
> > > > > mutex_unlock(&cmd->lock);
> > > > > return ret ? ERR_PTR(ret) : cmd;
> > > >
> > > > After blk_mq_put_rq_ref() returns, this request may have been freed,
> > > > so the returned 'cmd' may have been freed too.
> > > >
> > > > As I replied in your other thread, it is the driver's responsibility
> > > > to cover the race between normal completion and timeout/error handling,
> > > > which means the caller of blk_mq_tag_to_rq() needs to make sure that the
> > > > request represented by the passed 'tag' can't be freed.
> > >
> > > Hi, Ming
> > >
> > > There are two problems here in nbd, both reported by our syzkaller.
> > >
> > > The first is that blk_mq_tag_to_rq() returned a freed request, which is
> > > because tags->static_rq[] is freed without clearing tags->rq[].
> > > The syzkaller log shows that a reply packet is sent to the client
> > > without the client's request packet. This patch is trying to solve
> > > this problem.
> >
> > It is still the driver's problem:
> >
> > ->static_rq is freed in blk_mq_free_tag_set() which is called after
> > blk_cleanup_disk() returns. Once blk_cleanup_disk() returns, there
> > shouldn't be any driver activity, including calling blk_mq_tag_to_rq()
> > by passing one invalid tag.
> >
>
> Hi, Ming
>
> I understand if static_rq is freed through blk_mq_free_tag_set(),
> drivers should not use static_rq anymore.
>
> By the way, I was thinking about another path:
>
> blk_mq_update_nr_requests
>  if (!hctx->sched_tags) -> if this is true
>   ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr, false)
>    blk_mq_free_rqs -> static_rq is freed here
>
> If this path runs concurrently with nbd_read_stat(), nbd_read_stat()
> can get a freed request from blk_mq_tag_to_rq(), since tags->lock is
> not held.
>
> t1: nbd_read_stat            t2: blk_mq_update_nr_requests
> rq = blk_mq_tag_to_rq()
>                              blk_mq_free_rqs

t1 isn't supposed to happen when t2 is running.

blk_mq_update_nr_requests() is only called by nbd_start_device().

nbd_start_device():
	if (nbd->task_recv)
		return -EBUSY;
	...
	nbd->recv_workq = alloc_workqueue()

If nbd->task_recv is NULL here, nbd_config_put() has been called and
->config_refs has dropped to zero, so the socket has been shut down and
->recv_workq has been destroyed. t1 therefore can't run while t2 is
running.
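
To make the ordering argument concrete, here is a rough userspace sketch of
the guard, not the actual driver code. All names (struct model_nbd,
model_config_put(), model_start_device(), tags_generation) are hypothetical
stand-ins, and the request pool update is reduced to a counter bump. It only
illustrates that the update path is unreachable while a receiver is
registered.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the nbd state discussed above. */
struct model_nbd {
	void *task_recv;      /* non-NULL while a receive context is registered */
	bool recv_workq;      /* true while the receive workqueue exists */
	int tags_generation;  /* bumped each time the request pool is rebuilt */
};

/*
 * Models the last nbd_config_put(): the receive workqueue is destroyed
 * and the receiver is unregistered, so no reply handling (t1) can run.
 */
static void model_config_put(struct model_nbd *nbd)
{
	nbd->recv_workq = false;
	nbd->task_recv = NULL;
}

/*
 * Models nbd_start_device(): bail out with -EBUSY while a receiver is
 * still registered, so the pool update (t2) never overlaps with t1.
 */
static int model_start_device(struct model_nbd *nbd, void *task)
{
	if (nbd->task_recv)
		return -EBUSY;

	nbd->recv_workq = true;   /* stands in for alloc_workqueue() */
	nbd->tags_generation++;   /* stands in for the request pool update */
	nbd->task_recv = task;
	return 0;
}
```

A second model_start_device() call fails with -EBUSY until
model_config_put() has run, which is the exclusion the real driver is
expected to provide.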

>
> By holding tags->lock, we can check that the rq state is idle and its
> ref is 0.

Firstly, tags->lock can't fix the race[1]; secondly, it should be addressed
in the driver.

[1] https://lore.kernel.org/linux-block/20210809030927.1946162-2-yukuai3@xxxxxxxxxx/T/#m6651289c5718b45a8ae8a7efc889248f8cb904a3


Thanks,
Ming