Re: [PATCH RFC] blk-mq: fix potential uaf for 'queue_hw_ctx'

From: Ming Lei
Date: Wed Feb 23 2022 - 21:16:21 EST


On Thu, Feb 24, 2022 at 09:29:09AM +0800, yukuai (C) wrote:
> On 2022/02/23 22:30, Ming Lei wrote:
> > On Wed, Feb 23, 2022 at 07:26:01PM +0800, Yu Kuai wrote:
> > > blk_mq_realloc_hw_ctxs() will free 'queue_hw_ctx' (e.g. when updating
> > > submit_queues through configfs for null_blk), while it might still be
> > > used from another context (e.g. switching the elevator to none):
> > >
> > > t1                                      t2
> > > elevator_switch
> > >  blk_mq_unquiesce_queue
> > >   blk_mq_run_hw_queues
> > >    queue_for_each_hw_ctx
> > >     // assembly code for hctx = (q)->queue_hw_ctx[i]
> > >     mov    0x48(%rbp),%rdx -> read old queue_hw_ctx
> > >
> > >                                         __blk_mq_update_nr_hw_queues
> > >                                          blk_mq_realloc_hw_ctxs
> > >                                           hctxs = q->queue_hw_ctx
> > >                                           q->queue_hw_ctx = new_hctxs
> > >                                           kfree(hctxs)
> > >     movslq %ebx,%rax
> > >     mov    (%rdx,%rax,8),%rdi -> uaf
> > >
> >
> > The uaf is not limited to queue_hw_ctx; other structures have similar
> > issues. I think the correct and easy fix is to quiesce the request
> > queue while updating nr_hw_queues, something like the following patch:
> >
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index a05ce7725031..d8e7c3cce0dd 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -4467,8 +4467,10 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
> >  	if (set->nr_maps == 1 && nr_hw_queues == set->nr_hw_queues)
> >  		return;
> >
> > -	list_for_each_entry(q, &set->tag_list, tag_set_list)
> > +	list_for_each_entry(q, &set->tag_list, tag_set_list) {
> >  		blk_mq_freeze_queue(q);
> > +		blk_mq_quiesce_queue(q);
> > +	}
> >  	/*
> >  	 * Switch IO scheduler to 'none', cleaning up the data associated
> >  	 * with the previous scheduler. We will switch back once we are done
> > @@ -4518,8 +4520,10 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
> >  	list_for_each_entry(q, &set->tag_list, tag_set_list)
> >  		blk_mq_elv_switch_back(&head, q);
> >
> > -	list_for_each_entry(q, &set->tag_list, tag_set_list)
> > +	list_for_each_entry(q, &set->tag_list, tag_set_list) {
> > +		blk_mq_unquiesce_queue(q);
> >  		blk_mq_unfreeze_queue(q);
> > +	}
> >  }
> >
> >  void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues)
> Hi, Ming
>
> If blk_mq_quiesce_queue() is called from __blk_mq_update_nr_hw_queues()
> first, then switching the elevator to none won't trigger the problem.
> However, what if blk_mq_unquiesce_queue() from the elevator switch
> decreases quiesce_depth to 0 first, and then blk_mq_quiesce_queue() is
> called from __blk_mq_update_nr_hw_queues()? It seems to me that such a
> concurrent scenario still exists.

No, that scenario can't happen. Once blk_mq_quiesce_queue() returns, it is
guaranteed that:

- in-progress run queue is drained
- no new run queue can be started
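
To make the ordering concrete, here is a toy user-space model of those two
guarantees. It is only a sketch, not kernel code: every toy_* name is
made up for illustration, and the "wait" is modeled as a predicate that the
caller checks, whereas the real blk_mq_quiesce_queue() blocks until in-flight
dispatch has finished (as far as I know, via an RCU/SRCU grace period).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names below are hypothetical, not kernel APIs. */
struct toy_queue {
	int quiesce_depth;	/* > 0: no new run queue may start */
	int in_flight;		/* run-queue sections currently executing */
};

/* Try to start running the queue; refused while the queue is quiesced. */
static bool toy_run_begin(struct toy_queue *q)
{
	if (q->quiesce_depth > 0)
		return false;
	q->in_flight++;
	return true;
}

/* Leave a run-queue section that toy_run_begin() admitted. */
static void toy_run_end(struct toy_queue *q)
{
	assert(q->in_flight > 0);
	q->in_flight--;
}

/* First half of quiesce: block any new run queue from starting. */
static void toy_quiesce_nowait(struct toy_queue *q)
{
	q->quiesce_depth++;
}

/*
 * Second half of quiesce: quiesce may only "return" once this is true,
 * i.e. every run queue that started before the mark has drained.
 */
static bool toy_quiesce_done(const struct toy_queue *q)
{
	return q->quiesce_depth > 0 && q->in_flight == 0;
}

/* Drop one quiesce reference; dispatch resumes when depth reaches 0. */
static void toy_unquiesce(struct toy_queue *q)
{
	assert(q->quiesce_depth > 0);
	q->quiesce_depth--;
}
```

In this model, a run queue that t1 started right after its own unquiesce
(toy_run_begin() succeeded) keeps toy_quiesce_done() false on t2 until t1
calls toy_run_end(), so t2 cannot go on to free anything while t1 is still
walking the old array.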

Thanks,
Ming