Re: [PATCH -next 0/4] blk-cgroup: synchronize del_gendisk() with configuring cgroup policy

From: Tejun Heo
Date: Mon Dec 19 2022 - 15:56:17 EST


Hello,

On Sat, Dec 17, 2022 at 11:09:04AM +0800, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> iocost is initialized when it's configured the first time, and iocost
> initializing can race with del_gendisk(), which will cause null pointer
> dereference:
>
> t1 t2
> ioc_qos_write
> blk_iocost_init
> rq_qos_add
> del_gendisk
> rq_qos_exit
> //iocost is removed from q->roqs
> blkcg_activate_policy
> pd_init_fn
> ioc_pd_init
> ioc = q_to_ioc(blkg->q)
> //can't find iocost and return null
>
> And iolatency is about to switch to the same lazy initialization.
>
> This patchset fix this problem by synchronize rq_qos_add() and
> blkcg_activate_policy() with rq_qos_exit().

So, the patchset seems a bit overly complicated to me. What do you think
about the following?

* These init/exit paths are super cold path, just protecting them with a
global mutex is probably enough. If we encounter a scalability problem,
it's easy to fix down the line.

* If we're synchronizing this with a mutex anyway, no need to grab the
queue_lock, right? rq_qos_add/del/exit() can all just hold the mutex.

* And we can keep the state tracking within rq_qos. When rq_qos_exit() is
called, mark it so that future adds will fail - be that a special ->next
value a queue flag or whatever.

Thanks.

--
tejun