Re: [PATCH -next 0/4] blk-cgroup: synchronize del_gendisk() with configuring cgroup policy

From: Yu Kuai
Date: Tue Dec 20 2022 - 04:19:24 EST


Hi,

在 2022/12/20 4:55, Tejun Heo 写道:
Hello,

On Sat, Dec 17, 2022 at 11:09:04AM +0800, Yu Kuai wrote:
From: Yu Kuai <yukuai3@xxxxxxxxxx>

iocost is initialized when it's configured the first time, and iocost
initializing can race with del_gendisk(), which will cause null pointer
dereference:

t1 t2
ioc_qos_write
blk_iocost_init
rq_qos_add
del_gendisk
rq_qos_exit
//iocost is removed from q->roqs
blkcg_activate_policy
pd_init_fn
ioc_pd_init
ioc = q_to_ioc(blkg->q)
//can't find iocost and return null

And iolatency is about to switch to the same lazy initialization.

This patchset fix this problem by synchronize rq_qos_add() and
blkcg_activate_policy() with rq_qos_exit().

So, the patchset seems a bit overly complicated to me. What do you think
about the following?

* These init/exit paths are super cold path, just protecting them with a
global mutex is probably enough. If we encounter a scalability problem,
it's easy to fix down the line.

* If we're synchronizing this with a mutex anyway, no need to grab the
queue_lock, right? rq_qos_add/del/exit() can all just hold the mutex.

* And we can keep the state tracking within rq_qos. When rq_qos_exit() is
called, mark it so that future adds will fail - be that a special ->next
value a queue flag or whatever.

Yes, that sounds good. BTW, queue_lock is also used to protect
pd_alloc_fn/pd_init_fn,and we found that blkcg_activate_policy() is
problematic:

blkcg_activate_policy
spin_lock_irq(&q->queue_lock);
list_for_each_entry_reverse(blkg, &q->blkg_list
pd_alloc_fn(GFP_NOWAIT | __GFP_NOWARN,...) -> failed

spin_unlock_irq(&q->queue_lock);
// release queue_lock here is problematic, this will cause
pd_offline_fn called without pd_init_fn.
pd_alloc_fn(__GFP_NOWARN,...)

If we are using a mutex to protect rq_qos ops, it seems the right thing
to do do also using the mutex to protect blkcg_policy ops, and this
problem can be fixed because mutex can be held to alloc memroy with
GFP_KERNEL. What do you think?

Thanks,
Kuai

Thanks.