Re: [PATCH v2] workqueue: Add rcu lock check after work execute end

From: Lai Jiangshan
Date: Wed Jan 10 2024 - 04:08:54 EST


On Wed, Jan 10, 2024 at 11:27 AM Xuewen Yan <xuewen.yan@xxxxxxxxxx> wrote:
>
> Currently, the workqueue code only checks for leaked atomic context
> and held locks after a work item finishes executing. However, a
> driver's work function may sometimes fail to call rcu_read_unlock()
> after rcu_read_lock(). This can cause an RCU stall, but the RCU stall
> warning cannot dump the work function, because the work has already
> finished.
>
> To quickly discover work items that do not call rcu_read_unlock()
> after rcu_read_lock(), add an RCU lock check.
>
> Use rcu_preempt_depth() to check the work's RCU status.
> Normally, this value is 0. If it is greater than 0,
> the work is still holding the RCU read lock.
> In that case, print an error message along with the work function.
>
> Signed-off-by: Xuewen Yan <xuewen.yan@xxxxxxxxxx>
> ---
> V2:
> - move check to unlikely() helper (Longman)
> ---
> kernel/workqueue.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 2989b57e154a..c2a73364f5ad 100644

Reviewed-by: Lai Jiangshan <jiangshanlai@xxxxxxxxx>

> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2634,11 +2634,12 @@ __acquires(&pool->lock)
>  	lock_map_release(&lockdep_map);
>  	lock_map_release(&pwq->wq->lockdep_map);
>
> -	if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
> -		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d\n"
> +	if (unlikely(in_atomic() || lockdep_depth(current) > 0 ||
> +		     rcu_preempt_depth() > 0)) {
> +		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d/%d\n"
>  		       " last function: %ps\n",
> -		       current->comm, preempt_count(), task_pid_nr(current),
> -		       worker->current_func);
> +		       current->comm, preempt_count(), rcu_preempt_depth(),
> +		       task_pid_nr(current), worker->current_func);
>  		debug_show_held_locks(current);
>  		dump_stack();
>  	}
> --
> 2.25.1
>
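
For illustration only, the idea behind the check can be modeled in
userspace. The sketch below is not kernel code: model_rcu_depth,
model_rcu_read_lock(), model_rcu_read_unlock() and model_run_work() are
hypothetical stand-ins for the per-task RCU nesting count, the RCU
read-side primitives and the point in the worker where the patch adds
its test. It assumes a nesting counter that is meaningful, as
rcu_preempt_depth() is with CONFIG_PREEMPT_RCU; without that option
rcu_preempt_depth() returns 0 and the new condition never fires.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model of the per-task RCU read-side nesting count. */
static int model_rcu_depth;

/* Stand-ins for rcu_read_lock()/rcu_read_unlock(); they only count nesting. */
static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);	/* an unlock without a lock is a separate bug */
	model_rcu_depth--;
}

typedef void (*model_work_fn)(void);

/* Balanced work function: every lock has a matching unlock. */
static void good_work(void)
{
	model_rcu_read_lock();
	model_rcu_read_unlock();
}

/* Buggy work function: returns while still inside the read-side section. */
static void leaky_work(void)
{
	model_rcu_read_lock();
	/* missing model_rcu_read_unlock() */
}

/*
 * Run one work function, then check the nesting count after it returns,
 * which is where the patch places its test. Returns the leaked depth
 * (0 when balanced) and names the offending function while it is still known.
 */
static int model_run_work(model_work_fn fn, const char *name)
{
	int leaked;

	fn();

	leaked = model_rcu_depth;
	if (leaked > 0) {
		fprintf(stderr,
			"BUG: work leaked rcu read lock: depth=%d\n"
			" last function: %s\n", leaked, name);
		model_rcu_depth = 0;	/* reset so later work items are judged alone */
	}
	return leaked;
}
```

Calling model_run_work(leaky_work, "leaky_work") reports a leaked depth
of 1 and names the function at the moment it returns, which is the
information a later RCU stall warning can no longer provide. The reset
of the counter is a convenience of the model only; the patch itself
just reports the leak.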