Re: [PATCH] workqueue: don't skip lockdep wq dependency in cancel_work_sync()

From: Tetsuo Handa
Date: Thu Jul 28 2022 - 21:49:23 EST


Johannes, why did you think that flagging cancel_work_sync() as if it
were flush_work() is a problem?

Unconditionally recording

"struct mutex" mutex->lockdep_map => "struct work_struct" work1->lockdep_map
"struct mutex" mutex->lockdep_map => "struct work_struct" work2->lockdep_map

chains causes no problem.

Unconditionally recording

"struct mutex" mutex->lockdep_map => "struct workqueue_struct" ordered_wq->lockdep_map

chain, given that ordered_wq can process only one work item at a time,
in order to indicate that ordered_wq is currently unable to process
other work items causes no problem either.

The example shown in commit d6e89786bed977f3 ("workqueue: skip lockdep wq
dependency in cancel_work_sync()") is nothing but a violation of the rule
"Do not hold a lock from a work callback function (do not record

"struct work_struct" work1->lockdep_map => "struct mutex" mutex->lockdep_map
"struct workqueue_struct" ordered_wq->lockdep_map => "struct mutex" mutex->lockdep_map

chain) if somebody might wait for completion of that callback function with
that lock held (might record

"struct mutex" mutex->lockdep_map => "struct work_struct" work1->lockdep_map
"struct mutex" mutex->lockdep_map => "struct workqueue_struct" ordered_wq->lockdep_map

chain)."
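
To make the rule concrete, here is a toy userspace model of the
dependency chains above. It is only a sketch and not how lockdep is
actually implemented: struct dep_graph, dep_init(), dep_record() and
the node names are all made up for illustration. Each recorded
"A => B" chain is an edge, and dep_record() reports failure as soon as
a new edge closes a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical nodes standing in for the lockdep_map instances above. */
enum node { MUTEX, WORK1, ORDERED_WQ, NODE_COUNT };

/* edge[a][b] is true once the chain "a => b" has been recorded. */
struct dep_graph {
	bool edge[NODE_COUNT][NODE_COUNT];
};

static void dep_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is "to" reachable from "from" via recorded edges? */
static bool dep_reachable(const struct dep_graph *g, enum node from,
			  enum node to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NODE_COUNT; next++) {
		if (g->edge[from][next] && !seen[next] &&
		    dep_reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record the chain "from => to".
 * Returns false if the new edge closes a cycle, i.e. if "from" was
 * already reachable from "to" (or from == to).
 */
static bool dep_record(struct dep_graph *g, enum node from, enum node to)
{
	bool seen[NODE_COUNT] = { false };

	assert(from < NODE_COUNT && to < NODE_COUNT);
	g->edge[from][to] = true;
	return !dep_reachable(g, to, from, seen);
}
```

In this model, recording only MUTEX => WORK1 and MUTEX => ORDERED_WQ
keeps dep_record() returning true no matter how often it is done. It
returns false only once a callback-side edge such as WORK1 => MUTEX or
ORDERED_WQ => MUTEX is recorded as well, which is exactly the
combination the rule forbids.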

Which in-tree ordered workqueue instance is hitting this problem?