Re: [PATCH] workqueue: don't skip lockdep wq dependency in cancel_work_sync()

From: Lai Jiangshan
Date: Thu Jul 28 2022 - 22:50:24 EST


+CC Byungchul Park <byungchul.park@xxxxxxx>

On Fri, Jul 29, 2022 at 10:38 AM Lai Jiangshan <jiangshanlai@xxxxxxxxx> wrote:

> > +bool flush_work(struct work_struct *work)
> > {
> > struct wq_barrier barr;
> >
> > @@ -3066,12 +3075,10 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
> > if (WARN_ON(!work->func))
> > return false;
> >
> > - if (!from_cancel) {
> > - lock_map_acquire(&work->lockdep_map);
> > - lock_map_release(&work->lockdep_map);
> > - }
> > + lock_map_acquire(&work->lockdep_map);
> > + lock_map_release(&work->lockdep_map);
>
>
> IIUC, I think the change of these 5 lines of code (-3+2) is enough
> to fix the problem described in the changelog.
>
> If so, could you make a minimal patch?
>
> I believe what the commit d6e89786bed977f3 ("workqueue: skip lockdep
> wq dependency in cancel_work_sync()") fixes is real. It is not a good
> idea to revert it.
>
> P.S.
>
> The commit fd1a5b04dfb8 ("workqueue: Remove now redundant lock
> acquisitions wrt. workqueue flushes") removed this lockdep check.
>
> And the commit 87915adc3f0a ("workqueue: re-add lockdep
> dependencies for flushing") added it back for non-canceling cases.
>
> It seems the commit fd1a5b04dfb8 is the culprit and 87915adc3f0a
> didn't fix all of the problems it caused.
>
> So it is better to complete 87915adc3f0a by making __flush_work()
> do lock_map_acquire(&work->lockdep_map) for both canceling and
> non-canceling cases.

The cross-release locking checks were removed by commit e966eaeeb623
("locking/lockdep: Remove the cross-release locking checks").

So fd1a5b04dfb8 was somewhat hasty. What it removed should be added back.
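
To illustrate the minimal change I have in mind, here is a rough userspace sketch, not kernel code. The struct definitions, the counting stubs for lock_map_acquire()/lock_map_release(), and the names flush_work_sketch(), noop_work_fn() and check_both_paths() are all hypothetical stand-ins made up for this example. It only shows that the lockdep annotation is taken unconditionally, whether or not the caller is the cancel path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
struct lockdep_map {
	int acquire_count;
	int release_count;
};

struct work_struct {
	void (*func)(struct work_struct *work);
	struct lockdep_map lockdep_map;
};

/* Counting stubs so the annotation can be observed in userspace. */
static void lock_map_acquire(struct lockdep_map *map)
{
	map->acquire_count++;
}

static void lock_map_release(struct lockdep_map *map)
{
	map->release_count++;
}

/* Placeholder work function (assumption, only used to make ->func non-NULL). */
static void noop_work_fn(struct work_struct *work)
{
	(void)work;
}

/*
 * Sketch of the proposed shape of __flush_work(): the annotation is no
 * longer guarded by "if (!from_cancel)". The barrier queueing and the
 * wait that the real function performs are omitted here.
 */
static bool flush_work_sketch(struct work_struct *work, bool from_cancel)
{
	(void)from_cancel;	/* no longer affects the lockdep annotation */

	if (!work->func)
		return false;

	lock_map_acquire(&work->lockdep_map);
	lock_map_release(&work->lockdep_map);

	return true;
}

/* Exercise both paths and check that each one records the annotation. */
static int check_both_paths(void)
{
	struct work_struct flushed = { .func = noop_work_fn };
	struct work_struct canceled = { .func = noop_work_fn };

	assert(flush_work_sketch(&flushed, false));
	assert(flush_work_sketch(&canceled, true));
	assert(flushed.lockdep_map.acquire_count == 1);
	assert(canceled.lockdep_map.acquire_count == 1);
	return 0;
}
```

In the real code only the "if (!from_cancel)" guard would go away. Everything else in __flush_work() would stay as it is.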