Re: [NFS] 2.6.23-rc1-mm2

From: Johannes Berg
Date: Mon Aug 06 2007 - 04:08:12 EST


On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:

> To avoid a possible confusion: it is still OK if work->func() flushes
> its own workqueue, so strictly speaking this trace is false positive,
> but it would be very nice if we can get rid of this practice.

I just had a thought: we could get rid of this warning by using a
read-lock annotation here. That way, flushing from within a work function
(which lockdep would see as a read-after-read recursive lock) won't
trigger this warning. Patch below. This would, however, also get rid of
any warnings for run_workqueue recursion, which again we may or may not
want; the code indicates that recursion should be allowed up to a depth
of three.

However, the question whether we should allow flush_workqueue from
within a struct work is mainly an API policy issue; it doesn't hurt to
flush a workqueue from within a work, but it is probably nearer the
intent to use targeted cancel_work_sync() or such. OTOH, one could
imagine situations where multiple different work structs are on that
workqueue belonging to the same subsystem and then the general
flush_scheduled_work() call is the only way to guarantee nothing is
scheduled at a given point... I don't feel qualified to make the
decision for or against allowing this use of the API at this point.

Marc, do you have an easy way to trigger this warning? Could you verify
that it goes away with the patch below applied?

johannes

---
kernel/workqueue.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- wireless-dev.orig/kernel/workqueue.c 2007-08-06 08:11:23.297846657 +0200
+++ wireless-dev/kernel/workqueue.c 2007-08-06 08:19:54.727846657 +0200
@@ -272,7 +272,7 @@ static void run_workqueue(struct cpu_wor
 
 		BUG_ON(get_wq_data(work) != cwq);
 		work_clear_pending(work);
-		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
 		lock_acquire(&lockdep_map, 0, 0, 0, 2, _THIS_IP_);
 		f(work);
 		lock_release(&lockdep_map, 1, _THIS_IP_);
@@ -395,7 +395,7 @@ void fastcall flush_workqueue(struct wor
 	int cpu;
 
 	might_sleep();
-	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+	lock_acquire(&wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
 	lock_release(&wq->lockdep_map, 1, _THIS_IP_);
 	for_each_cpu_mask(cpu, *cpu_map)
 		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
@@ -779,7 +779,7 @@ static void cleanup_workqueue_thread(str
 	if (cwq->thread == NULL)
 		return;
 
-	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
 	lock_release(&cwq->wq->lockdep_map, 1, _THIS_IP_);
 
 	flush_cpu_workqueue(cwq);
