Re: [Patch] rwsem: fix rwsem_is_locked() bug

From: Amerigo Wang
Date: Sun Oct 04 2009 - 23:23:00 EST


Andrew Morton wrote:
On Tue, 29 Sep 2009 23:19:02 -0400
Amerigo Wang <amwang@xxxxxxxxxx> wrote:

rwsem_is_locked() tests ->activity without taking any lock, so ->activity
must be kept consistent at all times. However, __rwsem_do_wake() breaks
this rule: it updates ->activity only after _all_ readers have been woken
up. A woken reader may therefore see a stale ->activity value, which makes
rwsem_is_locked() return the wrong result.
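
For context, the check under discussion amounts to a single unlocked read of ->activity. A minimal userspace sketch follows, assuming the spinlock-based rwsem encodes ->activity as a positive reader count, -1 for an active writer, and 0 when free. The struct and function names are hypothetical stand-ins, not the kernel's definitions.

```c
/* Hypothetical userspace model; not the kernel's struct rw_semaphore. */
struct sketch_rwsem {
	int activity;	/* assumed: >0 = active readers, -1 = writer, 0 = free */
};

/*
 * Models a lockless "is it locked?" test: it only reads ->activity,
 * so the answer is as good as the value stored at that instant.
 */
int sketch_rwsem_is_locked(const struct sketch_rwsem *sem)
{
	return sem->activity != 0;
}
```

Because no lock is taken, any window in which ->activity lags behind the real state is directly visible to callers of such a check.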

Brian has a kernel module that reproduces this; I can include it
if anyone needs it, with Brian's approval of course.

With this patch applied, I can't trigger that bug any more.


Changelog doesn't describe the bug well.

Sorry for my English. :-/


---
diff --git a/lib/rwsem-spinlock.c b/lib/rwsem-spinlock.c
index 9df3ca5..44e4484 100644
--- a/lib/rwsem-spinlock.c
+++ b/lib/rwsem-spinlock.c
@@ -49,7 +49,6 @@ __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
 {
 	struct rwsem_waiter *waiter;
 	struct task_struct *tsk;
-	int woken;
 
 	waiter = list_entry(sem->wait_list.next, struct rwsem_waiter, list);
 
@@ -78,24 +77,21 @@ __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
 
 	/* grant an infinite number of read locks to the front of the queue */
 dont_wake_writers:
-	woken = 0;
 	while (waiter->flags & RWSEM_WAITING_FOR_READ) {
 		struct list_head *next = waiter->list.next;
 
+		sem->activity++;
 		list_del(&waiter->list);
 		tsk = waiter->task;
 		smp_mb();
 		waiter->task = NULL;
 		wake_up_process(tsk);
 		put_task_struct(tsk);
-		woken++;
 		if (list_empty(&sem->wait_list))
 			break;
 		waiter = list_entry(next, struct rwsem_waiter, list);
 	}
 
-	sem->activity += woken;
-
 out:
 	return sem;
 }

So if I understand this correctly

- we have one or more processes sleeping in down_read(), waiting for access.

- we wake one or more processes up without altering ->activity

- they start to run and they do rwsem_is_locked(). This incorrectly
returns "false", because the waker process is still crunching away in
__rwsem_do_wake().

- the waker now alters ->activity, but it was too late.

And the patch fixes this by updating ->activity prior to waking the
sleeping processes. So when they run, they'll see a non-zero value of
->activity.
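
The sequence above can be sketched as a deterministic, single-threaded userspace model. This is only an illustration under stated assumptions: each woken reader is assumed to run and sample the lock state immediately after its wakeup, and every name here (model_sem, wake_readers_old, wake_readers_new) is hypothetical rather than kernel code. Each function returns how many woken readers saw the semaphore as unlocked.

```c
/* Hypothetical userspace model; not the kernel's struct rw_semaphore. */
struct model_sem {
	int activity;	/* assumed: >0 = active readers, 0 = free */
};

/* Lockless check, as a woken reader would perform it. */
int model_is_locked(const struct model_sem *sem)
{
	return sem->activity != 0;
}

/*
 * Old ordering: wake every reader first, add the count afterwards.
 * Each reader samples ->activity before the waker has updated it.
 */
int wake_readers_old(struct model_sem *sem, int nreaders)
{
	int woken = 0, wrong = 0;

	for (int i = 0; i < nreaders; i++) {
		/* reader i is woken here and runs immediately */
		if (!model_is_locked(sem))
			wrong++;
		woken++;
	}
	sem->activity += woken;	/* too late for the readers above */
	return wrong;
}

/*
 * Patched ordering: bump ->activity before each wakeup, so every
 * reader sees a non-zero value when it runs.
 */
int wake_readers_new(struct model_sem *sem, int nreaders)
{
	int wrong = 0;

	for (int i = 0; i < nreaders; i++) {
		sem->activity++;
		/* reader i is woken here and runs immediately */
		if (!model_is_locked(sem))
			wrong++;
	}
	return wrong;
}
```

Starting from a free semaphore with three waiting readers, the old ordering reports three wrong observations and the patched ordering reports none, while both leave ->activity at 3. The model deliberately ignores real concurrency and memory ordering; it only shows why the position of the update matters.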

Fair enough, I guess.


Yes, exactly.

But after reading David's comments, I realized that rwsem_is_locked()
has more problems; this patch fixes only one of them.

I will try another fix.


I don't know if we really need this in -stable. Do we expect that
there will be any real runtime bugs arising from this?

Not sure. Triggering this bug requires an extra kernel module,
so it probably doesn't affect the kernel in practice.

Thanks!