[PATCH] del_timer() vs. mod_timer() SMP race

From: Benjamin Herrenschmidt
Date: Sun Nov 21 2004 - 21:47:25 EST


Hi !

We just spent some days fighting a rare race in one of the distro's who backported
some of timer.c from 2.6 to 2.4 (though they missed a bit).

The actual race we found didn't happen in 2.6 _but_ code inspection showed that a
similar race is still present in 2.6, explanation below:

Code removing a timer from a list (run_timers or del_timer) takes that CPU list
lock, does list_del, then timer->base = NULL.

It is mandatory that this timer->base = NULL is visible to other CPUs only after
the list_del() is complete. If not, then mod timer could see it NULL, thus take it's
own CPU list lock and not the one for the CPU the timer was beeing removed from the
list, and thus the list_add in mod_timer() could race with the list_del() from
run_timers() or del_timer().

Our race happened with run_timers(), which _DOES_ contain a proper smp_wmb() in the
right spot in 2.6, but didn't in the "backport" we were fighting with.

However, del_timer() doesn't have such a barrier, and thus is subject to this race in
2.6 as well. This patch fixes it.

Signed-off-by: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>

Index: linux-work/kernel/timer.c
===================================================================
--- linux-work.orig/kernel/timer.c 2004-11-22 11:50:59.000000000 +1100
+++ linux-work/kernel/timer.c 2004-11-22 13:35:38.928448032 +1100
@@ -308,6 +308,7 @@
goto repeat;
}
list_del(&timer->entry);
+ smp_wmb();
timer->base = NULL;
spin_unlock_irqrestore(&base->lock, flags);



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/