[PATCH] VFS: br_write_lock locks on possible CPUs rather than online CPUs

From: mengcong
Date: Sun Dec 18 2011 - 22:36:37 EST


On a heavily loaded system, frequently taking CPUs online and offline
causes the kernel to detect soft lockups on multiple CPUs. The detailed
bug report is at https://lkml.org/lkml/2011/8/24/185.

The root cause is that the brlock functions br_write_lock() and
br_write_unlock() only lock/unlock the per-CPU spinlocks of CPUs that
are online. If a CPU's spinlock is taken while the CPU is online and the
CPU then goes offline, any unlock operation that happens while it is
offline will not touch that spinlock; when the CPU comes online again,
its brlock state is incorrect. This has been verified on the current
kernel.
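
The failure sequence can be illustrated with a small user-space model.
This is only a sketch of the idea, not kernel code: struct toy_brlock,
toy_write_set(), toy_set_online() and simulate_hotplug_race() are
hypothetical names, plain booleans stand in for the per-CPU spinlocks
and the online mask, and NR_CPUS is fixed at 4 for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical number of possible CPUs */

struct toy_brlock {
	bool locked[NR_CPUS];	/* stand-in for each per-CPU spinlock */
	bool online[NR_CPUS];	/* stand-in for the online CPU mask */
};

/* Set the lock state on online CPUs only, or on all possible CPUs. */
static void toy_write_set(struct toy_brlock *l, bool lock, bool all_possible)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (all_possible || l->online[cpu])
			l->locked[cpu] = lock;
	}
}

/* Model a CPU hotplug event; the lock state is deliberately untouched. */
static void toy_set_online(struct toy_brlock *l, int cpu, bool online)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	l->online[cpu] = online;
}

/*
 * Write-lock, offline the last CPU, write-unlock, online it again.
 * Returns the number of per-CPU locks that are still held afterwards.
 */
int simulate_hotplug_race(bool all_possible)
{
	struct toy_brlock l = { { false }, { false } };
	int stuck = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		l.online[cpu] = true;

	toy_write_set(&l, true, all_possible);	/* models br_write_lock() */
	toy_set_online(&l, NR_CPUS - 1, false);	/* CPU goes offline while locked */
	toy_write_set(&l, false, all_possible);	/* models br_write_unlock() */
	toy_set_online(&l, NR_CPUS - 1, true);	/* CPU comes back online */

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		stuck += l.locked[cpu];

	return stuck;
}
```

In this model, simulate_hotplug_race(false) (online CPUs only) leaves
one per-CPU lock held after the unlock, while simulate_hotplug_race(true)
(all possible CPUs) leaves none.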

I can reproduce this bug on an unmodified 3.1 kernel. With this patch
applied, I ran an 8-hour test (using the test script provided by the bug
reporter) and no soft lockup occurred.

Signed-off-by: Cong Meng <mc@xxxxxxxxxxxxxxxxxx>
Reported-by: Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx>
---
include/linux/lglock.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/lglock.h b/include/linux/lglock.h
index f549056..08b9e84 100644
--- a/include/linux/lglock.h
+++ b/include/linux/lglock.h
@@ -27,8 +27,8 @@
#define br_lock_init(name) name##_lock_init()
#define br_read_lock(name) name##_local_lock()
#define br_read_unlock(name) name##_local_unlock()
-#define br_write_lock(name) name##_global_lock_online()
-#define br_write_unlock(name) name##_global_unlock_online()
+#define br_write_lock(name) name##_global_lock()
+#define br_write_unlock(name) name##_global_unlock()

#define DECLARE_BRLOCK(name) DECLARE_LGLOCK(name)
#define DEFINE_BRLOCK(name) DEFINE_LGLOCK(name)
--
1.7.5.4


