[patch 21/40] sgi-xp: eliminate false detection of no heartbeat

From: Greg KH
Date: Fri Jan 23 2009 - 01:24:22 EST


2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Dean Nelson <dcn@xxxxxxx>

commit 158bc69effbf96f59c01cdeb20f8d4c184e59f8e upstream.

After XPC has been up and running on multiple partitions for any length of
time, if XPC on one of the partitions is stopped and restarted (either by
an rmmod/insmod or a system restart), it is possible for the XPCs running
on the other partitions to falsely detect a lack of heartbeat from the XPC
that was just restarted. This false detection will occur if the restarted
XPC comes up within the five seconds preceding one of the other XPCs'
heartbeat checks (which occur once every twenty seconds).

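The detection of no heartbeat results in the detecting XPC deactivating
from the just-restarted XPC. Without this patch, the only remedy is to
restart one of the XPCs and hope that it doesn't hit this five-second
window on any of the other partitions.

This patch avoids the problem by recording last_heartbeat as one less than
the remote partition's current heartbeat value, so that the first heartbeat
check after a restart always sees a change.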
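
For illustration only, the race can be modelled in a few lines of
user-space C. This is a rough sketch, not the driver code: the struct
layouts and the helpers record_initial_heartbeat(), heartbeat_check() and
check_right_after_restart() are hypothetical, and it assumes the check
simply treats an unchanged heartbeat value as "no heartbeat".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's definitions. */
struct remote_vars {
	uint64_t heartbeat;		/* advanced periodically by the remote XPC */
};

struct partition {
	uint64_t last_heartbeat;	/* value remembered from the previous look */
};

/* Called when the remote partition's info is (re)read after it comes up. */
static void record_initial_heartbeat(struct partition *part,
				     const struct remote_vars *rv, bool patched)
{
	/* Patched: start one behind, so the first check must see a change. */
	part->last_heartbeat = patched ? rv->heartbeat - 1 : rv->heartbeat;
}

/* Assumed check: an unchanged value is treated as "no heartbeat". */
static bool heartbeat_check(struct partition *part,
			    const struct remote_vars *rv)
{
	if (part->last_heartbeat == rv->heartbeat)
		return false;		/* remote looks dead */

	part->last_heartbeat = rv->heartbeat;
	return true;			/* remote looks alive */
}

/*
 * Scenario from the description: the remote XPC restarts and the local
 * check runs before the remote heartbeat has advanced even once.
 */
static bool check_right_after_restart(bool patched)
{
	struct remote_vars rv = { .heartbeat = 0 };
	struct partition part;

	record_initial_heartbeat(&part, &rv, patched);
	return heartbeat_check(&part, &rv);
}
```

In this model check_right_after_restart(false) reports the remote as dead,
while check_right_after_restart(true) reports it as alive. The subtraction
is done on an unsigned value, so starting from a heartbeat of 0 simply
wraps around and still compares as different.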

Signed-off-by: Dean Nelson <dcn@xxxxxxx>
Signed-off-by: Robin Holt <holt@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>

---
drivers/misc/sgi-xp/xpc_sn2.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/misc/sgi-xp/xpc_sn2.c
+++ b/drivers/misc/sgi-xp/xpc_sn2.c
@@ -904,7 +904,7 @@ xpc_update_partition_info_sn2(struct xpc
 	dev_dbg(xpc_part, " remote_vars_pa = 0x%016lx\n",
 		part_sn2->remote_vars_pa);
 
-	part->last_heartbeat = remote_vars->heartbeat;
+	part->last_heartbeat = remote_vars->heartbeat - 1;
 	dev_dbg(xpc_part, " last_heartbeat = 0x%016lx\n",
 		part->last_heartbeat);

