[patch amendment] ieee1394: survive a few seconds connection loss

From: Stefan Richter
Date: Fri Oct 10 2008 - 13:13:55 EST


On 19 Aug, Stefan Richter wrote:
> There are situations when nodes vanish from the bus and come back
> quickly thereafter:
> - When certain bus-powered hubs are plugged in,
> - when certain disk enclosures are switched from self-power to bus
> power or vice versa and break the daisy chain during the transition,
> - when the user unplugs a cable and quickly plugs it back in, e.g.
> to reorder a daisy chain (works on Mac OS X if done quickly enough),
> - when certain hubs temporarily malfunction during high bus traffic.
>
> The ieee1394 driver's nodemgr already contained a function to set
> vanished nodes aside into "limbo"; i.e. they wouldn't actually be
> deleted right away. (In fact, only unloading the driver or writing into
> an obscure sysfs attribute would delete them eventually.) If nodes
> reappeared later, they would be resurrected out of limbo.
>
> Moving nodes into and out of limbo was accompanied by calls to the
> .suspend() and .resume() driver methods of the drivers which were bound
> to the respective node's unit directories. Not only is this somewhat
> strange, given that these methods are primarily meant for power
> management; the sbp2 driver in particular does not implement .suspend()
> and .resume() at all.
>
> Hence sbp2 would be disconnected from devices in situations as listed
> above.
>
> We now:
> - leave drivers bound when nodes go into limbo,
> - call the drivers' .update() when nodes come out of limbo,
> - automatically delete in-limbo nodes 5 seconds after the last
> bus reset and bus rescan.
> - Because of the automatic removal, the now obsolete bus attribute
> /sys/bus/ieee1394/destroy_node is removed.
>
> This especially lets sbp2 survive brief disconnections. You can for
> example yank a disk's cable and plug it back in while reading the
> respective disk with dd, and dd will happily continue as if nothing
> happened.

Amendment: Reduce the timeout from 5 to 3 seconds. This is enough because
the timeout is restarted whenever another bus reset happens while it is
running.

Signed-off-by: Stefan Richter <stefanr@xxxxxxxxxxxxxxxxx>
---
drivers/ieee1394/nodemgr.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

Index: linux/drivers/ieee1394/nodemgr.c
===================================================================
--- linux.orig/drivers/ieee1394/nodemgr.c
+++ linux/drivers/ieee1394/nodemgr.c
@@ -1726,18 +1726,17 @@ static int nodemgr_host_thread(void *dat
 		/* Update some of our sysfs symlinks */
 		nodemgr_update_host_dev_links(host);
 
-		/* Sleep 5 seconds */
-		for (i = 0; i < 5000/100 ; i++) {
-			msleep_interruptible(100);
+		/* Sleep 3 seconds */
+		for (i = 3000/200; i; i--) {
+			msleep_interruptible(200);
 			if (kthread_should_stop())
 				goto exit;
 
 			if (generation != get_hpsb_generation(host))
 				break;
 		}
-
 		/* Remove nodes which are gone, unless a bus reset happened */
-		if (i == 5000/100)
+		if (!i)
 			nodemgr_remove_nodes_in_limbo(host);
 	}
 exit:


--
Stefan Richter
-=====-==--- =-=- -=-=-
http://arcgraph.de/sr/
