Re: Failover in NFS

From: Bill Davidsen (davidsen@tmr.com)
Date: Thu Nov 21 2002 - 15:58:29 EST


On Mon, 18 Nov 2002, Jesse Pollard wrote:

> It would actually be better to use two floating IP numbers. That way during
> normal operation, both servers would be functioning simultaneously
> (based on the shared storage on two nodes).
>
> Then during failover, the floating IP of the failed node is activated on the
> remaining node (total of 3 IP numbers now, one real, two floating). The NFS
> recovery cycle should then cause the clients to remount the filesystem from
> the backup server.
>
> When the failed node is recovered, the active server should then disable the
> floating IP associated with the recovered server, causing only the mounts
> using that IP number to fall back to the proper node, balancing the load
> again.
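
To make the scheme above concrete, here is a minimal Python sketch of
which node should hold which floating address in each state. The node
and address names (node-a, float-a, and so on) are made up for
illustration, and actually bringing an alias up or down is left out.

```python
from typing import Dict, Optional

# Hypothetical names: each node normally owns exactly one floating address.
HOME = {"float-a": "node-a", "float-b": "node-b"}


def place_floating_ips(alive: Dict[str, bool]) -> Dict[str, Optional[str]]:
    """Return, for each floating address, the node that should hold it."""
    placement: Dict[str, Optional[str]] = {}
    for fip, home in HOME.items():
        if alive.get(home, False):
            # Normal operation, or the home node has recovered:
            # the address lives on (or falls back to) its own node.
            placement[fip] = home
        else:
            # Failover: hand the address to a surviving node, if there is one.
            survivors = [n for n in HOME.values()
                         if n != home and alive.get(n, False)]
            placement[fip] = survivors[0] if survivors else None
    return placement
```

With node-b down, both float-a and float-b land on node-a; once node-b
is marked alive again, float-b moves back and the load is balanced.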

That works for stateless connections, but for stateful connections like
POP, NNTP, SMTP, etc., you will lose all the connections that are
currently active.

A proper solution is to have the recovered server accept ESTABLISHED and
--syn packets, then DNAT the rest to the fallback server, while the
fallback server takes any new (--syn) packets and does DNAT to the
recovered server.
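
The per-packet decision can be sketched in Python, purely to pin down
the logic. This is not an iptables ruleset: the Packet type, the
connection key, and the action strings are all hypothetical, and
"established" here means connections that the box itself already knows
about.

```python
from dataclasses import dataclass
from typing import Set, Tuple

# Hypothetical connection key: (src_ip, src_port, dst_ip, dst_port).
ConnKey = Tuple[str, int, str, int]


@dataclass(frozen=True)
class Packet:
    key: ConnKey
    syn: bool  # True only for an initial SYN, i.e. a new connection


def recovered_server_action(pkt: Packet, established: Set[ConnKey]) -> str:
    """Decision on the recovered server for traffic to its floating IP."""
    if pkt.syn or pkt.key in established:
        # New connections, and ones this box already tracks, stay here.
        return "ACCEPT"
    # Everything else belongs to a connection opened on the fallback
    # server during the outage, so hand it back there.
    return "DNAT_TO_FALLBACK"


def fallback_server_action(pkt: Packet) -> str:
    """Decision on the fallback server for the same floating IP."""
    if pkt.syn:
        # New connections belong on the recovered server.
        return "DNAT_TO_RECOVERED"
    # Non-SYN packets are for connections it accepted during the outage.
    return "ACCEPT"
```

The point of the split is that connections opened during the outage
drain off the fallback server on their own, while nothing new lands
there.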

I'm not sure iptables can do this right; you probably need a program to
get the DNAT part just right. One of the experimental patches may add
that capability, since people do load balancing with Linux. It might take
source routing, and it will certainly be harder than just turning off the
alias ;-)

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.



