Re: Failover in NFS

From: Ragnar Kjørstad (kernel@ragnark.vestdata.no)
Date: Mon Nov 18 2002 - 17:22:30 EST


On Mon, Nov 18, 2002 at 04:11:06PM -0600, Jesse Pollard wrote:
> > No, you need to move the IP-address from the old nfs-server to the new
> > one. Then to the clients it will look like a regular reboot. (Check out
> > heartbeat, at http://www.linux-ha.org/)
> >
> > You need to make sure that NFS is using the shared ip (the one you move
> > around) rather than the fixed ip. (I assume you will have a fixed ip on
> > each host in addition to the one you move around). Also, you need to put
> > /var/lib/nfs on shared storage. See the archive for more details.
>
> It would actually be better to use two floating IP numbers. That way during
> normal operation, both servers would be functioning simultaneously
> (based on the shared storage on two nodes).
>
> Then during failover, the floating IP of the failed node is activated on the
> remaining node (total of 3 IP numbers now, one real, two floating). The NFS
> recovery cycle should then cause the clients to remount the filesystem from
> the backup server.

Yes, that would be better.
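
To make the address bookkeeping in that proposal concrete, here is a
toy model of the takeover step. It is only a sketch: the Node class,
the fail_over() helper and all addresses are made up for illustration,
and nothing here talks to a real network interface or to heartbeat.

```python
# Toy model of the two-floating-IP scheme; all names and addresses are
# hypothetical and nothing here configures a real interface.
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    fixed_ip: str
    floating_ips: set = field(default_factory=set)
    alive: bool = True

    def addresses(self):
        """All addresses currently configured on this node."""
        return {self.fixed_ip} | self.floating_ips


def fail_over(failed, survivor):
    """Move every floating address of the failed node to the survivor."""
    failed.alive = False
    survivor.floating_ips |= failed.floating_ips
    failed.floating_ips = set()


# Normal operation: each node has one fixed and one floating address.
node_a = Node("a", "10.0.0.1", {"10.0.0.101"})
node_b = Node("b", "10.0.0.2", {"10.0.0.102"})

# Node a dies; node b now carries one fixed and two floating addresses.
fail_over(node_a, node_b)
print(sorted(node_b.addresses()))
```

The survivor ends up with three addresses, as in the description
quoted above. Moving the addresses is the easy part.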

But it would not work as described above. There are some important
limitations here:

- I assumed that /var/lib/nfs is shared. If you want two servers to
  be active at once you need a different way to share lock-data.

- AFAIK there is no way for statd to service two IPs at once.
  It will (AFAIK) bind to both addresses, but the problem is the
  message it sends out at startup, which includes the IP of
  the local host.
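
The statd limitation can be illustrated with another toy model. This
is not the real statd protocol. It assumes (and this is my assumption,
not something I have verified) that a client only starts lock recovery
when the address in the startup message matches the address it mounted
from. The function name, client names and addresses are hypothetical.

```python
# Toy model only: this is not the real statd wire protocol.
# Assumption: a client reacts to a restart notification only if the
# address in it matches the server address the client mounted from.

def clients_that_reclaim(monitored, notify_name):
    """Return the clients whose monitored address matches the notification."""
    return sorted(c for c, server in monitored.items() if server == notify_name)


# Which floating address each (hypothetical) client mounted from.
monitored = {
    "client1": "10.0.0.101",
    "client2": "10.0.0.102",
}

# After failover one statd serves both floating addresses but, as
# assumed here, announces its restart under a single address.
reclaiming = clients_that_reclaim(monitored, notify_name="10.0.0.102")
print(reclaiming)
```

Under that assumption only the clients of one floating address would
notice the restart; the clients of the other address would never
reclaim their locks.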

Neither limitation is a law of nature. They can be fixed. I think there
is work going on to change the way locks are stored, and I'm sure the
second problem can be solved as well.

There may be solutions out there already. E.g. maybe Lifekeeper or
Convolo include better support for this?

-- 
Ragnar Kjørstad
Big Storage


