Re: [NFS] NFS clients behind a masqueraded gateway

From: Rob Landley (landley@trommello.org)
Date: Thu Apr 25 2002 - 10:22:17 EST


On Thursday 25 April 2002 01:24 pm, Steffen Persvold wrote:
> Hi again,
>
> I hate to bother you guys again with this problem, but do you have any
> ideas (haven't received any response so far) ?
>
> Answers highly appreciated.
>
> On Thu, 18 Apr 2002, Steffen Persvold wrote:
> > Hi all,
> >
> > I'm experiencing some problems with a cluster setup. The cluster is set
> > up in a way that you have a frontend machine configured as a masquerading
> > gateway and all the compute nodes behind it on a private network (i.e the
> > frontend has two network interfaces). User home directories and also
> > other data directories which should be available to the cluster (i.e
> > statically mounted in the same location on both frontend and nodes) are
> > located on external NFS servers (IRIX and Linux servers). This seems to
> > work fine when the cluster is in use, but if the cluster is idle for some
> > time (e.g over night), the NFS directories has become unavailable and
> > trying to reboot the frontend results in a complete hang when it tries to
> > unmount the NFS directories (it hangs in a fuser command). The frontend
> > and all the nodes are running RedHat 7.2, but with a stock 2.4.18 kernel
> > (plus Trond's seekdir patch, thanks for the help BTW).
> >
> > Ideas anyone ?
> >
> > Thanks in advance,
>
> Regards,

I've run into a similar problem combining ssh with IP Masquerading.
Specificallly, IP Masquerading has a timeout: after a certain period of
inactivity it forgets about a connection. (It doesn't seem to clean up the
connection and send close packets, it just silently drops it from its
internal routing tables.)

The really FUN part is that, depending on how your firewall rules are set up,
when you try to send a packet along an expired NAT connection it may just
silently drop it (rather than sending back a "what, are you NUTS?" packet to
let you know that's not a good connection anymore).

I believe the NAT Masquerading timeout defaults to 15 minutes (it probably
has a configuration entry under /proc somewhere, but am not caffienated
enough to remember where. Try Documentation/filesystems/procfs.txt in the
linux source tarball.)

It's also possible that something like "echo 600 >
/proc/sys/net/ipv4/tcp_keepalive_time" might help. (Make keepalive timeout
happen faster than IP masquerading timeout.) You never know. :)

Rob
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Apr 30 2002 - 22:00:11 EST