Re: NFS in Linux 2.2.1

Trond Myklebust (trond.myklebust@fys.uio.no)
02 Feb 1999 00:45:51 +0100


hjl@lucon.org (H.J. Lu) writes:

> The bad news is when the sync write enabled for the Linux kernel
> NFS server, the NFS write performance on the Linux server drops
> dramatically. The client write speed slows down from around 900KB/s
> to around 60KB/s. It seems something is wrong in the Linux kernel
> when the kernel NFS server does sync write. Does anyone have any
> ideas on that?
>

It could be related to the RPC congestion avoidance algorithm. It
reduces the number of available RPC slots far too rapidly in my
opinion, halving the number of free slots for each message
timeout. This would certainly fit the numbers you quote.

In writing the 8k-wsize patch, I found an enormous improvement when I
set the minimum number of free slots to 4 (this apparently is the
'rule of thumb' minimum number of nfsiod threads as practiced by most
Unices).

In addition, there is a bug in the timeout handling code: the timeout
of a backlogged event (one that has not yet been allocated a slot)
still causes a full reconnection (via the portmapper) to the server.

Please try the appended patch, and see if it helps.

Cheers,
Trond

diff -urN linux-2.2.1/include/linux/sunrpc/xprt.h linux/include/linux/sunrpc/xprt.h
--- linux-2.2.1/include/linux/sunrpc/xprt.h Thu Jan 28 21:41:21 1999
+++ linux/include/linux/sunrpc/xprt.h Tue Feb 2 00:39:15 1999
@@ -42,7 +42,7 @@
#define RPC_MAXREQS (RPC_MAXCONG + 1)
#define RPC_CWNDSCALE 256
#define RPC_MAXCWND (RPC_MAXCONG * RPC_CWNDSCALE)
-#define RPC_INITCWND RPC_CWNDSCALE
+#define RPC_INITCWND (RPC_MAXCONG * RPC_CWNDSCALE >> 1)
#define RPCXPRT_CONGESTED(xprt) \
((xprt)->cong >= ((xprt)->nocong? RPC_MAXCWND : (xprt)->cwnd))

diff -urN linux-2.2.1/net/sunrpc/clnt.c linux/net/sunrpc/clnt.c
--- linux-2.2.1/net/sunrpc/clnt.c Tue Oct 6 18:39:44 1998
+++ linux/net/sunrpc/clnt.c Tue Feb 2 00:38:10 1999
@@ -600,7 +600,7 @@
printk("%s: task %d can't get a request slot\n",
clnt->cl_protname, task->tk_pid);
}
- if (clnt->cl_autobind)
+ if (req && clnt->cl_autobind)
clnt->cl_port = 0;

minor_timeout:
diff -urN linux-2.2.1/net/sunrpc/xprt.c linux/net/sunrpc/xprt.c
--- linux-2.2.1/net/sunrpc/xprt.c Wed Jan 20 22:44:53 1999
+++ linux/net/sunrpc/xprt.c Tue Feb 2 00:40:13 1999
@@ -306,8 +306,8 @@
"time %ld ms\n", xprt->cong, xprt->cwnd, cwnd,
(xprt->congtime-jiffies)*1000/HZ);
} else if (result == -ETIMEDOUT) {
- if ((cwnd >>= 1) < RPC_CWNDSCALE)
- cwnd = RPC_CWNDSCALE;
+ if ((cwnd >>= 1) < 4*RPC_CWNDSCALE)
+ cwnd = 4*RPC_CWNDSCALE;
xprt->congtime = jiffies + ((cwnd * HZ) << 3) / RPC_CWNDSCALE;
dprintk("RPC: cong %ld, cwnd was %ld, now %ld, "
"time %ld ms\n", xprt->cong, xprt->cwnd, cwnd,
@@ -1325,6 +1325,7 @@
xprt->prot = proto;
xprt->stream = (proto == IPPROTO_TCP)? 1 : 0;
xprt->cwnd = RPC_INITCWND;
+ xprt->congtime = jiffies;
#ifdef SOCK_HAS_USER_DATA
inet->user_data = xprt;
#else

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/