Re: bdflush/rpciod high CPU utilization, profile does not make sense

From: Jakob Oestergaard
Date: Tue Apr 19 2005 - 14:49:09 EST


On Tue, Apr 12, 2005 at 11:28:43AM +0200, Jakob Oestergaard wrote:
...
>
> But still, guys, it is the *same* server with tg3 that runs well with a
> 2.4 client but poorly with a 2.6 client.
>
> Maybe I'm just staring myself blind at this, but I can't see how a
> general problem on the server (such as packet loss, latency or whatever)
> would cause no problems with a 2.4 client but major problems with a 2.6
> client.

Another data point:

I upgraded my mom's machine from an earlier 2.6 (don't remember which,
but I can find out) to 2.6.11.6.

It mounts a home directory from a 2.6.6 NFS server - the client and
server are on a hub'ed 100Mbit network.

On the earlier 2.6 client I/O performance was as one would expect on
hub'ed 100Mbit - meaning, not exactly stellar, but you'd get around 4-5
MB/sec and decent interactivity.

The typical workload here is storing or retrieving large TIFF files on
the client, while working with other things in KDE. So, if the
large-file NFS I/O causes NFS client stalls, it will be noticeable on the
desktop (probably as Konqueror or whatever is accessing configuration or
cache files).

With 2.6.11.6 the client is virtually unusable when large files are
transferred. A "df -h" will hang on the mounted filesystem for several
seconds, and I have my mom on the phone complaining that various windows
won't close and that her machine is too slow (*again*; it's been no more
than half a year since she got the new P4) ;)
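One cheap thing to check next time the hang happens is whether the
stalls coincide with RPC retransmissions on the client (a rough sketch -
I haven't run this on her box yet):

  # while a large TIFF copy to the NFS mount is running:
  nfsstat -rc               # client-side RPC stats; watch the "retrans" counter
  cat /proc/net/rpc/nfs     # the raw counters behind nfsstat
  vmstat 1                  # does 'sy' or 'wa' spike while "df -h" hangs?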

Now there are plenty of things one could start optimizing: RPC over TCP,
using a switch or crossover cable instead of the hub, etc. etc.
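For instance, switching the mount to TCP with larger block sizes would
just be an fstab change on the client - something like the line below (a
sketch only; the server name and export path are placeholders):

  server:/export/home  /home  nfs  proto=tcp,rsize=32768,wsize=32768,hard,intr  0 0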

However, what changed here was the client kernel going from an earlier
2.6 to 2.6.11.6.

A lot happened to the NFS client in 2.6.11 - I wonder if any of those
patches are worth trying to revert? I have several setups that suck
currently, so I'm willing to try a thing or two :)

I would try
---
<trond.myklebust@xxxxxxxxxx>
RPC: Convert rpciod into a work queue for greater flexibility.
Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxx>
---
if no one has a better idea... But that's just a hunch, based solely on
my observation of rpciod being a CPU hog on one of the earlier client
systems. I didn't observe large 'sy' times in vmstat on this client
while it hung on NFS though...
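
Concretely, if someone can point me at the exact changeset, backing it
out and re-testing would go roughly like this (a sketch; the patch file
name below is made up):

  cd linux-2.6.11.6
  patch -R -p1 --dry-run < rpciod-workqueue.patch   # check it reverts cleanly
  patch -R -p1 < rpciod-workqueue.patch
  # rebuild, reboot the client, then repeat the large-file copy while watching:
  top -b -d 5 -n 2 | grep rpciod
  vmstat 5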

Any suggestions would be greatly appreciated,

--

/ jakob
