Re: VM issue causing high CPU loads

From: Trond Myklebust
Date: Thu Sep 03 2009 - 09:01:42 EST


On Wed, 2009-09-02 at 17:06 -0700, Andrew Morton wrote:
> On Mon, 31 Aug 2009 22:39:20 +0200
> Yohan <ytordjman@xxxxxxxxxxxx> wrote:
>
> > Yohan wrote:
> > > Andrew Morton wrote:
> > >> On Mon, 24 Aug 2009 16:23:22 +0200
> > >> Yohan <kernel@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > >>> Hi,
> > >>>
> > >>> Does anyone have an idea about this:
> > >>>
> > >>> http://bugzilla.kernel.org/show_bug.cgi?id=14024
> > >>>
> > >> Please generate a kernel profile to work out where all the CPU time is
> > >> being spent. Documentation/basic_profiling.txt is a starting point.
> > >>
> > > I posted some new reports; it seems that the problem is in
> > > rpcauth_lookup_credcache ...
>
> Thanks, that helps a lot.
>
> > > For information, this is an IMAP mail server that mounts ~10 NetApps
> > > over ~300 mountpoints.
> > I saw this: http://patchwork.kernel.org/patch/24747/
>
> I wonder what happened with Miquel's patch?

At the time, I asked him to split out the various changes into several
patches.

His patch did a lot of different things that would impact workloads in
different ways. For instance, while increasing the hash table size is
not likely to cause a huge performance degradation for most people, the
change that decreases the garbage collection timeout is very likely to
cause issues (particularly with RPCSEC_GSS setups)...
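
To put rough numbers on the hash table size change, here is a minimal
sketch of the arithmetic only. It is not code from the kernel tree: the
helper names are made up for illustration, and the even spread of
entries across chains is an idealizing assumption. The two bit widths
(4 and 12) are the values from the diff quoted below.

```c
#include <assert.h>

/* Number of hash chains for a given number of hash bits. */
static unsigned long buckets_for_bits(unsigned int bits)
{
    assert(bits < 32);          /* keep the shift well defined */
    return 1UL << bits;
}

/*
 * Average chain length if `entries` cached credentials were spread
 * perfectly evenly over the chains (an idealization; real hashes and
 * real uid distributions will be lumpier).
 */
static double avg_chain_len(unsigned long entries, unsigned int bits)
{
    return (double)entries / (double)buckets_for_bits(bits);
}
```

With 4 bits there are 16 chains and with 12 bits there are 4096, so for
the same number of cached entries the average chain that a lookup has to
walk is 256 times shorter in the larger table.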

> > I did only:
> >
> > --- linux-2.6.27.21/include/linux/sunrpc/auth.h 2009-03-23 23:04:09.000000000 +0100
> > +++ linux-2.6.27.21/include/linux/sunrpc/auth.h 2009-05-19 16:02:35.000000000 +0200
> > @@ -62,8 +62,12 @@
> > */
> > - #define RPC_CREDCACHE_HASHBITS 4
> > + #define RPC_CREDCACHE_HASHBITS 12
> >
> >
> > And I have been testing it in production since Sunday: only 36% of one
> > core is used by the system,
> > versus more than 3 cores used by the system on another server that did a
> > drop_caches in the morning...
> >
>
> OK, but it's still pretty bad. Let's tell the NFS guys.
>
> In http://bugzilla.kernel.org/show_bug.cgi?id=14024 we appear to have a
> major meltdown caused by the linear search in
> rpcauth_lookup_credcache() with Yohan's workload.
>

OK. Could we please have some more details about the actual workload
involved here?
As far as I can see, there is no RPCSEC_GSS involved, so credentials
should never expire. They will be reused as long as processes aren't
switching between thousands and thousands of different combinations of
uid, gid and groups.
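
To illustrate why the number of distinct credentials matters, here is a
toy userspace sketch of a chained cache with a linear search per chain.
It is not the sunrpc implementation: the struct and function names are
hypothetical, the key is reduced to (uid, gid), and hashing on the low
bits of the uid is an assumption made only to keep the example
predictable.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_cred {
    unsigned int uid;
    unsigned int gid;
    struct toy_cred *next;
};

struct toy_cache {
    unsigned int bits;          /* table has 1 << bits chains */
    struct toy_cred **chains;
};

/* Allocate an empty table; returns 0 on success, -1 on failure. */
static int toy_cache_init(struct toy_cache *c, unsigned int bits)
{
    assert(bits <= 16);
    c->bits = bits;
    c->chains = calloc((size_t)1 << bits, sizeof(*c->chains));
    return c->chains ? 0 : -1;
}

/* Hypothetical hash: just the low bits of the uid. */
static unsigned int toy_hash(unsigned int uid, unsigned int bits)
{
    return uid & ((1u << bits) - 1);
}

/*
 * Find (uid, gid) in its chain, adding it at the head on a miss.
 * Returns the number of entries compared during the linear walk,
 * or -1 if an allocation fails.
 */
static int toy_lookup(struct toy_cache *c, unsigned int uid, unsigned int gid)
{
    struct toy_cred **head = &c->chains[toy_hash(uid, c->bits)];
    struct toy_cred *p;
    int walked = 0;

    for (p = *head; p; p = p->next) {
        walked++;
        if (p->uid == uid && p->gid == gid)
            return walked;      /* hit: reuse the cached entry */
    }

    p = malloc(sizeof(*p));     /* miss: cache a new entry */
    if (!p)
        return -1;
    p->uid = uid;
    p->gid = gid;
    p->next = *head;
    *head = p;
    return walked;
}
```

In this toy, a handful of distinct (uid, gid) pairs keeps every chain
short whatever the table size. Once the number of distinct pairs is many
times the number of chains, each lookup walks a long list, which is the
kind of behaviour a profile would attribute to the lookup function.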


