Re: [3.0-rc0 Regression]: legacy vsyscall emulation increases userCPU time by 20%

From: Dave Chinner
Date: Mon Aug 01 2011 - 09:25:57 EST


On Mon, Aug 01, 2011 at 08:29:47AM -0400, Andrew Lutomirski wrote:
> On Sun, Jul 31, 2011 at 7:01 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > On Fri, Jul 29, 2011 at 09:26:19AM -0400, Andrew Lutomirski wrote:
> >> On Fri, Jul 29, 2011 at 8:17 AM, Andrew Lutomirski <luto@xxxxxxx> wrote:
> >> > On Fri, Jul 29, 2011 at 3:24 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >> >> On Thu, Jul 28, 2011 at 11:30:49PM -0400, Andrew Lutomirski wrote:
> >> >>> >
> >> >>> > Assuming this is the problem, can this be fixed without requiring
> >> >>> > the whole world having to wait for the current glibc dev tree to
> >> >>> > filter down into distro repositories?
> >> >>>
> >> >>> How old is your glibc?  gettimeofday has used the vdso since:
> >> >>
> >> >> It's 2.11 on the test machine, whatever that translates to. I
> >> >> haven't really changed the base userspace for about 12 months
> >> >> because if I do I invalidate all my historical benchmark results
> >> >> that I use for comparisons.
> >> >
> >> > 2.11 is from 2009 and appears to contain that commit.  Does your
> >> > workload call time() very frequently?  That's the largest slowdown.
> >> > With the old code, time() took 4-5 ns and with the new code time() is
> >> > about as slow as gettimeofday().  I suggested having a config option
> >> > to allow time() to stay fast until glibc 2.14 became widespread, but a
> >> > few other people disagreed.
> >>
> >> *sigh*
> >>
> >> fs_mark: fs_mark.o lib_timing.o
> >>         ${CC} -static -o fs_mark fs_mark.o lib_timing.o
> >>
> >> Even brand-new glibc still issues vsyscalls when statically linked,
> >> and Ulrich has said [1] that he doesn't care that much about
> >> performance of statically linked code.
> >>
> >> How bad would it be to just remove the -static from the makefile?
> >
> > Results in 270s +-5s user CPU time, so user CPU time is still ~10%
> > up on 3.0 numbers.  IOWs, a non-static link roughly halves the
> > regression but doesn't get rid of it.
>
> Are you sure? I stuck a trace event in do_emulate_vsyscall and it's
> not getting hit at all in fs_mark, at least on my system. I'll send
> out the patch tomorrow.

It may be other changes to kernel code that are causing the rest of
the regression. Kernel code that blows the CPU caches (e.g. direct
reclaim LRU scanning) can have a major effect on userspace CPU
time, so it's probably some secondary effect like that I'm seeing.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx