Re: [RFC 4/5] x86, perf: implements lwp-perf-integration (rc1)

From: Vince Weaver
Date: Tue Dec 20 2011 - 17:48:02 EST


On Tue, 20 Dec 2011, Ingo Molnar wrote:

>
> * Vince Weaver <vweaver1@xxxxxxxxxxxx> wrote:
>
> > On Tue, 20 Dec 2011, Ingo Molnar wrote:
>
> > > Granted, LWP was mis-designed to quite a degree, those AMD
> > > chip engineers should have talked to people who understand
> > > how modern PMU abstractions are added to the OS kernel
> > > properly.
> >
> > You do realize that LWP was probably in design 5+ years ago,
> > at a time when most Linux kernel developers wanted nothing to
> > do with perf counters, and thus anyone they did contact for
> > help would have been from the since-rejected perfctr or
> > perfmon2 camp.
>
> That does not really contradict what i said.

Well I'm just assuming that when you say "people who understand
how modern PMU abstractions are added to the OS kernel properly"
you mean yourself and the perf_event crew.

There are many other schools of thought on what kernel PMU abstractions
should look like, and I'm sure AMD conferred with them.


> > Running LWP through the kernel is a foolish idea. Does anyone
> > have any numbers on what that would do to overhead?
>
> At most an LLWPCB instruction is needed.

You're saying that all the crazy kernel stuff you're proposing will have
no extra overhead when compared to just implementing the proper xsave
context switch code?

> > perf_events creates huge overhead when doing self monitoring.
> > For simple self-monitoring counter reads it is an *order of
> > magnitude* worse than doing the same thing with perfctr.
>
> Only if you are comparing apples to oranges: if you compare a
> full kernel based read of self-profiling counters with an RDPMC
> instruction.

The benchmarks I posted show measurements getting *real data* from the
counters. Yes, on perfctr this is mostly just a rdpmc call plus a quick
access to some mmap'd memory to make sure the context is valid.
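
To make the shape of that read path concrete, here is a rough sketch in C.
Everything in it is hypothetical: the struct layout, the field names, and
the function pointer standing in for the rdpmc instruction are assumptions
for illustration only, not perfctr's or perf_event's actual interface. It
only shows the general idea of "read the counter in user space, then check
shared state to make sure the context did not change underneath you".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-context page that a kernel driver
 * would mmap into user space. Field names are illustrative only. */
struct fake_ctx_page {
    volatile uint32_t seq;    /* changed by the "kernel" on every update */
    uint32_t          active; /* non-zero while the counter is live */
    uint64_t          base;   /* count accumulated before the last switch */
};

/* Stand-in for the rdpmc instruction, passed as a function pointer so
 * the sketch runs anywhere, even where user-space rdpmc is disabled. */
typedef uint64_t (*hw_read_fn)(void);

/* Fake "hardware" counter that always reports 42 events. */
static uint64_t fake_hw_read(void)
{
    return 42;
}

/* Self-monitoring read with no system call: sample the sequence number,
 * read the saved base plus the live counter, and retry if the sequence
 * number changed while we were reading. */
static uint64_t self_read(const struct fake_ctx_page *pg, hw_read_fn hw_read)
{
    uint32_t seq;
    uint64_t count;

    assert(pg != NULL && hw_read != NULL);

    do {
        seq = pg->seq;
        __sync_synchronize();   /* GCC/Clang full barrier builtin */
        count = pg->base;
        if (pg->active)
            count += hw_read();
        __sync_synchronize();
    } while (seq != pg->seq);   /* context changed mid-read: try again */

    return count;
}
```

With a page holding base = 100 and active = 1, self_read(&pg, fake_hw_read)
returns 142; with active = 0 it returns the saved base alone. The point of
the pattern is that the common case costs one counter read and a couple of
memory loads, with no trip into the kernel.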

perfctr is an order of magnitude less overhead because it was designed
from the beginning to be a very low-overhead way to get self-monitoring
data. A lot of time and tuning was spent getting it that fast.

perf_event throws everything and the kitchen sink into the kernel. I'm
guessing low-overhead self-monitoring was not really one of your primary
design goals, and it shows.

> But as we told you previously, you could use RDPMC under perf as
> well, last i checked PeterZ posted experimental patches for
> that. Peter, what's the status of that?

Yes. If you checked the benchmark results I showed, you'd have seen that
I ran tests against that patchset too, and it's really only marginally
better than the current perf_event stuff. I might have written the
benchmark poorly, but that's mainly because as-posted the documentation
for how to use that patchset is a bit unclear.

Vince
vweaver1@xxxxxxxxxxxx
