Re: [patch] Performance Counters for Linux, v3

From: Robert Richter
Date: Fri Dec 12 2008 - 05:22:22 EST


On 12.12.08 10:23:54, Peter Zijlstra wrote:
> On Fri, 2008-12-12 at 09:59 +0100, stephane eranian wrote:
> > For instance, people have pointed out that your design necessarily implies
> > pulling into the kernel the event table for all PMU models out there. This
> > is not just data, this is also complex algorithms to assign events to counters.
> > The constraints between events can be very tricky to solve. If you get this
> > wrong, this leads to silent errors, and that is really bad.
>
> (well, its not my design - I'm just trying to see how far we can push it
> out of sheer curiosity)
>
> This has to be done anyway, and getting it wrong in userspace is just as
> bad no?
>
> The _ONLY_ technical argument I've seen to do this in userspace is that
> these tables and text segments are unswappable in-kernel - which doesn't
> count too heavily in my book.

But, there are also no arguments to implement it not in userspace.

> > Looking at Intel Core, Nehalem, or AMD64 does not reflect the reality of
> > the complexity of this. Paul pointed out earlier the complexity on Power.
> > I can relate to the complexity on Itanium (I implemented all the code in
> > the user level libpfm for them). Read the Itanium PMU description and I
> > hope you'll understand.
>
> Again, I appreciate the fact that multi-dimensional constraint solving
> isn't easy. But any which way we turn this thing, it still needs to be
> done.

I agree with Stephane. There are already many different PMU
descriptions depending on family, model and steppping and with *every*
new cpu revision you will get one more update. Implementing this in
the kernel would require kernel updates where otherwise no changes
would be necessary.

If you look at current pmu implementations, there are tons of
descriptions files and code you don't want to have in the kernel.

Also, a profiling tool that needs a certain pmu feature would depend
then on its kernel implementation. (Actually, it is impossible to have
a 100% implementation coverage.) If the pmu could be programmed from
userspace, the tool could provide the feature itself.

> > Events constraints are not going away anytime soon, quite the contrary.
> >
> > Furthermore, event tables are not always correct. In fact, they are
> > always bogus.
> > Event semantics varies between steppings. New events shows up, others
> > get removed.
> > Constraints are discovered later on.
> >
> > If you have all of that in the kernel, it means you'll have to
> > generate a kernel patch each
> > time. Even if that can be encapsulated into a kernel module, you will
> > still have problems.
>
> How is updating a kernel module (esp one that only contains constraint
> tables) more difficult than upgrading a user-space library? That just
> doesn't make sense.

At least this would require a kernel with modules enabled.

> > Furthermore, Linux commercial distribution release cycles do not
> > align well with new processor
> > releases. I can boot my RHEL5 kernel on a Nehalem system and it would
> > be nice not to have to
> > wait for a new kernel update to get the full Nehalem PMU event table,
> > so I can program more than
> > the basic 6 architected events of Intel X86.
>
> Talking with my community hat on, that is an artificial problem created
> by distributions, tell them to fix it.

It does not make sense to close the eyes to reality. There are systems
where it is not possible to update the kernel frequently. Probably you
have one running yourself.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter@xxxxxxx

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/