Re: perfmon2 merge news

From: Philip Mucci
Date: Fri Nov 16 2007 - 04:19:01 EST


Just getting back to this now that SC07 is finally over...

On Nov 13, 2007, at 5:52 PM, Andi Kleen wrote:

On Tue, Nov 13, 2007 at 04:28:52PM -0800, Philip Mucci wrote:
I know you don't want to hear this, but we actually use all of the
features of perfmon, because a) we wanted to use the best methods

That is hard to believe.


You are welcome to download the code and some of the tools and verify the functionality yourself. It might be a good exercise.

But let's go for it temporarily for the argument.

Can you instead prioritize features. What is most essential, what is
important, what is just nice to have, what is rarely used?

Yes, although this has been done before. You've got the list below in the previous
emails which should be considered the absolute minimum.

- A feature which was dropped earlier by Stefane (only to satiate LKML), we consider
very important. Allowing one tomapping of the kernels view of the PMD's, allowing
user-space access to full 64-bit counts, if the architecture
supports a user-level read instruction. Getting the counts in a couple of dozen cycles
is ALWAYS a win for us. This is because the HPC community is mainly interested in
self-monitoring, not third-party, because the former can be easily associated with
context in the app through instrumentation in various forms.

- Kernel multiplexing is very nice to have, saves you tremendous overhead at user
level. PAPI has an implementation in user-space for the platforms that don't support
this. The flexibility of the current implementation is not exploited, here I'm
referring to the concept of eventsets. Having multiplexing is important. Being able
to allocate/reallocate eventsets and the threshold of individual eventsets is just nice
to have.

- Custom sample formats would be considered not often used in our community, largely
because the tools run on all HPC/Linux architectures. PAPI uses the default sample
format which has been sufficient for our needs. However, the lack of custom sample
formats preclude the dev of the specialized tools that access the sampling
hardware as found on the IA64, PPC64, the Barcelona and the SiCortex node chip.
pfmon exports this functionality quite well, and it does get used.


- providing virtualized 64-bit counters per-thread
- providing notification (buffered or non) on interrupt/overflow of
the above.

Ok that makes sense and should be possible with a reasonable simple
interface.

Well that's good news. The above is what we have used via the PerfCtr set of
patches for a long time. It wasn't quite enough, but it got the job done.

If you'd like to outline further what you'd like to hear from the
community, I can arrange that. I seem to remember going through this
once before, but I'd be happy to do it again. For reference, here's a
quick list from memory of some of the tools in active use and built
on this infrastructure. These are used heavily around the globe.

Please list concrete features, throwing around random names is not useful.


This is kind of comment that makes the Linux/HPC folks 'somber'. What isn't useful, is being dismissive of an entire community that moves a heck of a lot of Linux DVD's. >80% of the top500 list is Linux these days (compared to < 10% just a few years back), and so is the bulk of the HPC clusters in the marketplace, large and small. (ref those expensive IDC reports) These are tools used daily in HPC centers and industry around the globe, doing real work for folks that buy a lot of hardware and actually pay for Linux distributions. These tools seem random to you, because you haven't spent any time educating yourself about this community since we first talked about this >3 years back when considering PerfCtr. Really, there are dozens of HPC/ Linux events held every year around the world of varying sizes; should you ever attend one, this all might not seem so 'random'. And yes, you would be warmly welcomed.

-Phil
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/