Re: [RFC] [PATCH 1/1] perf: add support for arch-dependent symbolic event names to "perf stat"

From: Ingo Molnar
Date: Tue Mar 16 2010 - 05:41:00 EST



* Corey Ashford <cjashfor@xxxxxxxxxxxxxxxxxx> wrote:

> On 3/11/2010 12:46 PM, Corey Ashford wrote:
> >
> >
> >On 3/11/2010 11:14 AM, Ingo Molnar wrote:
> >>
> >>* Corey Ashford<cjashfor@xxxxxxxxxxxxxxxxxx> wrote:
> >[snip]
> >>>I'm not sure how that would work. The issue I am trying to solve
> >>>here is that Power arch chips have a large number of very
> >>>hardware-specific events that are not generalizable. Many of these
> >>>events not only have names, but other user-configurable bits as well
> >>>that select or narrow the scope of which exact events are recorded.
> >>>This issue is dealt with nicely in libpfm4, as it has mechanisms for
> >>>parsing event names and attributes (aka modifiers or unit masks),
> >>>and then produces a usable config field for the perf_event_attr
> >>>struct.
> >>>
> >>>Should I take it from the above that you are completely against the
> >>>idea of using an external library for hardware-specific event and
> >>>attribute naming?
> >>
> >>Could you give a few relevant examples of events in question, and the
> >>kind of configurability/attributes they have on Power?
> >
> >Here are a few examples for the Power A2 processor. I've distorted the
> >names because the PMU architecture hasn't been publicly released yet.
> >
> >PM_DE_PMC_9:hrd_mask=0xff:hrd=0x22:pma_mask=0x3fff:pma=0x1b2d:culling_mode=3
> >
> >PM_EX_0x03:lane=2:vlane=1
> >PM_OWE_ENG_MAC_FULL:usu=3
>
> Just a follow-up note to this...
>
> I learned that much of the high-level architecture of the new
> chip that IBM is working on has been publicly released recently, so
> I have "undistorted" the event names below:
>
> PM_DC_PMC_9:lpid_mask=0xff:lpid=0x22:pid_mask=0x3fff:pid=0x1b2d:marking_mode=3
> PM_REGX_0x03:lane=2:vlane=1
> PM_XML_ENG_MAC_FULL:sus=3
>
>
> DC = Decompression/Compression accelerator
> PMC_9 = Performance monitoring event 9
> REGX = Regular eXpression accelerator
> XML = XML parsing accelerator
> pid = process id to match
> pid_mask = process id match mask
> lpid = logical partition id
> lpid_mask = logical partition id mask
> sus = source unit select
> lane, vlane = signal routing fields
> marking_mode = used to determine which accelerator work units to
> mark for performance monitoring
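
For reference, the NAME:attr=value:... syntax quoted above can be split
mechanically. What follows is only a minimal sketch, assuming every
attribute is an integer written in decimal or 0x-prefixed hex; the function
name parse_event_spec is hypothetical and is not libpfm4's API. Packing the
attributes into the config field is deliberately left out, since the bit
layout is not given in this thread.

```python
def parse_event_spec(spec):
    """Split 'NAME:key=value:...' into (name, {key: int}).

    Hypothetical helper for illustration only; not part of perf or libpfm4.
    """
    name, *fields = spec.split(":")
    if not name:
        raise ValueError("missing event name")
    attrs = {}
    for field in fields:
        key, sep, value = field.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed attribute: {field!r}")
        # Base 0 lets int() accept both '0x22' and plain '3'.
        attrs[key] = int(value, 0)
    return name, attrs


name, attrs = parse_event_spec(
    "PM_DC_PMC_9:lpid_mask=0xff:lpid=0x22:pid_mask=0x3fff:pid=0x1b2d:marking_mode=3"
)
print(name, attrs)
```

The event name is kept as an opaque string, so a name such as PM_REGX_0x03
passes through untouched and only the fields after the first colon are
interpreted.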

Are these special-purpose instructions for compression/regex/xml-parsing
speedups?

I think it would be rather useful to merge the hw (and sw) perf events with
the ftrace/tracepoints symbolic events space. That would be a one-stop-shop
for both perf and other tools to figure out the events we offer, their
characteristics, format, relationship to other events, etc.

Ingo