Re: [patch 0/3] [Announcement] Performance Counters for Linux

From: Ingo Molnar
Date: Mon Dec 08 2008 - 06:12:36 EST



* stephane eranian <eranian@xxxxxxxxxxxxxx> wrote:

> Let me explain the HW complexity a bit. It's all a matter of tradeoffs.
> I have regular discussions with the PMU design architects about this.
> If you talk to them, then you understand the environment they have to
> live in and you understand why those constraints are there. The key
> point to understand is that the PMU is never critical to the chip. The
> chip can work well without. The real-estate on the chip is always very
> tight. PMU is a 2nd class citizen, thus low in the priority list. [...]

The chip designers i talk to with my scheduler maintainer hat on do point
out that performance monitoring is (of course) in the critical path of
any chip, and hence its overhead and impact on the gate count of various
critical components of the CPU core and its impact on the power envelope
must be kept very low.

Nevertheless, the same chip designers rely on performance counters on a
daily basis to plan their next-gen chip. They very much want them to work
fine, and they work hard on making them relevant and easy to use. Often
the performance counters are the _only_ real cheap hands-on insight into
the dynamic situation of a modern CPU core, even for hw designers.

And all the current hw trends show that it's not just talk but action as
well: the Core2 PMCs are already much saner (less constrained) than the
P4 ones, and now they even expanded on them: Nehalem / Core i7 doubled
the number of generic PMCs from two to four.

So, contrary to your suggestion, chip designers very much care about
performance counters and they are working very hard to make this stuff
useful to us. [ Yes, there are constraints even with generic counters
(for example you only want a single line towards a PMC register from
divider units), but the number of cross-counter constraints and their
relevance is decreasing, not increasing. ]

Anyway ... i think your reply highlights why the fundamental premise of
your patchset is so wrong: i believe you have designed your code and APIs
at the wrong level by (paradoxically) assuming in essence that
performance counters do not matter in the general scheme of things. (!)

So you introduced limited, special-purpose but still quite complex APIs
that tailored the ABIs to intricate low level details of PMUs. I see an
explosion in complexity due to that incorrect design choice: too many
syscalls, too broad interaction between core code and architecture code,
and too little practical utility in the end.

We did what we believe to be the right thing: we gave performance
counters the proper high-level abstraction they _deserve_, and we made
performance counters a prime-time Linux citizen as well.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/