Re: [RFC][PATCH 5/5] perfcounter: Add support for kernel hardware breakpoints

From: Peter Zijlstra
Date: Tue Jul 28 2009 - 12:38:57 EST


On Tue, 2009-07-28 at 21:42 +0530, K.Prasad wrote:

> > Firstly, you seem to have this weird split of kernel/userspace
> > breakpoints. Perf counters looks at things in a per-cpu fashion, so the
> > all-cpus kernel breakpoint stuff is useless. Also, from perf counters'
> > POV it's perfectly reasonable to have a per-task kernel breakpoint.
> >
>
> Although the existing implementation of hw-breakpoint API doesn't
> support per-task kernel-space breakpoints, it isn't very difficult to
> extend it to do so.
>
> We could change the breakpoint infrastructure to something like this:
>
> kernel-space breakpoints:
> kernel-space addresses, system-wide i.e. on all CPUs, persist till explicit
> unregistration, consume 1 debug register always.
>
> New per-task breakpoints (i.e. modified user-space breakpoints):
> accepts kernel- or user-space addresses, enabled per-task, consumes 1 debug
> register (only when task is scheduled on the CPU), releases debug register
> when yielding the CPU.

That still doesn't provide per-cpu breakpoints.

> > Secondly, perf counters wants to schedule the per task breakpoints
> > because we can optimize the context switch, saving lots of these MSR
> > writes under some common scenarios.
> >
>
> perf counters can continue to schedule per-task breakpoints -
> enabling/disabling a breakpoint would require a call to the
> 'register'/'unregister' interface and since it is per-cpu it is
> light-weight when compared to system-wide breakpoints (that require IPIs
> for propagation).
>
> The common breakpoints can be identified and exempted from yielding the
> debug registers (i.e. from the unregister-->register cycle) in the
> perf-counter code.

If you want to implement it that way, fine. Looking for duplicates is
bound to result in something O(n^2), but with n=4 that's manageable.
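
A rough sketch of what such a duplicate scan could look like, purely for
illustration. The struct and function names (bp_desc, bp_equal,
bp_mark_common) are hypothetical and are not part of the hw-breakpoint
patches; the only assumption taken from the discussion is n=4 slots.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_BP_SLOTS 4	/* n=4, as in the discussion above */

/* Hypothetical descriptor, NOT the hw-breakpoint API's real structure. */
struct bp_desc {
	uintptr_t	addr;	/* address being watched */
	unsigned int	len;	/* length of the watched region */
	unsigned int	type;	/* read/write/execute encoding */
	bool		in_use;	/* slot holds a valid breakpoint */
};

/* Two breakpoints are "common" only if both are valid and identical. */
static bool bp_equal(const struct bp_desc *a, const struct bp_desc *b)
{
	return a->in_use && b->in_use &&
	       a->addr == b->addr && a->len == b->len && a->type == b->type;
}

/*
 * For each slot of the incoming set 'next', record in keep[] whether the
 * same breakpoint is already present in the outgoing set 'prev'.
 * Returns the number of common breakpoints. This is the O(n^2) pairwise
 * comparison; with NR_BP_SLOTS == 4 that is at most 16 comparisons.
 */
static size_t bp_mark_common(const struct bp_desc prev[NR_BP_SLOTS],
			     const struct bp_desc next[NR_BP_SLOTS],
			     bool keep[NR_BP_SLOTS])
{
	size_t i, j, common = 0;

	for (i = 0; i < NR_BP_SLOTS; i++) {
		keep[i] = false;
		for (j = 0; j < NR_BP_SLOTS; j++) {
			if (bp_equal(&prev[j], &next[i])) {
				keep[i] = true;
				common++;
				break;
			}
		}
	}
	return common;
}
```

Slots flagged in keep[] would be the ones exempted from the
unregister-->register cycle on a context switch.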

Again, you seem to be missing per-cpu breakpoints.

> As a side note, I'm not sure if extrapolating (linearly?) the debug
> register's "hit counter" value is a good idea. While a function may cause
> several 'write' operations on a variable (say due to a loop statement) for
> once, it may not exhibit similar behaviour throughout the time-slice of the
> program's execution. Scaling the values may lead to incorrect results.

Sure, it won't be perfect, but if you assume the round-robin (RR)
interval is decoupled from the task you can get statistically relevant
information.
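
As an illustration of the linear extrapolation being discussed, here is a
minimal sketch. It assumes the estimate is simply the raw hit count
scaled by the ratio of the time the counter was enabled to the time it
actually held a debug register; the function and parameter names are
hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: linearly scale a raw hit count.
 *
 * raw          - hits observed while the breakpoint was on the hardware
 * time_enabled - total time the counter was enabled
 * time_running - portion of that time it actually held a debug register
 *
 * The result is an estimate, not a measurement: it is only meaningful
 * if the rotation interval is not correlated with the task's behaviour.
 */
static uint64_t scale_count(uint64_t raw, uint64_t time_enabled,
			    uint64_t time_running)
{
	if (time_running == 0)
		return 0;		/* never scheduled, nothing to scale */
	if (time_running >= time_enabled)
		return raw;		/* ran the whole time, count is exact */

	/* Round to nearest; double is precise enough for a sketch. */
	return (uint64_t)((double)raw *
			  ((double)time_enabled / (double)time_running) + 0.5);
}
```

For example, 100 hits seen while running for 250 out of 1000 enabled
time units would be reported as an estimated 400 hits.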

> > Like I said, please use the raw per-cpu breakpoint interface for perf
> > counters and connect that with the minimally required reservation you
> > need to make your other thing work.
> >
> > You simply cannot put perf-counter breakpoints on top of whatever virt
> > layer you created going by what you say it is.
> >
>
> One of the design goals of the hw-breakpoint API is to provide a layer
> of arbitration between various consumers of the physical debug register.
> We should be able to extend the API to meet the demands of new users
> with unique requirements (if not supported already), and the description
> above broadly describes them for perf-counters.

Sure, but currently it does too much.

All you need for perf counter support is a per-cpu interface, no
per-task, no system-wide.

But you want to mix that up with your per-task interface, which will
complicate matters.


