Re: [PATCH 0/3 v2] new nmi_watchdog using perf events

From: Stephane Eranian
Date: Fri Feb 12 2010 - 12:13:01 EST


Don,

On Fri, Feb 12, 2010 at 5:59 PM, Don Zickus <dzickus@xxxxxxxxxx> wrote:
> On Fri, Feb 12, 2010 at 05:12:38PM +0100, Stephane Eranian wrote:
>> Don,
>>
>> How is this new NMI watchdog code going to work when you also have OProfile
>> enabled in your kernel?
>>
>> Today, perf_event disables the NMI watchdog while there is at least one event.
>> By releasing the PMU registers, it also allows for OProfile to work.
>>
>> But now with this new NMI watchdog code, perf_event never releases the PMU.
>> Thus, I suspect OProfile will not work anymore, unless the NMI watchdog is
>> explicitly disabled. Up until now OProfile could co-exist with the NMI watchdog.
>
> You are right. Originally when I read the code I thought perf_event just
> grabbed all the PMUs in reserve_pmc_init(). But I see that only happens
> when someone actually creates a PERF_TYPE_HARDWARE event, which the new
> nmi watchdog does. Those PMUs only get released when the event is
> destroyed which my new code only does when the cpu disappears.
>
> So yeah, I have effectively blocked oprofile from working. I can change
> my code such that when you disable the nmi_watchdog, you can release the
> PMUs and let oprofile work.
>
> But then I am curious, considering that perf and oprofile do the same
> thing, how much longer do we let competing subsystems control the same
> hardware? I thought the point of the perf_event subsystem was to have a
> proper framework on top of the PMUs such that anyone who wants to use it
> just registers themselves, which is what the new nmi_watchdog is doing.
>
> I can add code that allows oprofile and the new nmi watchdog to coexist,
> but things get a little ugly to maintain. Just wondering what the
> gameplan is here?
>
I believe OProfile should eventually be removed from the kernel. I suspect
much of the functionality it needs is already provided by perf_events.
But that does not mean the OProfile user-level tool must disappear. There is
a very large user community. I think it could and should be ported to use
perf_events instead. Given that OProfile users only interact through
opcontrol, opreport, opannotate and such, they never "see" the actual kernel
API. Thus, by re-targeting the scripts, this should be mostly transparent to
end-users.

But for now, I believe the most practical solution is to release the perf_event
event when you disable the NMI watchdog. That would at least provide a
way to run OProfile.
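
To make the suggestion above concrete, here is a toy user-space model of the
ownership rule I have in mind. It is only a sketch, not kernel code: every
name in it (pmu_owner, perf_event_create(), nmi_watchdog_disable(), and so
on) is made up for illustration and does not correspond to an actual kernel
symbol. It assumes perf_events holds the PMU for as long as at least one
hardware event exists, and that OProfile can only start when the PMU is free.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are real kernel symbols. */
enum pmu_owner { PMU_FREE, PMU_PERF_EVENTS, PMU_OPROFILE };

static enum pmu_owner pmu_owner = PMU_FREE;
static int perf_active_events;  /* live perf_events hardware events */
static bool watchdog_enabled;

/* perf_events reserves the PMU when a hardware event is created. */
static bool perf_event_create(void)
{
    if (pmu_owner == PMU_OPROFILE)
        return false;           /* PMU is held by OProfile */
    pmu_owner = PMU_PERF_EVENTS;
    perf_active_events++;
    return true;
}

/* The PMU is released only when the last hardware event goes away. */
static void perf_event_destroy(void)
{
    assert(perf_active_events > 0);
    if (--perf_active_events == 0)
        pmu_owner = PMU_FREE;
}

/* The watchdog is modeled as one ordinary perf_events hardware event. */
static bool nmi_watchdog_enable(void)
{
    if (watchdog_enabled)
        return true;
    if (!perf_event_create())
        return false;
    watchdog_enabled = true;
    return true;
}

/* Proposed behavior: disabling the watchdog destroys its event. */
static void nmi_watchdog_disable(void)
{
    if (!watchdog_enabled)
        return;
    watchdog_enabled = false;
    perf_event_destroy();
}

/* OProfile can only take the PMU when nobody else holds it. */
static bool oprofile_start(void)
{
    if (pmu_owner != PMU_FREE)
        return false;
    pmu_owner = PMU_OPROFILE;
    return true;
}

static void oprofile_stop(void)
{
    if (pmu_owner == PMU_OPROFILE)
        pmu_owner = PMU_FREE;
}
```

In this model, oprofile_start() fails while the watchdog is enabled and
succeeds once nmi_watchdog_disable() has dropped the last event. Conversely,
nmi_watchdog_enable() fails while OProfile holds the PMU.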