Re: [V2 PATCH 0/6] x86, NMI: give NMI handler a face-lift

From: Cyrill Gorcunov
Date: Thu Nov 18 2010 - 15:29:00 EST


On Thu, Nov 18, 2010 at 03:08:07PM -0500, Don Zickus wrote:
> On Thu, Nov 18, 2010 at 01:51:44PM -0600, Jason Wessel wrote:
> > > So the problem is when the nmi watchdog is enabled, the perf event is
> > > 'active' and thus tries to read the counter value. Because it is always
> > > zero, perf just assumes the counter overflowed and the NMI is his.
> > >
> > > Not sure how to fix it yet, other than include the logic that detects we
> > > are on a guest and disable perf??
> > >
> > >
> >
> > I highly doubt we want to disable perf. I would rather use the source
> > and fix the nmi emulation in KVM/Qemu after we hear back the results
>
> Well I think Peter does not have a positive opinion about emulating perf
> inside a guest. Nor are the KVM folks having much success in doing so.
>
> Just to clarify, perf counter emulation is _not_ implemented in kvm.
> Therefore disabling perf in the guest makes sense until someone gets
> around to actually writing the emulation code for perf in a guest. :-)
>
> Cheers,
> Don

Don, Robert,

I still have suspicious on ours 'pending' nmi handler. Look what I mean --
(keep in mind that p4 has a a way more counters than others).

So lets consider the situation counters 1,2,3 activated, so we have
in 'active_mask' them set (lets say they are bits 1,2,3) and same time
the bits 1,2,3 is set in 'running' mask (they are set in x86_pmu_start).

So then after small time period when no counters overflowed, the counter
2 were diactivated (for any reason), so that

active_mask 1,3
running 1,2,3

Then nmi happens from counter 1, which calls for perf nmi handler,
which goes over all counters trying to figure out which counter just
being oveflowed. And when the cycle reaches counter 2 the interesting
thing happens

for (idx = 0; idx < x86_pmu.num_counters; idx++) {
int overflow;

if (!test_bit(idx, cpuc->active_mask)) {

---> for counter 2 there is no active_mask bit set

/* catch in-flight IRQs */
if (__test_and_clear_bit(idx, cpuc->running))

---> but it still set in running
regardless that the counter were
already freed and it give us false
positive

handled++;
continue;
}

Guys, I have a gut feeling that I'm missing something obvious, no?

Cyrill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/