Re: [RFC PATCH 2/3 v2] perf: Implement Nehalem uncore pmu

From: Lin Ming
Date: Thu Dec 02 2010 - 00:24:08 EST


On Wed, 2010-12-01 at 21:04 +0800, Stephane Eranian wrote:
> On Wed, Dec 1, 2010 at 4:21 AM, Lin Ming <ming.m.lin@xxxxxxxxx> wrote:
> >
> > On Fri, 2010-11-26 at 18:06 +0800, Stephane Eranian wrote:
> > > On Fri, Nov 26, 2010 at 10:00 AM, Lin Ming <lin@xxxxxxx> wrote:
> > > > On Fri, Nov 26, 2010 at 4:33 PM, Stephane Eranian <eranian@xxxxxxxxxx> wrote:
> > > >> Lin,
> > > >>
> > > >> I looked at the perfmon code, and it seems the mask refers to actual
> > > >> cores, not threads:
> > > >> rdmsrl(MSR_NHM_UNC_GLOBAL_CTRL, val);
> > > >> val |= 1ULL << (48 + cpu_data(smp_processor_id()).cpu_core_id);
> > > >> wrmsrl(MSR_NHM_UNC_GLOBAL_CTRL, val);
> > > >>
> > > >> That seems to imply both threads will get the interrupt.
> > > >>
> > > >> If the overflowed event was programmed from one of the two threads, that
> > > >> means one will process the overflow and the other will get a spurious interrupt.
> > > >>
> > > >> On the cores where no uncore event was programmed, both threads will get
> > > >> a spurious interrupt.
> > > >
> > > > But in my test, if HT is on, only the 2 threads in one of the four cores
> > > > will receive the interrupt. Even worse, we don't know which core will
> > > > receive the interrupt when the overflow happens.
> > > >
> > > The MSR_NHM_UNC_GLOBAL_CTRL is per socket not per core.
> >
> > Understood.
> >
> > >
> > > > I'll do more tests to verify this.
> > >
> > > In your tests, are you programming the same uncore event
> > > across all CPUs? If so, then you may have a race condition
> > > setting the MSR because it is a read-modify-write.
> > >
> > > What if you program only one uncore event from one CPU?
> >
> > This is what I tested, programming only one uncore event from one CPU.
>
> > When HT is off, all four cores in the socket receive the interrupt.
>
> If the value of the MSR is 0xf << 48?

Yes, the EN_PMI_CORE* bits are set to 0xf.
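
As a rough illustration of how that value breaks down, here is a minimal
sketch. It assumes EN_PMI_CORE0 sits at bit 48 (as in the perfmon snippet
quoted above) and a four-core socket. The macro and helper names are
hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: EN_PMI_CORE0 is bit 48, per the perfmon snippet quoted above. */
#define UNC_EN_PMI_CORE_SHIFT	48
/* Assumption: four cores per socket, as in the test machine discussed here. */
#define UNC_CORES_PER_SOCKET	4

/* Hypothetical helper: the bit that routes the uncore PMI to one core. */
static uint64_t unc_en_pmi_bit(unsigned int core_id)
{
	assert(core_id < UNC_CORES_PER_SOCKET);
	return 1ULL << (UNC_EN_PMI_CORE_SHIFT + core_id);
}

/* Hypothetical helper: PMI-enable bits for every core in the socket. */
static uint64_t unc_en_pmi_all(void)
{
	uint64_t mask = 0;
	unsigned int core;

	for (core = 0; core < UNC_CORES_PER_SOCKET; core++)
		mask |= unc_en_pmi_bit(core);

	return mask;
}
```

With these assumptions unc_en_pmi_all() evaluates to 0xfULL << 48, i.e.
one enable bit per core, not per hardware thread.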

>
> > When HT is on, only the 2 threads in one of the four cores receive the
> > interrupt.
> Something is not right here. Next week, I may be able to run some tests
> on a Nehalem using perfmon to compare. Could you also send me your
> latest uncore patch against tip-x86?
> Thanks.

I just sent it out.
http://lkml.org/lkml/2010/12/2/4

Thanks,
Lin Ming



