Re: [RFC][PATCH 6/6] perf, tools: X86 RDPMC, RDTSC test

From: Stephane Eranian
Date: Mon Nov 21 2011 - 12:42:57 EST


On Mon, Nov 21, 2011 at 5:59 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Mon, 2011-11-21 at 16:37 +0100, Peter Zijlstra wrote:
>> On Mon, 2011-11-21 at 16:29 +0100, Stephane Eranian wrote:
>> > Peter,
>> >
>> > I don't see how this test and infrastructure handles the case where the event
>> > is multiplexed. I know there is time_enabled and time_running. But those are
>> > not sync'd to the moment of the rdpmc(). I think there needs to be some other
>> > timestamp in the mmap struct so the user can compute a delta to then add to
>> > time_enabled and time_running.
>>
>> When the counter isn't actually on the PMU, ->index will be 0 and rdpmc
>> should not be attempted.
>>
>> > Unless, we assume the two time metrics are there ONLY to compute a scaling
>> > ratio. In which case, I think, we don't need the delta because if we
>> > can do rdpmc()
>> > it means the event is currently scheduled and thus time_enabled and time_running
>> > are both ticking which means the scaling ratio does not change since the moment
>> > the event was scheduled in.
>>
>> Right, you don't need delta to compute the scale, but its useful for
>> user-space time based measurements, Arun wanted to do something like
>> that.
>
> I'm full of crap, of course that makes a difference :-)
>
> Even when both running and enabled are incremented, the scaling does
> still change: 3/2 != 4/3 etc..
>
You're right!

count = raw_count * time_enabled / time_running

If we add 1s to both time_enabled and time_running, the scaling
ratio changes. We're adding to both terms, not multiplying them.
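To make that concrete, here is a minimal sketch in C. The helper name scale_count() and the numbers are made up for illustration; it just applies the formula above.

```c
#include <stdint.h>

/*
 * Illustrative helper (not from the patch): scale a raw count by
 * time_enabled / time_running, both in nanoseconds.
 * Assumes raw * enabled fits in 64 bits.
 */
static uint64_t scale_count(uint64_t raw, uint64_t enabled, uint64_t running)
{
	return raw * enabled / running;
}
```

With raw = 1200, enabled = 3s and running = 2s, this gives 1800 (ratio 3/2). Add 1s to both times and the same raw count scales to 1600 (ratio 4/3).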

To do correct scaling, we need to figure out how many ns have elapsed
since time_enabled and time_running were last updated, i.e., when the
counter was last scheduled.

count = raw_count * (time_enabled + x) / (time_running + x)

That's what I was suggesting initially.

If you use rdtsc() + a timestamp in the mmapped page, you can
determine x.
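Roughly, in C, and only as a sketch: I am assuming the page exports a multiplier, a shift and an offset (called time_mult, time_shift and time_offset in the code quoted below) so that ns = offset + ((cyc * mult) >> shift). The helper names are hypothetical.

```c
#include <stdint.h>

/*
 * Convert TSC cycles to ns as (cyc * mult) >> shift. The product is
 * split into quotient and remainder parts so it does not overflow
 * 64 bits. Assumes shift < 64.
 */
static uint64_t cyc_to_ns(uint64_t cyc, uint32_t mult, uint32_t shift)
{
	uint64_t quot = cyc >> shift;
	uint64_t rem = cyc & (((uint64_t)1 << shift) - 1);

	return quot * mult + ((rem * mult) >> shift);
}

/*
 * x = ns elapsed since time_enabled/time_running were last updated.
 * Assumption: the kernel publishes 'offset' so that adding the
 * converted TSC value yields that delta.
 */
static uint64_t elapsed_ns(uint64_t cyc, uint32_t mult, uint32_t shift,
			   uint64_t offset)
{
	return offset + cyc_to_ns(cyc, mult, shift);
}

/*
 * count = raw * (enabled + x) / (running + x), computed via
 * quotient/remainder. rem * enabled can still overflow once both
 * times exceed about 2^32 ns; real code would need a wider multiply.
 */
static uint64_t scale_with_delta(uint64_t raw, uint64_t enabled,
				 uint64_t running, uint64_t x)
{
	uint64_t quot, rem;

	enabled += x;
	running += x;
	quot = raw / running;
	rem = raw % running;

	return quot * enabled + (rem * enabled) / running;
}
```

For example, with mult = 1024 and shift = 10 one cycle maps to one ns, so cyc_to_ns(5000, 1024, 10) is 5000.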

Personally, I never understood how this mmapped count could even work
on PPC in the case of multiplexing. The problem above is not specific
to X86.

> Using that we can actually deal with the whole multiplexing thing
> without ever having to fall back to read(), something like:
>
>
> static u64 mmap_read_self(void *addr)
> {
>         struct perf_event_mmap_page *pc = addr;
>         u32 seq, idx, time_mult, time_shift;
>         u64 count, cyc, time_offset, enabled, running, delta;
>
>         do {
>                 seq = pc->lock;
>                 barrier();
>
>                 enabled = pc->time_enabled;
>                 running = pc->time_running;
>
>                 if (enabled != running) {
>                         cyc = rdtsc();
>                         time_mult = pc->time_mult;
>                         time_shift = pc->time_shift;
>                         time_offset = pc->time_offset;
>                 }
>
>                 idx = pc->index;
>                 count = pc->offset;
>                 if (idx)
>                         count += rdpmc(idx - 1);
>
>                 barrier();
>         } while (pc->lock != seq);
>
>         if (enabled != running) {
>                 u64 quot, rem;
>
>                 quot = (cyc >> time_shift);
>                 rem = cyc & ((1 << time_shift) - 1);
>                 delta = time_offset + quot * time_mult +
>                         ((rem * time_mult) >> time_shift);
>
>                 enabled += delta;
>                 if (idx)
>                         running += delta;
>
>                 quot = count / running;
>                 rem = count % running;
>                 count = quot * enabled + (rem * enabled) / running;
>         }
>
>         return count;
> }
>
> Now all I need to do is make sure pc->offset actually makes sense,
> because currently it looks like we're off by a factor
> event->hw.prev_count when idx is set.
>
>
>
>