RE: [RFC PATCH 2/2] tools/perf: Make group_fd static and move its place in __perf_evsel__open()

From: Zhu, DengCheng
Date: Wed Oct 26 2011 - 09:15:23 EST


> ________________________________________
> From: Arnaldo Carvalho de Melo [arnaldo.melo@xxxxxxxxx] on behalf of Arnaldo Carvalho de Melo [acme@xxxxxxxxxxxxxxxxxx]
> Sent: Wednesday, October 26, 2011 7:18 PM
> To: Zhu, DengCheng
> Cc: linux-kernel@xxxxxxxxxxxxxxx; Peter Zijlstra; Paul Mackerras; Ingo Molnar
> Subject: Re: [RFC PATCH 2/2] tools/perf: Make group_fd static and move its place in __perf_evsel__open()
>
> Em Wed, Oct 26, 2011 at 02:26:03PM +0800, Deng-Cheng Zhu escreveu:
>> On 10/25/2011 11:23 PM, Arnaldo Carvalho de Melo wrote:
>> >Em Mon, Oct 24, 2011 at 12:03:28PM -0200, Arnaldo Carvalho de Melo escreveu:
>> >>Em Mon, Oct 24, 2011 at 06:57:00PM +0800, Deng-Cheng Zhu escreveu:
>> >>>__perf_evsel__open() is called per event, it does not work for all the
>> >>>grouped events at one time. So, currently group_fd will always be -1 for
>
> <SNIP>
>
>> >>>+ static int group_fd = -1;
>> >>> if (!evsel->cgrp)
>> >>> pid = threads->map[thread];
>
>> >>Lets not do it that way, using statics for this is humm, ugly, IMHO.
>
>> >Can you try this patch?
>
>> >I tested it with:
>
>> >[root@emilia ~]# perf top -e cycles -e instructions --group
>
>> Your patch does fix the group fd issue. But to get event grouping
>> workable, the event state fix is still needed. Please see the discussion
>
> Can I have your "Reviewed-by:" or "Tested-by:" tag for this patch then?

Yes, you may add my:

Tested-by: Deng-Cheng Zhu <dczhu@xxxxxxxx>

>> here: http://www.spinics.net/lists/mips/msg42190.html
>
>> With _only_ your patch applied, I tested with the following commands on
>> MIPS 74K (4 counters available in total):
>>
>> perf stat -g -e
>> L1-dcache-load-misses,cycles,LLC-load-misses,iTLB-loads,instructions
>> find / >/dev/null
>>
>> I tried to group up to 5 events in the hope of seeing NOSPC error. But
>> the command didn't fail and output:
>>
>> Performance counter stats for 'find /':
>>
>> 9300823 L1-dcache-load-misses
>> <not counted> cycles
>> <not counted> LLC-load-misses
>> <not counted> iTLB-loads
>> <not counted> instructions
>>
>> 8.463207591 seconds time elapsed
>>
>> This is due to the event state check in validate_group() filtering out
>> the grouped events in OFF state. They are in OFF state because we are
>> running the command with the perf tool as opposed to attaching to an
>> existing task:
>>
>> builtin-stat.c:create_perf_stat_counter():
>>
>> if (target_pid == -1 && target_tid == -1) {
>> attr->disabled = 1;
>> attr->enable_on_exec = 1;
>> }
>
> But they should be in OFF state only till the target program gets
> exec'ed, right?
>

Yes, or else event timestamps won't be updated on exec. That's why I came
up with a new approach in one of my MIPS perf-events patches (linked above).

>> I suppose X86 has this issue too -- collect_events() in validate_group()
>> won't do real work in the bottom half of the function.
>
> I'm testing that now.

As you can see from the test results in your other post, I think we see
"not counted" because collect_events() in validate_group() does no real
work on events in OFF state. On x86 a patch may be needed in event init,
but on MIPS (or ARM) I suppose the pmu and event state checks in
validate_event() could perhaps be deleted...
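
To make the point concrete, here is a toy user-space model of the behaviour
I am describing. It is not kernel code: every toy_* name is made up for
illustration, and the state values only loosely mirror PERF_EVENT_STATE_OFF
and PERF_EVENT_STATE_INACTIVE. It assumes group validation simply counts the
events it does not skip and compares that with the number of counters:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states; OFF is what disabled + enable_on_exec gives us. */
enum toy_state { TOY_STATE_OFF = -1, TOY_STATE_INACTIVE = 0 };

struct toy_event {
	const char *name;
	enum toy_state state;
};

/*
 * Count the counters a group would need. With skip_off set, events in
 * OFF state are ignored, which models the state check discussed above.
 */
static int toy_counters_needed(const struct toy_event *ev, size_t n,
			       bool skip_off)
{
	int needed = 0;

	for (size_t i = 0; i < n; i++) {
		if (skip_off && ev[i].state <= TOY_STATE_OFF)
			continue;	/* filtered out, never validated */
		needed++;
	}
	return needed;
}

/* Return 0 if the group fits, -1 otherwise (standing in for -ENOSPC). */
static int toy_validate_group(const struct toy_event *ev, size_t n,
			      int num_counters, bool skip_off)
{
	assert(num_counters >= 0);
	return toy_counters_needed(ev, n, skip_off) <= num_counters ? 0 : -1;
}
```

In this model, five events that are all in OFF state pass validation against
4 counters when skip_off is true, and fail as expected when it is false.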


Deng-Cheng