Re: [PATCH] perf test: Fix test_arm_coresight.sh failures on Juno

From: James Clark
Date: Mon Oct 10 2022 - 05:21:38 EST

On 10/10/2022 08:41, Leo Yan wrote:
> On Thu, Oct 06, 2022 at 04:11:05PM +0100, James Clark wrote:
>
> [...]
>
>>>> Before:
>>>>
>>>> sudo ./perf test coresight -vvv
>>>> ...
>>>> Recording trace with system wide mode
>>>> Looking at perf.data file for dumping branch samples:
>>>> Looking at perf.data file for reporting branch samples:
>>>> Looking at perf.data file for instruction samples:
>>>> CoreSight system wide testing: FAIL
>>>> ...
>>>>
>>>> After:
>>>>
>>>> sudo ./perf test coresight -vvv
>>>> ...
>>>> Recording trace with system wide mode
>>>> Looking at perf.data file for dumping branch samples:
>>>> Looking at perf.data file for reporting branch samples:
>>>> Looking at perf.data file for instruction samples:
>>>> CoreSight system wide testing: PASS
>>>> ...
>>>
>>> Since the Arm Juno board has zero timestamps for CoreSight, I don't
>>> think arm_cs_etm.sh can really work on it at the moment.
>>>
>>> If we want the test to pass on the Juno board, we need to add the
>>> option "--itrace=Zi1000i" for "perf report" and "perf script"; but it
>>> seems to me that "--itrace=Z..." is not a general case for testing ...
>>
>> Unfortunately I now think that adding the Z option didn't improve
>> anything in Coresight decoding other than removing the warning. I've
>> never seen the zero timestamp issue on Juno though. I thought that was
>> on some Qualcomm device? I'm not getting the warning on this test anyway.
>
> No, on my Juno-r2 board I can observe that the timestamp is always zero
> in the CoreSight trace data, which is why I must use
> "--itrace=Zi1000i" every time I report results.

Ah I have r0 which could explain it. But it's good to know that r2 has
that issue. I still wouldn't expect you to have to use the option
though, because it should only make the warning go away.

>
>> The problem is that timeless mode assumes per thread mode, and in per
>> thread mode there is a separate buffer per thread, so the Coresight
>> channel IDs are ignored. In systemwide mode the channel ID is important
>> to know which CPU the trace came from. If this info is thrown away then
>> not much works correctly.
>>
>> I plan to overhaul the whole decoder and remove all the assumptions
>> about per-thread and timeless mode. It would be better if they were
>> completely separate concepts.
>
> Okay, good to know this.
>
> [...]
>
>>> So here I suspect that changing to "--itrace=i20i" can allow the test
>>> to pass on the Juno board. Could you confirm this?
>>
>> On Juno:
>>
>> ./perf record -e cs_etm// -a -- ls
>>
>> With interval 20, 23 instruction samples are generated:
>>
>> ./perf report --stdio --itrace=i20i | egrep " +[0-9]+\.[0-9]+% +perf " | wc -l
>>
>> 23
>>
>> With interval 1000, 0 are generated:
>>
>> ./perf report --stdio --itrace=i1000i | egrep " +[0-9]+\.[0-9]+% +perf " | wc -l
>>
>> Error:
>> The perf.data data has no samples!
>> 0
>
> Thanks for confirmation. It's a bit weird that your Juno board doesn't
> produce all zeros for timestamp packets.
>
>> I think the issue is that ls is quite quick to run, so not much trace is
>> generated for Perf. And it just depends on the scheduling which is
>> slightly different on Juno. I don't think it's a bug. On N1SDP there are
>> only 134 samples generated with i1000i, so it could probably end up with
>> a random run generating 0 there too.
>
> Agreed, changing to a smaller interval makes sense to me.
>
> Reviewed-by: Leo Yan <leo.yan@xxxxxxxxxx>

Thanks for the review, Leo.
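
For anyone who wants to repeat the comparison quoted above, here is a
rough sketch of the counting step as a small shell helper. The function
name count_perf_lines is made up for illustration and is not part of the
test script. It reads "perf report --stdio" text on stdin and uses the
same pattern as the commands above, with "grep -E" in place of egrep.

```shell
#!/bin/sh
# Hypothetical helper (not part of the perf test script): count the
# report lines attributed to the "perf" command. Reads report text on
# stdin and prints the number of matching lines.
count_perf_lines() {
    # grep -c prints the match count; it exits non-zero when there are
    # no matches, so "|| true" keeps the helper usable under "set -e".
    grep -E -c ' +[0-9]+\.[0-9]+% +perf ' || true
}

# Assumed usage against an existing perf.data (not run here):
#   for interval in 20 1000; do
#       ./perf report --stdio --itrace="i${interval}i" 2>/dev/null |
#           count_perf_lines
#   done
```

Using "grep -c" rather than piping to "wc -l" is only a convenience here;
both give the same count for this pattern.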

>
> Thanks,
> Leo