Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models

From: Liang, Kan
Date: Fri Mar 01 2024 - 09:52:39 EST




On 2024-02-29 8:01 p.m., Ian Rogers wrote:
> On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>>
>>
>>
>> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
>>> Allow duplicated metric to be dropped from json files.
>>>
>>> Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
>>> ---
>>> tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
>>> 1 file changed, 51 insertions(+)
>>>
>>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
>>> index 20c25d142f24..1096accea2aa 100755
>>> --- a/tools/perf/pmu-events/intel_metrics.py
>>> +++ b/tools/perf/pmu-events/intel_metrics.py
>>> @@ -7,6 +7,7 @@ import argparse
>>> import json
>>> import math
>>> import os
>>> +from typing import Optional
>>>
>>> parser = argparse.ArgumentParser(description="Intel perf json generator")
>>> parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
>>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
>>> ])
>>>
>>>
>>> +def Tsx() -> Optional[MetricGroup]:
>>> + if args.model not in [
>>> + 'alderlake',
>>> + 'cascadelakex',
>>> + 'icelake',
>>> + 'icelakex',
>>> + 'rocketlake',
>>> + 'sapphirerapids',
>>> + 'skylake',
>>> + 'skylakex',
>>> + 'tigerlake',> + ]:
>>
>> Can we get ride of the model list? Otherwise, we have to keep updating
>> the list.
>
> Do we expect the list to update? :-)

Yes, at least for the meteorlake and graniterapids. They should be the
same as alderlake and sapphirerapids. I'm not sure about the future
platforms.

Maybe we can have a if args.model in list here to include all the
non-hybrid models which doesn't support TSX. I think the list should not
be changed shortly.

> The issue is the events are in
> sysfs and not the json. If we added the tsx events to json then this
> list wouldn't be necessary, but it also would mean the events would be
> present in "perf list" even when TSX is disabled.

I think there may an alternative way, to check the RTM events, e.g.,
RTM_RETIRED.START event. We only need to generate the metrics for the
platform which supports the RTM_RETIRED.START event.


>
>>> + return None
>>> +> + pmu = "cpu_core" if args.model == "alderlake" else "cpu"
>>
>> Is it possible to change the check to the existence of the "cpu" PMU
>> here? has_pmu("cpu") ? "cpu" : "cpu_core"
>
> The "Unit" on "cpu" events in json always just blank. On hybrid it is
> either "cpu_core" or "cpu_atom", so I can make this something like:
>
> pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
>
> which would be a build time test.

Yes, I think using the "Unit" is good enough.

>
>
>>> + cycles = Event('cycles')
>>> + cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>>> + transaction_start = Event(f'{pmu}/tx\-start/')
>>> + cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>>> + metrics = [
>>> + Metric('tsx_transactional_cycles',
>>> + 'Percentage of cycles within a transaction region.',
>>> + Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>>> + '100%'),
>>> + Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
>>> + Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>>> + has_event(cycles_in_tx),
>>> + 0),
>>> + '100%'),
>>> + Metric('tsx_cycles_per_transaction',
>>> + 'Number of cycles within a transaction divided by the number of transactions.',
>>> + Select(cycles_in_tx / transaction_start,
>>> + has_event(cycles_in_tx),
>>> + 0),
>>> + "cycles / transaction"),
>>> + ]
>>> + if args.model != 'sapphirerapids':
>>
>> Add the "tsx_cycles_per_elision" metric only if
>> has_event(f'{pmu}/el\-start/')?
>
> It's a sysfs event, so this wouldn't work :-(

The below is the definition of el-start in the kernel.
EVENT_ATTR_STR(el-start, el_start, "event=0xc8,umask=0x1");

The corresponding event in the event list should be HLE_RETIRED.START
"EventCode": "0xC8",
"UMask": "0x01",
"EventName": "HLE_RETIRED.START",

I think we may check the HLE_RETIRED.START instead. If the
HLE_RETIRED.START doesn't exist, I don't see a reason why the
tsx_cycles_per_elision should be supported.

Again, in the virtualization world, it's possible that the
HLE_RETIRED.START exists in the event list but el_start isn't available
in the sysfs. I think it has to be specially handle in the test as well.

Thanks,
Kan

>
> Thanks,
> Ian
>
>> Thanks,
>> Kan
>>
>>> + elision_start = Event(f'{pmu}/el\-start/')
>>> + metrics += [
>>> + Metric('tsx_cycles_per_elision',
>>> + 'Number of cycles within a transaction divided by the number of elisions.',
>>> + Select(cycles_in_tx / elision_start,
>>> + has_event(elision_start),
>>> + 0),
>>> + "cycles / elision"),
>>> + ]
>>> + return MetricGroup('transaction', metrics)
>>> +
>>> +
>>> all_metrics = MetricGroup("", [
>>> Idle(),
>>> Rapl(),
>>> Smi(),
>>> + Tsx(),
>>> ])
>>>
>>> if args.metricgroups:
>