Re: [PATCH v6 4/6] perf jevents: Add a new expression builtin strcmp_cpuid_str()

From: James Clark
Date: Wed Aug 30 2023 - 14:55:52 EST




On 30/08/2023 00:46, Ian Rogers wrote:
> On Wed, Aug 16, 2023 at 4:52 AM James Clark <james.clark@xxxxxxx> wrote:
>>
>> This will allow writing formulas that are conditional on a specific
>> CPU type or CPU version. It calls through to the existing
>> strcmp_cpuid_str() function in Perf which has a default weak version,
>> and an arch specific version for x86 and arm64.
>>
>> The function takes an 'ID' type value, which is a string. But in this
>> case Arm CPU IDs are hex numbers prefixed with '0x'. metric.py
>> assumes strings are only used by event names, and that they can't start
>> with a number ('0'), so an additional change has to be made to the
>> regex to convert hex numbers back to 'ID' types.
>>
>> Signed-off-by: James Clark <james.clark@xxxxxxx>
>> ---
>> tools/perf/arch/arm64/util/pmu.c | 18 +-----------------
>> tools/perf/pmu-events/metric.py | 17 +++++++++++++++--
>> tools/perf/util/expr.c | 18 ++++++++++++++++++
>> tools/perf/util/expr.h | 1 +
>> tools/perf/util/expr.l | 1 +
>> tools/perf/util/expr.y | 8 +++++++-
>> tools/perf/util/pmu.c | 17 +++++++++++++++++
>> tools/perf/util/pmu.h | 1 +
>> 8 files changed, 61 insertions(+), 20 deletions(-)
>>
>> diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
>> index 512a8f13c4de..615084eb88d8 100644
>> --- a/tools/perf/arch/arm64/util/pmu.c
>> +++ b/tools/perf/arch/arm64/util/pmu.c
>> @@ -2,28 +2,12 @@
>>
>> #include <internal/cpumap.h>
>> #include "../../../util/cpumap.h"
>> +#include "../../../util/header.h"
>> #include "../../../util/pmu.h"
>> #include "../../../util/pmus.h"
>> #include <api/fs/fs.h>
>> #include <math.h>
>>
>> -static struct perf_pmu *pmu__find_core_pmu(void)
>> -{
>> - struct perf_pmu *pmu = NULL;
>> -
>> - while ((pmu = perf_pmus__scan_core(pmu))) {
>> - /*
>> - * The cpumap should cover all CPUs. Otherwise, some CPUs may
>> - * not support some events or have different event IDs.
>> - */
>> - if (RC_CHK_ACCESS(pmu->cpus)->nr != cpu__max_cpu().cpu)
>> - return NULL;
>> -
>> - return pmu;
>> - }
>> - return NULL;
>> -}
>> -
>> const struct pmu_metrics_table *pmu_metrics_table__find(void)
>> {
>> struct perf_pmu *pmu = pmu__find_core_pmu();
>> diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
>> index 85a3545f5b6a..0e9ec65d92ae 100644
>> --- a/tools/perf/pmu-events/metric.py
>> +++ b/tools/perf/pmu-events/metric.py
>> @@ -413,6 +413,10 @@ def has_event(event: Event) -> Function:
>> # pylint: disable=invalid-name
>> return Function('has_event', event)
>>
>> +def strcmp_cpuid_str(event: str) -> Function:
>> + # pylint: disable=redefined-builtin
>> + # pylint: disable=invalid-name
>> + return Function('strcmp_cpuid_str', event)
>>
>> class Metric:
>> """An individual metric that will specifiable on the perf command line."""
>> @@ -541,14 +545,23 @@ def ParsePerfJson(orig: str) -> Expression:
>> """
>> # pylint: disable=eval-used
>> py = orig.strip()
>> + # First try to convert everything that looks like a string (event name) into Event(r"EVENT_NAME").
>> + # This isn't very selective so is followed up by converting some unwanted conversions back again
>> py = re.sub(r'([a-zA-Z][^-+/\* \\\(\),]*(?:\\.[^-+/\* \\\(\),]*)*)',
>> r'Event(r"\1")', py)
>> + # If it started with a # it should have been a literal, rather than an event name
>> py = re.sub(r'#Event\(r"([^"]*)"\)', r'Literal("#\1")', py)
>> + # Convert accidentally converted hex constants ("0Event(r"xDEADBEEF)"") back to a constant,
>> + # but keep it wrapped in Event(), otherwise Python drops the 0x prefix and it gets interpreted as
>> + # a double by the Bison parser
>> + py = re.sub(r'0Event\(r"[xX]([0-9a-fA-F]*)"\)', r'Event("0x\1")', py)
>> + # Convert accidentally converted scientific notation constants back
>> py = re.sub(r'([0-9]+)Event\(r"(e[0-9]+)"\)', r'\1\2', py)
>> - keywords = ['if', 'else', 'min', 'max', 'd_ratio', 'source_count', 'has_event']
>> + # Convert all the known keywords back from events to just the keyword
>> + keywords = ['if', 'else', 'min', 'max', 'd_ratio', 'source_count', 'has_event', 'strcmp_cpuid_str',
>> + 'cpuid_not_more_than']
>> for kw in keywords:
>> py = re.sub(rf'Event\(r"{kw}"\)', kw, py)
>> -
>> try:
>> parsed = ast.parse(py, mode='eval')
>> except SyntaxError as e:
>> diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
>> index 7410a165f68b..0985a3cbc6f9 100644
>> --- a/tools/perf/util/expr.c
>> +++ b/tools/perf/util/expr.c
>> @@ -13,6 +13,8 @@
>> #include <util/expr-bison.h>
>> #include <util/expr-flex.h>
>> #include "util/hashmap.h"
>> +#include "util/header.h"
>> +#include "util/pmu.h"
>> #include "smt.h"
>> #include "tsc.h"
>> #include <api/fs/fs.h>
>> @@ -495,3 +497,19 @@ double expr__has_event(const struct expr_parse_ctx *ctx, bool compute_ids, const
>> evlist__delete(tmp);
>> return ret;
>> }
>> +
>> +double expr__strcmp_cpuid_str(const struct expr_parse_ctx *ctx __maybe_unused,
>> + bool compute_ids __maybe_unused, const char *test_id)
>> +{
>> + double ret;
>> + struct perf_pmu *pmu = pmu__find_core_pmu();
>> + char *cpuid = perf_pmu__getcpuid(pmu);
>> +
>> + if (!cpuid)
>> + return NAN;
>> +
>> + ret = !strcmp_cpuid_str(test_id, cpuid);
>> +
>> + free(cpuid);
>> + return ret;
>> +}
>> diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
>> index 3c1e49b3e35d..c0cec29ddc29 100644
>> --- a/tools/perf/util/expr.h
>> +++ b/tools/perf/util/expr.h
>> @@ -55,5 +55,6 @@ double expr_id_data__value(const struct expr_id_data *data);
>> double expr_id_data__source_count(const struct expr_id_data *data);
>> double expr__get_literal(const char *literal, const struct expr_scanner_ctx *ctx);
>> double expr__has_event(const struct expr_parse_ctx *ctx, bool compute_ids, const char *id);
>> +double expr__strcmp_cpuid_str(const struct expr_parse_ctx *ctx, bool compute_ids, const char *id);
>>
>> #endif
>> diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l
>> index dbb117414710..0feef0726c48 100644
>> --- a/tools/perf/util/expr.l
>> +++ b/tools/perf/util/expr.l
>> @@ -114,6 +114,7 @@ if { return IF; }
>> else { return ELSE; }
>> source_count { return SOURCE_COUNT; }
>> has_event { return HAS_EVENT; }
>> +strcmp_cpuid_str { return STRCMP_CPUID_STR; }
>
> For the sake of coverage, it'd be nice to have a test case like:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/expr.c?h=tmp.perf-tools-next#n257
>
>> {literal} { return literal(yyscanner, sctx); }
>> {number} { return value(yyscanner); }
>> {symbol} { return str(yyscanner, ID, sctx->runtime); }
>> diff --git a/tools/perf/util/expr.y b/tools/perf/util/expr.y
>> index 65d54a6f29ad..6c93b358cc2d 100644
>> --- a/tools/perf/util/expr.y
>> +++ b/tools/perf/util/expr.y
>> @@ -39,7 +39,7 @@ int expr_lex(YYSTYPE * yylval_param , void *yyscanner);
>> } ids;
>> }
>>
>> -%token ID NUMBER MIN MAX IF ELSE LITERAL D_RATIO SOURCE_COUNT HAS_EVENT EXPR_ERROR
>> +%token ID NUMBER MIN MAX IF ELSE LITERAL D_RATIO SOURCE_COUNT HAS_EVENT STRCMP_CPUID_STR EXPR_ERROR
>> %left MIN MAX IF
>> %left '|'
>> %left '^'
>> @@ -207,6 +207,12 @@ expr: NUMBER
>> $$.ids = NULL;
>> free($3);
>> }
>> +| STRCMP_CPUID_STR '(' ID ')'
>> +{
>> + $$.val = expr__strcmp_cpuid_str(ctx, compute_ids, $3);
>> + $$.ids = NULL;
>> + free($3);
>> +}
>> | expr '|' expr
>> {
>> if (is_const($1.val) && is_const($3.val)) {
>> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
>> index d5406effc169..a7f05e4dda97 100644
>> --- a/tools/perf/util/pmu.c
>> +++ b/tools/perf/util/pmu.c
>> @@ -1790,3 +1790,20 @@ void perf_pmu__delete(struct perf_pmu *pmu)
>> zfree(&pmu->alias_name);
>> free(pmu);
>> }
>> +
>> +struct perf_pmu *pmu__find_core_pmu(void)
>
> I think as this is scanning all PMUs it more logically belongs in
> pmus.c than pmu.c.
>
>> +{
>> + struct perf_pmu *pmu = NULL;
>> +
>> + while ((pmu = perf_pmus__scan_core(pmu))) {
>> + /*
>> + * The cpumap should cover all CPUs. Otherwise, some CPUs may
>> + * not support some events or have different event IDs.
>> + */
>> + if (RC_CHK_ACCESS(pmu->cpus)->nr != cpu__max_cpu().cpu)
>
> The RC_CHK_ACCESS can be avoided using perf_cpu_map__nr.
>
> Not all CPUs need to be online. I think the number for the PMU is
> bound by the number online while cpu__max_cpu gives the maximum
> present. It is common to workaround SMT vulnerabilities by making
> cores offline in which case this would never be true. I think the
> following is nearly equivalent and simpler:
>
> perf_pmus__num_core_pmus > 1 ? NULL : perf_pmus__scan_core(NULL);
>

I'm not sure about this one, I'm hesitant to change it because I only
moved it out of the arm file and into the generic one and it seems like
it could cause subtle issues.

>From the comment above the if statement, it seems like it intentionally
refuses to return a PMU on heterogeneous systems, and the original
commit message (e126bef55f1d) from John lines up with that too. I can't
see exactly why it should be like that though.

Although you could say that now I've pulled some Arm specific quirk into
the generic code and we should re-think it. To me I don't see why it
doesn't just unconditionally return the first core PMU it finds on any
architecture, but I'd like to understand the original intention first.

Apart from this one, I've done the other changes and will send patches.

> Thanks,
> Ian
>
>> + return NULL;
>> +
>> + return pmu;
>> + }
>> + return NULL;
>> +}
>> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
>> index 6b414cecbad2..7ff925224165 100644
>> --- a/tools/perf/util/pmu.h
>> +++ b/tools/perf/util/pmu.h
>> @@ -289,5 +289,6 @@ int perf_pmu__pathname_fd(int dirfd, const char *pmu_name, const char *filename,
>> struct perf_pmu *perf_pmu__lookup(struct list_head *pmus, int dirfd, const char *lookup_name);
>> struct perf_pmu *perf_pmu__create_placeholder_core_pmu(struct list_head *core_pmus);
>> void perf_pmu__delete(struct perf_pmu *pmu);
>> +struct perf_pmu *pmu__find_core_pmu(void);
>>
>> #endif /* __PMU_H */
>> --
>> 2.34.1
>>