[PATCH V7 1/2] perf: Track per-PMU sched_task() callback users

From: kan . liang
Date: Thu May 13 2021 - 10:37:55 EST


From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>

Current perf only tracks the per-CPU sched_task() callback users, which
doesn't work if a callback user is a task. For example, the dirty
counters have to be cleared to prevent data leakage when a new RDPMC
task is scheduled in. The task may be created on one CPU but running on
another CPU. It cannot be tracked by the per-CPU variable. A global
variable is not going to work either because of the hybrid PMUs.
Add a per-PMU variable to track the callback users.

In theory, the per-PMU variable should be checked everywhere the
sched_task() can be called. But the X86 RDPMC is the only user for the
per-PMU sched_cb_usage. A callback for the X86 RDPMC is required only
when a different context is scheduled in. To avoid unnecessary
sched_task() invoke, the per-PMU sched_cb_usage is only checked there.
Should there be any other ARCHs which require it in the other places,
it can be added later separately.

Suggested-by: Rob Herring <robh@xxxxxxxxxx>
Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
---

- New patch. Split the V6 to core and x86 parts.

include/linux/perf_event.h | 3 +++
kernel/events/core.c | 9 ++++++++-
2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index c8a3388..c6ee202 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -301,6 +301,9 @@ struct pmu {
/* number of address filters this PMU can do */
unsigned int nr_addr_filters;

+ /* Track the per PMU sched_task() callback users */
+ atomic_t sched_cb_usage;
+
/*
* Fully disable/enable this PMU, can be used to protect from the PMI
* as well as for lazy/batch writing of the MSRs.
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1574b70..286b718 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3851,7 +3851,14 @@ static void perf_event_context_sched_in(struct perf_event_context *ctx,
cpu_ctx_sched_out(cpuctx, EVENT_FLEXIBLE);
perf_event_sched_in(cpuctx, ctx, task);

- if (cpuctx->sched_cb_usage && pmu->sched_task)
+ /*
+ * X86 RDPMC is the only user for the per-PMU sched_cb_usage.
+ * A callback for the X86 RDPMC is required only when a
+ * different context is scheduled in.
+ * To avoid unnecessary sched_task() invoke, the per-PMU
+ * sched_cb_usage is only checked here.
+ */
+ if (pmu->sched_task && (cpuctx->sched_cb_usage || atomic_read(&pmu->sched_cb_usage)))
pmu->sched_task(cpuctx->task_ctx, true);

perf_pmu_enable(pmu);
--
2.7.4