Re: [PATCH v8] perf: Sharing PMU counters across compatible events

From: Peter Zijlstra
Date: Thu Dec 12 2019 - 10:44:43 EST


On Thu, Dec 12, 2019 at 03:37:05PM +0000, Song Liu wrote:
>
>
> > On Dec 12, 2019, at 5:49 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Fri, Dec 06, 2019 at 04:24:47PM -0800, Song Liu wrote:
> >
> >> @@ -750,6 +752,16 @@ struct perf_event {
> >> void *security;
> >> #endif
> >> struct list_head sb_list;
> >> +
> >> + /* for PMU sharing */
> >> + struct perf_event *dup_master;
> >> + /* check event_sync_dup_count() for the use of dup_base_* */
> >> + u64 dup_base_count;
> >> + u64 dup_base_child_count;
> >> + /* when this event is master, read from master*count */
> >> + local64_t master_count;
> >> + atomic64_t master_child_count;
> >> + int dup_active_count;
> >> #endif /* CONFIG_PERF_EVENTS */
> >> };
> >
> >> +/* PMU sharing aware version of event->pmu->add() */
> >> +static int event_pmu_add(struct perf_event *event,
> >> + struct perf_event_context *ctx)
> >> +{
> >> + struct perf_event *master;
> >> + int ret;
> >> +
> >> + /* no sharing, just do event->pmu->add() */
> >> + if (!event->dup_master)
> >> + return event->pmu->add(event, PERF_EF_START);
> >
> > Possibly we should look at the location of perf_event::dup_master to be
> > in a hot cacheline. Because I'm thinking you just added a guaranteed
> > miss here.
>
> I am not quite sure the best location for these. How about:
>
> diff --git i/include/linux/perf_event.h w/include/linux/perf_event.h
> index 7d49f9251621..218cc7f75775 100644
> --- i/include/linux/perf_event.h
> +++ w/include/linux/perf_event.h
> @@ -643,6 +643,16 @@ struct perf_event {
> local64_t count;
> atomic64_t child_count;
>
> + /* for PMU sharing */
> + struct perf_event *dup_master;
> + /* check event_sync_dup_count() for the use of dup_base_* */
> + u64 dup_base_count;
> + u64 dup_base_child_count;
> + /* when this event is master, read from master*count */
> + local64_t master_count;
> + atomic64_t master_child_count;
> + int dup_active_count;
> +
> /*
> * These are the total time in nanoseconds that the event
> * has been enabled (i.e. eligible to run, and the task has

Ah, no. Only put dup_master somewhere hot. The rest is not that
important. For instance, you can put it right next to event->pmu,
because that's going to be used right next to it, right?