Re: [PATCH v3 4/4] perf/core,x86: synchronize PMU task contexts on optimized context switches

From: Peter Zijlstra
Date: Mon Oct 21 2019 - 06:37:57 EST


On Mon, Oct 21, 2019 at 09:59:42AM +0200, Ingo Molnar wrote:
>
> * Alexey Budankov <alexey.budankov@xxxxxxxxxxxxxxx> wrote:
>
> > +	/*
> > +	 * PMU specific parts of task perf context may require
> > +	 * additional synchronization, at least for proper Intel
> > +	 * LBR callstack data profiling;
> > +	 */
> > +	pmu->sync_task_ctx(ctx->task_ctx_data,
> > +			   next_ctx->task_ctx_data);
>
> Firstly, I'm pretty sure you never run this on a CPU where
> pmu->sync_task_ctx is NULL, right? ;-)
>
> Secondly, even on Intel CPUs in many cases we'll just call into a ~2 deep
> function pointer based call hierarchy, just to find that nothing needs to

See the prototype here for getting rid of at least one layer of indirect
calls:

https://lkml.kernel.org/r/20191007083831.26880701.6@xxxxxxxxxxxxx

> be done, because there's no LBR call stack maintained:
>
> +	if (!one || !another)
> +		return;
>
> So while it's technically a layering violation, it might make sense to
> elevate this check to the generic layer and say that synchronization
> calls by the core layer will always provide two valid pointers?

Alternatively we can write the thing like:

	if (pmu->swap_task_ctx)
		pmu->swap_task_ctx(ctx, next_ctx);
	else
		swap(ctx->task_ctx_data, next_ctx->task_ctx_data);
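
For illustration only, the fragment above can be fleshed out as a small
userspace sketch of the "optional hook, generic fallback" pattern. The
struct names (perf_ctx_sketch, pmu_sketch) and the helper
swap_task_ctx_data() are hypothetical stand-ins, not the kernel's
definitions, and the kernel's swap() macro is open-coded here so the
snippet builds outside the tree:

```c
#include <stddef.h>

/* Hypothetical stand-in for the perf event context; only the field used here. */
struct perf_ctx_sketch {
	void *task_ctx_data;
};

/* Hypothetical stand-in for struct pmu with an optional swap hook. */
struct pmu_sketch {
	/* NULL when the PMU needs nothing beyond a plain pointer swap. */
	void (*swap_task_ctx)(struct perf_ctx_sketch *prev,
			      struct perf_ctx_sketch *next);
};

/*
 * Let the PMU swap its task context data if it provides a hook;
 * otherwise fall back to exchanging the two pointers generically.
 */
static void swap_task_ctx_data(const struct pmu_sketch *pmu,
			       struct perf_ctx_sketch *ctx,
			       struct perf_ctx_sketch *next_ctx)
{
	if (pmu->swap_task_ctx) {
		pmu->swap_task_ctx(ctx, next_ctx);
	} else {
		/* Open-coded equivalent of swap(a, b) for two pointers. */
		void *tmp = ctx->task_ctx_data;

		ctx->task_ctx_data = next_ctx->task_ctx_data;
		next_ctx->task_ctx_data = tmp;
	}
}
```

With this shape the common case (no hook) costs one NULL test and a
pointer exchange, and the indirect call is only taken for PMUs that
actually register a hook.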