Re: [RFC] perf/amd/ibs: Move ibs pmus under perf_sw_context

From: Peter Zijlstra
Date: Mon Nov 15 2021 - 07:07:43 EST


On Mon, Nov 15, 2021 at 05:31:21PM +0530, Ravi Bangoria wrote:
>
>
> On 15-Nov-21 4:47 PM, Peter Zijlstra wrote:
> > On Mon, Nov 15, 2021 at 03:18:38PM +0530, Ravi Bangoria wrote:
> >> Ideally, a pmu that is present in each hw thread belongs in
> >> perf_hw_context, but perf_hw_context has the limitation of allowing only
> >> one pmu (a core pmu), so other hw pmus must use either the sw or the
> >> invalid context, which limits pmu functionality.
> >>
> >> This is not a new problem. It has been raised in the past, for example,
> >> Arm big.LITTLE (same for Intel ADL) and s390 had this issue:
> >>
> >> Arm: https://lore.kernel.org/lkml/20160425175837.GB3141@leverpostej
> >> s390: https://lore.kernel.org/lkml/20160606082124.GA30154@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> >>
> >> Arm big.LITTLE (followed by Intel ADL) solved this by allowing multiple
> >> (heterogeneous) pmus inside perf_hw_context. That makes sense, as they
> >> still register a single pmu for each hw thread.
> >>
> >> s390 solved it by moving the 2nd hw pmu to perf_sw_context, though that
> >> 2nd hw pmu is counting mode only, i.e. no sampling.
> >>
> >> AMD IBS has a similar problem. The IBS pmu is present in each hw thread,
> >> but because of the perf_hw_context restriction it currently belongs to
> >> perf_invalid_context, so important functionality such as per-task
> >> profiling is not possible with the IBS pmu. Moving it to perf_sw_context
> >> will:
> >> - allow per-task monitoring
> >> - allow cgroup-wise profiling
> >> - allow grouping of IBS with other pmu events
> >> - disallow multiplexing
> >>
> >> Please let me know if I missed any major benefit or drawback of
> >> perf_sw_context. I'm also not sure how easy it would be to lift the
> >> perf_hw_context restriction and start allowing more pmus in it.
> >>
> >> Suggestions?
> >
> > Same as I do every time this comes up... this patch is still lingering
> > and wanting TLC:
> >
> > https://lore.kernel.org/lkml/20181010104559.GO5728@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> Thanks for the pointer, Peter. I have looked at the patch and it is quite complex,
> altering the very way perf event scheduling works.
>
> I don't dispute that it is the right 'fix' for the issue, but do you think adding a
> new perf context could help alleviate some of the issues in the interim?

And take away the motivation for people to do the right thing? How does
that work out in my favour?