Re: [RFC] perf/amd/ibs: Move ibs pmus under perf_sw_context

From: Ravi Bangoria
Date: Mon Nov 15 2021 - 07:02:32 EST

On 15-Nov-21 4:47 PM, Peter Zijlstra wrote:
> On Mon, Nov 15, 2021 at 03:18:38PM +0530, Ravi Bangoria wrote:
>> Ideally, a pmu which is present in each hw thread belongs to
>> perf_hw_context, but perf_hw_context has limitation of allowing only
>> one pmu (a core pmu) and thus other hw pmus need to use either sw or
>> invalid context which limits pmu functionalities.
>>
>> This is not a new problem. It has been raised in the past, for example,
>> Arm big.LITTLE (same for Intel ADL) and s390 had this issue:
>>
>> Arm: https://lore.kernel.org/lkml/20160425175837.GB3141@leverpostej
>> s390: https://lore.kernel.org/lkml/20160606082124.GA30154@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>
>> Arm big.LITTLE (followed by Intel ADL) solved this by allowing multiple
>> (heterogeneous) pmus inside perf_hw_context. It makes sense as they are
>> still registering single pmu for each hw thread.
>>
>> s390 solved it by moving 2nd hw pmu to perf_sw_context, though that 2nd
>> hw pmu is count mode only, i.e. no sampling.
>>
>> AMD IBS also has similar problem. IBS pmu is present in each hw thread.
>> But because of perf_hw_context restriction, currently it belongs to
>> perf_invalid_context and thus important functionalities like per-task
>> profiling is not possible with IBS pmu. Moving it to perf_sw_context
>> will:
>> - allow per-task monitoring
>> - allow cgroup wise profiling
>> - allow grouping of IBS with other pmu events
>> - disallow multiplexing
>>
>> Please let me know if I missed any major benefit or drawback of
>> perf_sw_context. I'm also not sure how easy it would be to lift
>> perf_hw_context restriction and start allowing more pmus in it.
>>
>> Suggestions?
>
> Same as I do every time this comes up... this patch is still lingering
> and wanting TLC:
>
> https://lore.kernel.org/lkml/20181010104559.GO5728@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Thanks for the pointer, Peter. I have looked at the patch and it is quite complex,
altering the very way perf event scheduling works.

I don't dispute that it is the right 'fix' for the issue, but do you think adding a
new perf context could help alleviate some of the issues in the interim?
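
To make the limitation concrete, here is a simplified user-space model (not
kernel code) of how, as I understand it, a pmu's task_ctx_nr gates per-task
events. The enum mirrors perf_event_task_context; struct model_pmu and
model_open_event() are hypothetical names made up for this illustration, and
the check is only a sketch of the idea that a pmu with a negative context
number cannot be attached to a task.

```c
#include <errno.h>

/* Mirrors the kernel's enum perf_event_task_context. */
enum perf_event_task_context {
	perf_invalid_context = -1,
	perf_hw_context = 0,
	perf_sw_context,
	perf_nr_task_contexts,
};

/* Hypothetical stand-in for struct pmu: only the field relevant here. */
struct model_pmu {
	const char *name;
	int task_ctx_nr;
};

/*
 * Hypothetical helper modelling the open-time check: pid == -1 means a
 * per-cpu event, which needs no task context. Any other pid asks for a
 * per-task event, which is rejected when the pmu has no task context.
 */
static int model_open_event(const struct model_pmu *pmu, int pid)
{
	if (pid != -1 && pmu->task_ctx_nr < 0)
		return -EINVAL;
	return 0;
}
```

With task_ctx_nr set to perf_invalid_context (IBS today), the model rejects
any pid other than -1 with -EINVAL; with perf_sw_context (this RFC), the same
call succeeds.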

Ravi