Re: [PATCH v1 00/15] Introduce threaded trace streaming for basic perf record operation

From: Alexey Budankov
Date: Thu Oct 15 2020 - 06:35:36 EST



On 14.10.2020 20:27, Ingo Molnar wrote:
>
> * Alexey Budankov <alexey.budankov@xxxxxxxxxxxxxxx> wrote:
>
>>
>> Patch set provides threaded trace streaming for base perf record
>> operation. Provided streaming mode (--threads) mitigates profiling
>> data losses and resolves scalability issues of serial and asynchronous
>> (--aio) trace streaming modes on multicore server systems. The patch
>> set is based on the prototype [1], [2] and the most closely relates
>> to mode 3) "mode that creates thread for every monitored memory map".
>>
>> The threaded mode executes one-to-one mapping of trace streaming threads
>> to mapped data buffers and streaming into per-CPU trace files located
>> at data directory. The data buffers and threads are affined to NUMA
>> nodes and monitored CPUs according to system topology. --cpu option
>> can be used to specify exact CPUs to be monitored.
>
> Yay! This should really be the default trace capture model everywhere
> possible.
>
> Can we do this for perf top too? It's really struggling with lots of cores.
>
> If on a 64-core system I run just a moderately higher frequency 'perf top'
> of 1 kHz:
>
> perf top -e cycles -F 1000
>
> perf stays stuck forever in 'Collecting samples...', and I also get a lot
> of:
>
> [548112.871089] Uhhuh. NMI received for unknown reason 31 on CPU 25.
> [548112.871089] Do you have a strange power saving mode enabled?

Yes, we can. I would just prefer to do it in a separate patch set, since
this one is already complex enough as a single change.
Is that OK?

I would also appreciate it if you could clarify or advise on the expected
impact of this perf top improvement, or maybe even provide some feedback
on adoption of this feature, to help me better justify the effort to my
management.

Gratefully,
Alexei

>
> Thanks,
>
> Ingo
>