Re: [PATCH v2 06/15] perf session: load data directory into tool process memory

From: Alexey Budankov
Date: Tue Oct 27 2020 - 11:09:00 EST



On 27.10.2020 15:21, Jiri Olsa wrote:
> On Tue, Oct 27, 2020 at 10:37:58AM +0300, Alexey Budankov wrote:
>>
>> On 24.10.2020 18:43, Jiri Olsa wrote:
>>> On Wed, Oct 21, 2020 at 07:01:19PM +0300, Alexey Budankov wrote:
>>>>
>>>> Read trace files located in data directory into tool process memory.
>>>> Basic analysis support of data directories is provided for report
>>>> mode. Raw dump (-D) and aggregated reports are available for data
>>>> directories, still with no memory consumption optimizations. However
>>>> data directories collected with --compression-level option enabled
>>>> can be analyzed with little less memory because trace files are
>>>> unmaped from tool process memory after loading collected data.
>>>> The implementation is based on the prototype [1], [2].
>>>>
>>>> [1] git clone https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git -b perf/record_threads
>>>> [2] https://lore.kernel.org/lkml/20180913125450.21342-1-jolsa@xxxxxxxxxx/
>>>>
>>>> Suggested-by: Jiri Olsa <jolsa@xxxxxxxxxx>
>>>
>>> very loosely ;-) so there was a reason for all that reader refactoring,
>>> so we could have __perf_session__process_dir_events function:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/commit/?h=perf/record_threads&id=308aa7cff1fed335401cfc02c7bac1a4644af68e
>>
>> Nonetheless. All that are necessary parts to make threaded data streaming
>> and analysis eventually merged into the mainline as joint Perf developers
>> community effort.
>>
>>>
>>> when reporting the threaded record data on really big servers,
>>> you will run out of memory, so you need to read and flush all
>>> the files together by smaller pieces
>>
>> Yes, handling all that _big_ data after collection to make it
>> helpful for analysis of performance issues is the other part
>> of this story so that possible OOM should be somehow avoided.
>>
>>>
>>> IMO we need to have this change before we allow threaded record
>>
>> There are use cases of perf tool as a data provider, btw VTune is not
>> the only one of them, and for those use cases threaded trace streaming
>> lets its users get to their data that the users just were loosing before.
>> This is huge difference and whole new level of support for such users.
>> Post-process scripting around perf (e.g. Python based) will benefit
>> from threaded trace streaming. Pipe mode can be extended to stream into
>> open and passed fds using threads (e.g. perf record -o -fd:13,14,15,16).
>> VTune-like tools can get performance data, load it into a (relational)
>> DB files and provide analysis. And all that uses perf tool at its core.
>>
>> I agree perf report OOM issue can exist on really-big servers but data
>> directories support for report mode for not-so-big servers and desktops
>> is already enabled with this smaller change. Also really-big-servers
>> come with really-big amount of memory and collection could possibly be
>> limited to only interesting phases of execution so the issue could likely
>> be avoided. At the same time threaded trace streaming could clarify on
>> real use cases that are blocked by perf report OOM issue and that would
>> clarify on exact required solution. So perf report OOM issue shouldn't
>> be the showstopper for upstream of threaded trace streaming.
>
> so the short answer is no, right? ;-)
>
> I understand all the excuses, but from my point of view we are
> adding another pain point (and there's already few ;-) ) that

BTW what are those a few pain points that are 'unfriendly' in perf?
Possibly users could be warned about the points in advance to avoid
confusion and disappointment by the fact.

Alexei

> will make perf (even more) not user friendly
>
> if we allow really friendly way to create huge data, we should
> do our best to be able to process it as best as we can
>
> jirka
>