Re: [RFCv2 00/48] perf tools: Add threads to record command

From: Ingo Molnar
Date: Fri Sep 14 2018 - 08:13:16 EST



* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Fri, Sep 14, 2018 at 01:47:25PM +0200, Jiri Olsa wrote:
> > On Fri, Sep 14, 2018 at 01:15:28PM +0200, Peter Zijlstra wrote:
> > > On Fri, Sep 14, 2018 at 11:40:22AM +0200, Ingo Molnar wrote:
> > > > In fact keeping the files separate has scalability advantages for 'perf report' and similar
> > > > parsing tools: they could read all the streams in a per-CPU fashion already, from the very
> > > > beginning.
> > >
> > > Also writing to different files from different CPUs is good for record,
> > > less contention on the inode state (which include pagecache).
> >
> > maybe I should explain a little bit more on this
> >
> > we write to different (per-cpu) files during the record,
> > and at the end of the session, we take them and store
> > them inside perf.data
>
> How long does it take to combine that? If we generated a lot of data,
> that could take a fair amount of time, no?
>
> I feel that record should not mysteriously 'hang' when it is done. It
> used to do that at some point because of that stupid .debug crap, but
> acme fixed that I think.

Agreed - plus at the report stage it would be advantageous to be able to *read* per-cpu files
as well.

If we do things smartly then report will create similar NUMA affinity as the record session
used.
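
To illustrate the idea only (this is not perf code): a rough Python sketch of a reader that
handles one per-CPU stream per worker and pins each worker to the CPU that produced the
stream. The mapping of CPU number to file path, and the names read_stream() and
read_per_cpu(), are made up for the sketch, and the "parsing" is just a byte count.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_stream(path, cpu, allowed):
    """Pin the calling thread to `cpu` when permitted, then read one stream."""
    if cpu in allowed:
        try:
            # On Linux, pid 0 means the calling thread.
            os.sched_setaffinity(0, {cpu})
        except OSError:
            pass  # pinning is best-effort; fall back to unpinned reading
    with open(path, "rb") as f:
        return cpu, len(f.read())  # stand-in for real event parsing


def read_per_cpu(paths_by_cpu):
    """Read every per-CPU file in its own worker; return {cpu: bytes_read}."""
    # sched_getaffinity is Linux-only, so degrade to "no pinning" elsewhere.
    if hasattr(os, "sched_getaffinity"):
        allowed = os.sched_getaffinity(0)
    else:
        allowed = set()
    workers = max(1, len(paths_by_cpu))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(read_stream, path, cpu, allowed)
            for cpu, path in paths_by_cpu.items()
        ]
        return dict(f.result() for f in futures)
```

Pinning is deliberately best-effort here: if the CPU is outside the allowed set, or the
affinity call fails, the stream is still read, just without the locality benefit.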

Thanks,

Ingo