Re: [patch] tracing/mm: add page frame snapshot trace

From: Ingo Molnar
Date: Sat May 09 2009 - 07:05:55 EST



* Wu Fengguang <fengguang.wu@xxxxxxxxx> wrote:

> On Sat, May 09, 2009 at 06:01:37PM +0800, Ingo Molnar wrote:
> >
> > * Wu Fengguang <fengguang.wu@xxxxxxxxx> wrote:
> >
> > > 2) support concurrent object iterations
> > > For example, a huge 1TB memory space can be split up into 10
> > > segments which can be queried concurrently (with different options).
> >
> > this should already be possible. If you lseek the trigger file,
> > that position will be understood as an 'offset' by the patch;
> > the (decimal) value you then write into the file will be the
> > count.
> >
> > So it should already be possible to fork off nr_cpus helper threads,
> > one bound to each CPU, each triggering trace output of a separate
> > segment of the memory map - and each reading that CPU's
> > trace_pipe_raw file to recover the data - all in parallel.
>
> How will this work out in general? More examples: when walking
> pages by file/process, is it possible to divide the
> files/processes into N sets and dump their pages concurrently?
> When walking the (huge) inode lists of different superblocks, is
> it possible to fork one thread for each superblock?
>
> In the above situations, the walks would demand concurrent
> instances with different filename/pid/superblock options.

the iterators are certainly more complex and harder to parallelise
in those cases, i submit.

But i like the page map example because it is (by far!) the largest
collection of objects. Four million pages on a test-box i have.

So if the design is right and we do dumping very well on that
extreme end, we might not even care that much about parallelising
dumping in other situations, even if there are thousands of tasks -
it will just be even faster. And then we can keep the iterators and
the APIs as simple as possible.
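
To make that concrete, here is a rough userspace sketch of such a
helper. ( Hedge: the trigger-file path and its lseek+write
semantics below are assumptions pieced together from the quoted
description, not names taken from the patch - the structure is the
point: one thread per CPU, each triggering its own segment and
draining its own trace_pipe_raw. )

/*
 * Sketch of the parallel page-map dump scheme described above.
 * Assumed (not verified against the patch): the trigger file
 * takes an lseek()ed object offset plus a written decimal count,
 * and per-CPU output shows up in per_cpu/cpuN/trace_pipe_raw.
 * The ".../objects/mm/pages/trigger" path is a placeholder.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define TRACING "/sys/kernel/debug/tracing"

struct seg { int cpu; long offset; long count; };

static void *dump_segment(void *arg)
{
    struct seg *s = arg;
    char path[128], buf[4096];
    cpu_set_t mask;
    ssize_t n;
    int fd;

    /* Bind to our CPU so the dump lands in this CPU's buffer. */
    CPU_ZERO(&mask);
    CPU_SET(s->cpu, &mask);
    pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);

    /* Trigger our segment: seek to the offset, write the count. */
    fd = open(TRACING "/objects/mm/pages/trigger", O_WRONLY);
    if (fd < 0)
        return NULL;
    lseek(fd, s->offset, SEEK_SET);
    n = snprintf(buf, sizeof(buf), "%ld", s->count);
    write(fd, buf, n);
    close(fd);

    /* Recover the records from this CPU's raw pipe. A real tool
     * would parse the binary records and stop after 'count' of
     * them; this sketch just drains whatever is available. */
    snprintf(path, sizeof(path),
             TRACING "/per_cpu/cpu%d/trace_pipe_raw", s->cpu);
    fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return NULL;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        ; /* consume/parse */
    close(fd);
    return NULL;
}

int main(void)
{
    long nr_cpus = sysconf(_SC_NPROCESSORS_ONLN);
    long total = 4L * 1024 * 1024; /* e.g. four million pages */
    long chunk = total / nr_cpus;
    pthread_t tid[nr_cpus];
    struct seg seg[nr_cpus];

    for (long i = 0; i < nr_cpus; i++) {
        seg[i].cpu = i;
        seg[i].offset = i * chunk;
        seg[i].count = (i == nr_cpus - 1) ? total - i * chunk
                                          : chunk;
        pthread_create(&tid[i], NULL, dump_segment, &seg[i]);
    }
    for (long i = 0; i < nr_cpus; i++)
        pthread_join(tid[i], NULL);
    return 0;
}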

( And even for tasks, which are perhaps the hardest to iterate, we
  can still use the /proc method of iterating up to the offset by
  counting. It wastes some time in each separate thread, as each
  has to count up to its offset, but it still allows the dumping
  itself to be parallelised. Or we could dump blocks of the PID
  hash array. That distributes tasks well, and can be iterated very
  easily with low/zero contention. The result will come out
  unordered in any case. )
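
To illustrate that last option, a toy userspace model of the
bucket partitioning ( the hash table here is a made-up stand-in
for the kernel's PID hash; only the split scheme is the point ):

/*
 * Toy model of "dump blocks of the PID hash array". Each worker
 * owns a contiguous block of buckets and walks its chains
 * independently, so there is no contention between workers and
 * the combined output is naturally unordered.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_BUCKETS 256
#define NR_WORKERS 4

struct node { int pid; struct node *next; };
static struct node *buckets[NR_BUCKETS];

struct block { int first, last; }; /* bucket range [first, last) */

static void *dump_block(void *arg)
{
    struct block *b = arg;

    for (int i = b->first; i < b->last; i++)
        for (struct node *n = buckets[i]; n; n = n->next)
            printf("pid %d (bucket %d)\n", n->pid, i);
    return NULL;
}

int main(void)
{
    pthread_t tid[NR_WORKERS];
    struct block blk[NR_WORKERS];
    int per = NR_BUCKETS / NR_WORKERS;

    /* Populate the toy table with 1000 "tasks". */
    for (int pid = 1; pid <= 1000; pid++) {
        struct node *n = malloc(sizeof(*n));
        n->pid = pid;
        n->next = buckets[pid % NR_BUCKETS];
        buckets[pid % NR_BUCKETS] = n;
    }

    for (int i = 0; i < NR_WORKERS; i++) {
        blk[i].first = i * per;
        blk[i].last = (i == NR_WORKERS - 1) ? NR_BUCKETS
                                            : (i + 1) * per;
        pthread_create(&tid[i], NULL, dump_block, &blk[i]);
    }
    for (int i = 0; i < NR_WORKERS; i++)
        pthread_join(tid[i], NULL);
    return 0;
}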

What do you think?

Ingo