Re: [RFC PATCH 16/22 -v2] add get_monotonic_cycles

From: Mathieu Desnoyers
Date: Wed Jan 16 2008 - 12:49:37 EST


* Mathieu Desnoyers (mathieu.desnoyers@xxxxxxxxxx) wrote:
> * Steven Rostedt (rostedt@xxxxxxxxxxx) wrote:
> >
....
> >
>
> > One thing I want to clear up. The major difference between this
> > latency_tracer and LTTng is what we consider fast paths. The latency
> > tracer is recording things like enabling and disabling interrupts, preempt
> > count changes, or simply profiling all function calls. Those are what I
> > consider fast paths. The slow path WRT the latency_tracer are things like
> > context switches. This is why I don't have a problem with copying the
> > comm at context switch time. Because that _is_ a slow path for the latency
> > tracer.
>
> LTTng hooks in the lockdep tracer to trace irq on/off, spinlocks, etc..
> in flight recorder mode, we have nothing to write to disk and therefore
> we can handle very frequent events. We then do the analysis off-line
> using the last MB written in the buffers. The advantage is that the
> kernel dumbly writes data to a buffer : we therefore move the complexity
> to user-space.
>
> I agree that some kind of tracing, like the one you are doing, might be
> done more efficiently if you do a first clever analysis phase directly
> in the kernel without writing the raw high event rate data in memory
> buffers. However, I believe that it would be more powerful if we combine
> the two approaches rather than trying to do everything in or out of the
> kernel. LTTng could provide the comm names, priorities, etc, and your
> tracer could provide the top X list of processes that had a bad
> behavior. It would mean that the complete overall information would be
> made available after a post-processing phase done in an analysis tool
> like LTTV, but I don't see any problem with it.
>

Just to expand a bit on the design-side of my proposal :

Your module would create "profiles" based on the hooks called. If we
take the interrupt on/off for example, it would be called by lockdep and
could keep a table of the top X instructions that disables interrupts
for a long time. (it's just an example, you could want to save the pid
instead...)

Then, whenever a "profile dump" is triggered, you would simply have to
send the current state of your profile to lttng with something like :


struct irq_latency_table {
void *ip;
cycles_t time;
};

/*
* Make sure only one profile at a time is written to the trace for the
* whole system.
*/
static DECLARE_MUTEX(latency_profile_mutex);
static struct irq_latency_table latency_table[NR_ENTRIES];


void irq_latency_dump_profile(void)
{
int i;
char namebuf[KSYM_NAME_LEN];

mutex_lock(&latency_profile_mutex);
trace_mark(irq_latency_dump_begin, MARK_NOARGS);
for (i = 0; i < NR_ENTRIES; i++) {
sprint_symbol(namebuf, (unsigned long)latency_table[i].ip);
trace_mark(irq_latency_entry, "ip %p symbol %s time %llu",
latency_table[i].ip, namebuf,
(unsigned long long)latency_table[i].time);
}
trace_mark(irq_latency_dump_end, MARK_NOARGS);
mutex_unlock(&latency_profile_mutex);
}

You can then create a LTTV module that will format your nice output each
time a profile dump is encountered.

By doing this, your specialized profile generator would only have to
hook into the irq on/off events to gather the information it needs,
nothing more. I think that would trim the code size and the complexity
of your module by an interesting factor.

Note that I could optimize the way I currently deal with symbols by not
having to dump them in the trace, but since it's only for low rate
events, this optimization has a low priority on my todo list.

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/