Re: Kernel marker has no performance impact on ia64.

From: Masami Hiramatsu
Date: Thu Jun 12 2008 - 13:07:41 EST


Hi,

Frank Ch. Eigler wrote:
> Hi -
>
> On Thu, Jun 12, 2008 at 12:16:35PM -0400, Masami Hiramatsu wrote:
>> [...]
>>> Think this through. How should systemtap (or another user-space
>>> separate-compiled tool like lttng) do this exactly?
>>> [...]
>>> (d) or another way?
>> Use a lookup table. We can expect that the marking points which are
>> regularly inserted in the upstream kernel are stable (they do not
>> change so frequently). In that case, we can easily maintain, out of
>> tree, a lookup table which has pairs of format strings such as
>> "sched_switch(struct task_struct * next, struct task_struct * prev)":"next %p prev %p"
>> Thus, you can use the printf-style format parser.
>
> That's an interesting idea, but errors in this table would themselves
> only be caught at C compilation time.

Hmm, why do you think so?
I think that if we can't find the corresponding entry in the lookup
table, that itself becomes an error.
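
To make that concrete, here is a rough user-space sketch of such a table,
not a real implementation. The names marker_fmt_entry, marker_fmt_table and
marker_fmt_lookup are hypothetical; only the sched_switch pair comes from
this thread. A missing entry is reported by the lookup itself:

```c
#include <stddef.h>
#include <string.h>

/* One entry pairs a marking-point prototype with a printf-style format. */
struct marker_fmt_entry {
	const char *proto;	/* prototype string of the marking point */
	const char *format;	/* corresponding printf-style format string */
};

/*
 * Hypothetical out-of-tree table. Only the sched_switch pair is taken
 * from the discussion; a real table would list every stable marking point.
 */
static const struct marker_fmt_entry marker_fmt_table[] = {
	{ "sched_switch(struct task_struct * next, struct task_struct * prev)",
	  "next %p prev %p" },
};

/* Return the format for proto, or NULL if the table has no such entry. */
static const char *marker_fmt_lookup(const char *proto)
{
	size_t i;
	size_t n = sizeof(marker_fmt_table) / sizeof(marker_fmt_table[0]);

	for (i = 0; i < n; i++) {
		if (strcmp(marker_fmt_table[i].proto, proto) == 0)
			return marker_fmt_table[i].format;
	}
	return NULL;	/* the caller treats this as an error */
}
```

A tool would call marker_fmt_lookup() with the prototype it found and stop
with an error on NULL, instead of going on to generate C code.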

> Worse, it does nothing helpful
> for actually pulling out the next/prev fields of interest. Remember,
> real tracing users don't care so much about the task_struct pointers,
> but about observable things like PIDs. Systemtap would be back to the
> debuginfo or C-header-guessing/parsing job (or embedded-C, yuck).

Yeah, but that is the same as with the previous markers. It depends on
what parameters the kernel passes to the marker. I mean, the parameter
issue is not an issue of the marker framework, but a discussion point
for each marking point.

> This is another reason why markers are a nice solution. They allow
> passing of actual useful values: not just the %p pointers but the most
> interesting derived values (e.g., prev->pid). And they do this
> *efficiently* - in out-of-line code that imposes no measurable
> overhead in the normal case.

Even if you use trace_mark() markers, you have to post a kernel patch
which passes prev->pid to the marking point, and discuss it.
For example,
DEFINE_TRACE(sched_switch, (int prev_pid, int next_pid), prev_pid, next_pid)

But that might not be general enough; we have to discuss which parameters
are good enough for each marking point.
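
Just to illustrate the typed-parameter idea, here is a user-space mock,
assuming a three-part macro of the shape written above. This is NOT the
actual kernel macro; the sched_switch_probe pointer and the record_switch()
probe are hypothetical names made up for the sketch:

```c
#include <stddef.h>

/*
 * Hypothetical user-space mock of DEFINE_TRACE(name, proto, args...).
 * It declares a probe pointer with the given prototype and a
 * trace_<name>() function that forwards the arguments when a probe
 * is attached, and does nothing otherwise.
 */
#define DEFINE_TRACE(name, proto, ...)				\
	static void (*name##_probe) proto = NULL;		\
	static inline void trace_##name proto			\
	{							\
		if (name##_probe)				\
			name##_probe(__VA_ARGS__);		\
	}

DEFINE_TRACE(sched_switch, (int prev_pid, int next_pid), prev_pid, next_pid)

/* Example probe: remembers the last pair of PIDs it was given. */
static int last_prev_pid, last_next_pid;

static void record_switch(int prev_pid, int next_pid)
{
	last_prev_pid = prev_pid;
	last_next_pid = next_pid;
}
```

With this, the marking point would call
trace_sched_switch(prev->pid, next->pid), and the probe receives the
PIDs directly. Which values to pass still has to be agreed on for each
marking point.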

>
>
> - FChE

Thank you,

--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@xxxxxxxxxx
