Re: System call instrumentation

From: Ingo Molnar
Date: Mon May 05 2008 - 02:56:53 EST



* Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote:

> Hi Ingo,
>
> I looked at the system call instrumentation present in LTTng lately. I
> tried different solutions, e.g. hooking a kernel-wide syscall trace in
> do_syscall_trace, but it appears that I ended up re-doing another
> syscall table, consisting of specialized functions that extract
> the string and data structure parameters from user-space. Since code
> duplication is not exactly wanted, I think that the original approach
> taken in my patchset, which is to instrument the kernel code at the
> sys_* level (e.g. sys_open), which is the earliest level where the
> parameter information is made available to the kernel, is still the
> best way to go.

hm, i'm not sure about this. I've implemented system call tracing in -rt
[embedded in the latency tracer] and it only needed changes in entry.S,
not in every system call site. Now, granted, that tracer was simpler
than what LTTng tries to do, but do we _really_ need more complexity? A
trace point that simply expresses:

    sys_call_event(int sysno, long param1, long param2, long param3,
                   long param4, long param5, long param6);

would be more than enough in 99% of the cases. If you want to extract
the strings from the system call table, to make the filtering of these
syscall events easier, do it during build time (by for example modifying
the __SYSCALL() macros in unistd.h), instead of a parallel syscall
table.
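
A minimal user-space sketch of the idea above, assuming a hypothetical
DEMO_SYSCALLS() list standing in for the __SYSCALL() entries in unistd.h.
The same list is expanded at compile time into a number-to-name table, so
no second hand-maintained table is needed. All demo_* names and the
syscall numbers are illustrative assumptions, and the event is formatted
into a caller-supplied buffer rather than written to a real trace buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical syscall list; in the kernel this role would be played by
 * the __SYSCALL() entries. The numbers are for illustration only. */
#define DEMO_SYSCALLS(X) \
    X(0, read)           \
    X(1, write)          \
    X(2, open)           \
    X(39, getpid)

#define DEMO_NR_MAX 64

/* Expand the list once into a number -> name table at compile time. */
#define DEMO_NAME_ENTRY(nr, name) [nr] = #name,
static const char *const demo_syscall_names[DEMO_NR_MAX] = {
    DEMO_SYSCALLS(DEMO_NAME_ENTRY)
};

/* Look up a name; numbers missing from the list map to "unknown". */
static const char *demo_syscall_name(int sysno)
{
    if (sysno < 0 || sysno >= DEMO_NR_MAX || !demo_syscall_names[sysno])
        return "unknown";
    return demo_syscall_names[sysno];
}

/* Generic event: the syscall number plus six raw ABI parameters.
 * Returns what snprintf() returns. */
static int demo_sys_call_event(char *buf, size_t len, int sysno,
                               long p1, long p2, long p3,
                               long p4, long p5, long p6)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s(%d): %ld %ld %ld %ld %ld %ld",
                    demo_syscall_name(sysno), sysno,
                    p1, p2, p3, p4, p5, p6);
}
```

A filter can then match on the name string while the event itself stays a
fixed-size record of seven integers.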

OTOH, as long as it's just one simple nonintrusive line per syscall,
it's not a big deal - as long as it only traces the parameters as they
come from the syscall ABI - we won't change them anyway. I.e. hide the
ugly string portion, just turn them into a simple:

    trace_sys_getpid();
    trace_sys_time(tloc);
    trace_sys_gettimeofday(tv, tz);

(although even such a solution would still need a general policy level
buy-in i guess - we don't want half of the syscalls traced, the other
half objected to by maintainers. So it's either all or nothing.)
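
One way the one-line-per-syscall hooks could look, as a user-space sketch
only. The wrapper names follow the examples above, but their bodies, the
demo_trace_rec record and demo_trace_emit() are hypothetical. Each wrapper
forwards the parameters exactly as they arrive from the syscall ABI and
never dereferences user pointers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical trace record: syscall name plus up to six raw arguments. */
struct demo_trace_rec {
    const char *name;
    int nargs;
    long args[6];
};

/* Last recorded event; a real tracer would append to a ring buffer. */
static struct demo_trace_rec demo_last;

static void demo_trace_emit(const char *name, int nargs, const long *args)
{
    assert(nargs >= 0 && nargs <= 6);
    demo_last.name = name;
    demo_last.nargs = nargs;
    memset(demo_last.args, 0, sizeof(demo_last.args));
    if (nargs > 0)
        memcpy(demo_last.args, args, (size_t)nargs * sizeof(long));
}

/* Pointers are recorded as raw values, not followed. The cast assumes a
 * long is wide enough to hold a pointer, as on Linux. */
static long demo_raw(const void *p)
{
    return (long)(intptr_t)p;
}

static void trace_sys_getpid(void)
{
    demo_trace_emit("getpid", 0, NULL);
}

static void trace_sys_time(const void *tloc)
{
    const long args[] = { demo_raw(tloc) };
    demo_trace_emit("time", 1, args);
}

static void trace_sys_gettimeofday(const void *tv, const void *tz)
{
    const long args[] = { demo_raw(tv), demo_raw(tz) };
    demo_trace_emit("gettimeofday", 2, args);
}
```

Because the record holds only what the ABI already passes in, the hooks
stay valid as long as the syscall signatures themselves do not change.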

and the question also arises: why not do this on a debuginfo level? This
information can be extracted from the debuginfo. We could change
'asmlinkage' to 'syscall_linkage' to make it clear which functions are
syscalls.

Ingo